JIPF (JURNAL ILMU PENDIDIKAN FISIKA)
p-ISSN: 2477-5959 | e-ISSN: 2477-8451
Vol. 10 No. May 2025. Page 194-207
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Development of Physics HOTS Assessment Instruments in High School: A Comprehensive Literature Review

Yoga Budi Bhakti 1*), Riyan Arthur 2, Yetti Supriyati 3
Universitas Indraprasta PGRI, Indonesia 1; Universitas Negeri Jakarta, Indonesia 2,3
Corresponding E-mail: bhaktiyoga. budi@gmail.
Received: August 30th, 2024. Revised: October 18th, 2024. Accepted: November 6th, 2024
Keywords: Higher Order Thinking Skills, Physics Instrument, Assessment Instrument, Literature Review

ABSTRACT
This study aims to identify strategies for developing HOTS assessment instruments in high school physics subjects. The research employs bibliometric methods and a qualitative approach to analyze journal articles published between 2019 and 2023. The results indicate that the development of HOTS instruments is driven by the need for 21st-century skills and is supported by findings from international surveys such as PISA and TIMSS. The most commonly developed assessment forms are test questions in the form of multiple choice, multiple choice with reasoning, and essays. These questions are designed with HOTS characteristics in mind, including HOTS indicators, operational verbs, physics problem contexts, stimuli, and Bloom's taxonomy. Physics topics that are frequently covered include temperature and heat, harmonic vibrations, sound and light waves, thermodynamics, and static and dynamic fluids. Commonly used research methods for this development include Borg and Gall's R&D method, the 4D method, qualitative descriptive methods, and the ADDIE method.
HOTS instruments undergo a series of feasibility tests and analyses, including validity testing by validators, statistical validity tests, reliability tests, difficulty level tests, discrimination index tests, Rasch model analysis, analysis using the Cronbach's alpha formula, and distractor analysis. The findings of this study provide valuable insights for developing valid, reliable, and relevant HOTS assessment instruments that meet the needs of high school physics education.

INTRODUCTION
The low level of Indonesian students' thinking skills is reflected in the latest results from the Programme for International Student Assessment (PISA), which show that Indonesian students still lag behind those from member countries of the Organisation for Economic Co-operation and Development (OECD). The average scores of Indonesian students in reading, mathematics, and science are significantly below the OECD average, indicating a lack of critical thinking, analytical abilities, and problem-solving skills. Although there has been some improvement in recent years, a significant gap remains, highlighting the need for educational reforms to enhance the quality of learning and students' understanding.

The low cognitive abilities of students in learning physics, based on observations, can be seen from the 2023 National Assessment data, where 68% of students scored below the minimum competency standard in physics. This has led to difficulties in understanding and applying basic physics concepts, as also reflected in the 2018 PISA assessment results, where Indonesia ranked 71st out of 78 countries in the science category, which includes physics. The main factor influencing this low ability is a learning approach that still focuses on memorization rather than deep conceptual understanding. The lack of student engagement in practical experiments and real-world problem analysis also limits their ability to apply physics theories to everyday contexts.
As a result, many students have only a superficial understanding of physics, without the ability to think critically and analytically when solving more complex problems.

Conceptual understanding in physics learning cannot be underestimated, as physics is one of the foundational sciences behind much of the technology and innovation we use in everyday life. Mastery of physics concepts requires higher-order thinking skills, such as the ability to analyze situations, formulate hypotheses, and evaluate experimental results. With these skills, students can not only understand the subject matter better but also develop essential problem-solving abilities applicable in various fields, including engineering, medicine, and scientific research. Therefore, physics education should focus on developing critical thinking skills and deep conceptual understanding to prepare students for future challenges.

Higher Order Thinking Skills (HOTS) in physics involve students' ability to analyze, evaluate, and create solutions based on the concepts they have learned. HOTS goes beyond merely recalling facts or formulas; it requires students to deeply understand physics concepts and apply them in new and complex situations. The main components of HOTS in physics are analysis, where students must break down a physics problem into smaller parts and understand the relationships between its components; evaluation, which involves critical assessment of methods or experimental results; and creation, where students are challenged to develop new solutions or design experiments to solve physics problems they have not encountered before. By focusing on HOTS, physics learning becomes more dynamic and relevant, helping students not only master the material but also think creatively and innovatively in applying physics knowledge to real-world scenarios.
Higher Order Thinking Skills (HOTS) are closely related to Bloom's Taxonomy as revised by Lorin Anderson and David Krathwohl, which classifies cognitive skills in education. This revision updated Benjamin Bloom's original taxonomy by restating its six cognitive levels as Remembering, Understanding, Applying, Analyzing, Evaluating, and Creating. HOTS refers to the top three levels in this taxonomy (Analyzing, Evaluating, and Creating), where students are required not only to memorize or understand information but also to think critically and creatively. At the Analyzing stage, students break information into smaller parts and understand the relationships between them. At the Evaluating stage, they make judgments and decisions based on evidence or specific criteria. Finally, at the Creating stage, students use their knowledge to generate new ideas. Thus, HOTS encourages students to progress from basic understanding to more complex and innovative thinking, aligning with the principles of the revised Bloom's Taxonomy.

The accuracy of assessment methods in evaluating HOTS in physics is crucial to ensuring that students' critical and creative thinking skills are measured correctly. Proper assessment should go beyond multiple-choice questions or questions that only test recall. Instead, HOTS assessments need to include tasks that encourage students to analyze data, evaluate experimental results, and create new solutions or models. For instance, using essay questions, case studies, or research projects allows students to demonstrate deep understanding and their ability to apply physics concepts in complex situations.
Clear and structured assessment rubrics are also important to ensure that skills such as analysis, evaluation, and creation are assessed. By using appropriate assessment methods, teachers can provide meaningful feedback and encourage students to continue developing their higher-order thinking skills.

HOTS assessment is designed to measure students' critical and creative thinking abilities by presenting questions at higher cognitive levels. This assessment aims to encourage students to analyze, evaluate, and create solutions to more complex problems. HOTS assessment is based on three main principles: (1) providing stimuli, which can be text or other forms, to provoke deep thinking; (2) presenting new problems that have not been discussed in class, challenging students to apply their knowledge in different contexts; and (3) varying questions across difficulty levels and cognitive levels to test various aspects of students' thinking abilities. In formulating HOTS question indicators, operational verbs (KKO) based on the revised Bloom's Taxonomy are used to ensure that the knowledge dimensions measured by each question align with the desired learning objectives.

The use of HOTS questions has great potential to help students understand the material presented more deeply and comprehensively. In the context of assessment, HOTS questions not only measure memorization ability but also assess more complex skills such as: (1) understanding relationships between concepts, (2) deeper integration and processing of information, (3) the ability to find connections between pieces of obtained information, (4) application of information in problem-solving processes, and (5) the ability to generate new ideas from available information.
Thus, HOTS assessment instruments serve not only as an effective tool to enhance students' understanding of the material but also as an evaluation tool for teachers to assess the effectiveness of the teaching. This, in turn, can help improve the quality of education and encourage students to think more critically and creatively.

The purpose of this study is to identify and analyze important aspects in the development of HOTS assessment instruments in high school physics subjects. The study aims to explain the reasons for developing these instruments, the most effective assessment types, and the appropriate instrument forms to use in the context of physics learning. Additionally, the study focuses on identifying question-making indicators, the selection of relevant physics materials, and the research methods applied to ensure the validity and reliability of the assessment instruments. This research also aims to outline how to obtain academically and practically accountable results from the development of HOTS assessment instruments in physics. The findings are expected to provide deep insights into the key points that need to be considered when developing HOTS assessment instruments, thereby serving as a guide for educators in enhancing the quality of physics assessment and learning in high schools.

METHOD
This research adopted a literature review approach based on the analysis of 45 journals published between 2019 and 2023. The methods used were bibliometric methods and a qualitative approach. Bibliometrics is a method of literature analysis that uses mathematical and statistical approaches. The qualitative approach aimed to generate descriptive data in the form of written words obtained from the analysis results.
The primary data for this study were collected in October 2023 using the Publish or Perish application to gather metadata from the Google Scholar database, with publication years limited to 2019 through 2023 and the keyword "Development of HOTS Assessment Instruments in High School Physics." The initial screening resulted in 1,032 data entries. These data were then extracted into Microsoft Excel and sorted according to the research topics and objectives. Further selection was based on the following criteria: (1) must be an article, (2) not a proceeding, (3) published in an ISSN journal, and (4) available in PDF format. After the second selection process, 57 articles were obtained. These articles were then imported into the Mendeley application for further review and to complete data attributes such as the author's name, number, volume, year, and journal abstract, before being exported to .ris format files.

DOI: 10. 26737/jipf.

The bibliometric method was used to study the literature using a statistical approach. Data converted to .ris format were analyzed using the VOSviewer application, focusing on the relationships between the titles and abstracts of the articles. From this analysis, words relevant to the research were selected, such as reasons for development, types of assessment, characteristics of HOTS, research methods, and forms of analysis used. The mapping results from VOSviewer produced networks showing relationships between data attributes and formed several clusters. Words in these clusters were identified and selected according to the research objectives. The visualization of the VOSviewer mapping results was used as a reference for analyzing and reviewing all articles. Subsequently, a descriptive narrative about these words was compiled in sequence, based on the research objectives.
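The four selection criteria above can be expressed as a simple filter over the exported metadata. The following is only a minimal sketch under stated assumptions: the `Record` fields (`doc_type`, `issn`, `has_pdf`) and the sample entries are hypothetical and do not reflect the actual Publish or Perish export columns or the spreadsheet used in this study.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical metadata fields, invented for illustration only.
    title: str
    year: int
    doc_type: str   # e.g. "article" or "proceeding"
    issn: str       # empty string when the venue has no ISSN
    has_pdf: bool


def passes_screening(rec: Record, first_year: int = 2019, last_year: int = 2023) -> bool:
    """Apply the year limits and the four selection criteria from the text."""
    return (
        first_year <= rec.year <= last_year
        and rec.doc_type == "article"   # criteria (1) and (2): an article, not a proceeding
        and bool(rec.issn)              # criterion (3): published in an ISSN journal
        and rec.has_pdf                 # criterion (4): available in PDF format
    )


# Toy records (not real data from the study).
records = [
    Record("HOTS test for fluids", 2021, "article", "1234-5678", True),
    Record("HOTS poster", 2020, "proceeding", "1234-5678", True),
    Record("Old HOTS study", 2017, "article", "1234-5678", True),
    Record("HOTS essay items", 2022, "article", "", True),
]

selected = [r for r in records if passes_screening(r)]
```

Keeping each criterion as a separate boolean term makes it straightforward to report how many entries each step removes.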
RESULTS AND DISCUSSIONS
The interconnection between articles, evaluated based on their titles and abstracts, can provide broader and more in-depth information about the research topic. By analyzing these connections, we can understand the scope of references used in each article, allowing for a more structured and effective clustering. In this regard, VOSviewer plays a crucial role by displaying frequently occurring words from the titles and abstracts of articles, which then form a network of interrelationships. This network illustrates thematic relationships between articles, where closely related words form specific clusters. Each article that shares common keywords with others is grouped into the same cluster, highlighting the close relationship between these articles in terms of research focus and issues. Figure 1 visualizes the results of this analysis, showing how these words form a network with specific clusters or related themes, which helps researchers identify and further explore relevant research areas.

Fig 1. VOS Viewer Visualization of Word Relationships

The analysis identified 90 items (words) as relevant within the context of this study. These were then sorted to determine the keywords that most accurately represent the core of the research, resulting in six main clusters. Each cluster consists of words that are interrelated based on specific themes or topics. According to the visual mapping presented in Figure 1, the keywords within each cluster are clearly organized, demonstrating patterns of relationships and distinct themes. The keywords included in each cluster are detailed in Table 1, which provides a clearer overview of the distribution and characteristics of the themes identified in the study.
Thus, the table and visualization help to show how the keywords interact and how the articles in this study are grouped based on identified thematic similarities.

Table 1. Cluster in Research on the Development of HOTS Instruments
Cluster: Words in a Cluster
Cluster 1: ADDIE, assessment instruments, Borg & Gall model, student HOTS, development research, 4D model
Cluster 2: ADDIE development model, Borg & Gall, development model, physics, physics learning, research instrument
Cluster 3: Borg & Gall, evaluation, physics education, R&D (Research and Development), HOTS
Cluster 4: Higher order, science, senior high school student, multiple choice test, instrument
Cluster 5: HOTS assessment instrument, HOTS test, Rasch model, reliability
Cluster 6: HOTS-based, HOTS questions, development, higher order thinking skills

Based on the analyzed interconnections, two prominent terms in the density visualization, "HOTS" and "Physics," show strong relationships, as shown in Figure 2. The density visualization highlights the connections between "HOTS," "Physics," and "Higher Order Thinking," illustrating the extent to which these topics are interrelated within the scientific literature. In this visualization, the density of colors reflects the frequency and intensity of the occurrence of related terms. Areas with brighter colors indicate a higher concentration of research, suggesting that these topics are a primary focus in various studies.

Fig 2. VOS Viewer Display in Density Visualization

In this context, "HOTS" and "Physics" are likely strongly connected, indicating that much research focuses on developing assessment instruments that emphasize higher-order thinking skills in high school physics education.
This underscores the importance of integrating HOTS into the physics curriculum to enhance students' conceptual understanding and analytical abilities. This connection also reflects significant research trends in physics education, which aim to evaluate and develop students' critical thinking skills through more sophisticated and relevant assessment instruments.

Fig 3. VOS Viewer Display of the Relationship between "HOTS" and "Physics"

When the term "HOTS" is highlighted, its links show that it is associated with research on the development of HOTS assessment instruments, predominantly using the Borg & Gall, 4-D, and ADDIE development models, as shown in Figure 3. In Figure 3, when the term "Physics" is highlighted, the terms related to "HOTS" and "Research Instrument" extend over a broader range.

Based on the analysis using VOSviewer, the terms "Physics," "HOTS," and "Research Instrument" are interconnected in a complex network, indicating a significant relationship between these concepts in the research. The term "HOTS" is linked not only to assessment instruments but also to HOTS-related issues and physics lesson materials, reflecting how higher-order thinking skills are integrated into various aspects of physics teaching. This range of interconnections shows that developing research instruments in physics encompasses a broad evaluation of HOTS capabilities, from how these instruments are designed to assess students' critical thinking skills to how HOTS issues and physics lesson materials are designed to support the development of these skills. These findings highlight the importance of a comprehensive approach in physics education, which includes developing effective assessment instruments, HOTS-based problem-solving, and designing lesson materials that support higher-order thinking skills.
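The term networks discussed above rest on counting how often two terms appear in the same title or abstract. The sketch below illustrates only that raw counting step on toy data; it is not VOSviewer's implementation, which additionally handles term extraction, normalization, layout, and clustering. The term lists are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs_terms):
    """Count, for every pair of terms, the number of documents containing both."""
    counts = Counter()
    for terms in docs_terms:
        # Use each term once per document; sort so a pair always has one key.
        for a, b in combinations(sorted(set(terms)), 2):
            counts[(a, b)] += 1
    return counts


# Toy term lists, one per article (hypothetical, not data from the study).
docs = [
    ["hots", "physics", "assessment"],
    ["hots", "physics"],
    ["rasch model", "reliability", "hots"],
]

pair_counts = cooccurrence(docs)
strongest_pair, strength = pair_counts.most_common(1)[0]
```

In a map built from such counts, a pair with a higher count would be drawn with a stronger link, which is why frequently paired terms such as "HOTS" and "Physics" stand out.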
The development of higher-order thinking skills questions in physics is a strategic step, strongly justified from pedagogical perspectives and the needs of modern education. First, physics as a discipline demands deep understanding and strong analytical abilities. Physics material is often complex, requiring students not only to grasp basic concepts but also to apply, analyze, and evaluate information to solve abstract and contextual problems. HOTS questions are designed to push students beyond mere memorization of facts toward higher-level abilities, such as synthesizing information and applying concepts in new situations. Second, in the era of globalization and the Fourth Industrial Revolution, critical and creative thinking skills are highly needed. Developing HOTS questions in physics helps prepare students to meet these challenges by equipping them with tools to think independently, identify problems, and seek innovative solutions. Through these questions, students are encouraged to think deeply and critically, connect various concepts, and formulate and test hypotheses. Moreover, the application of HOTS in evaluating physics learning also contributes to improving the overall quality of education, because HOTS questions measure not only factual knowledge but also thought processes and comprehensive understanding of concepts. Therefore, developing HOTS questions is crucial in supporting higher educational goals, namely producing graduates who are not only academically capable but also competent in solving real-world problems and ready to face future challenges.

The use of various types of tests to measure HOTS in physics aims to evaluate students' abilities to analyze, apply, evaluate, and create solutions to complex problems. Multiple-choice questions can be specifically designed to challenge students to select the most appropriate answer from several seemingly correct options, requiring critical analysis.
Reasoned multiple-choice questions go further by asking students to explain the reasoning behind their choices, promoting reflection and deeper understanding. Essay questions allow students to elaborate on their thoughts in more detail, enabling the testing of their ability to organize arguments and connect various physics concepts. Problem-based tests and case studies place students in real-world situations that require the application of physics knowledge to solve problems, assessing their analytical and synthetic abilities. Meanwhile, project-based assessments require students to design and execute projects relevant to physics concepts, testing their ability to apply theory in a practical and creative manner. The combination of these test forms provides a comprehensive overview of students' higher-order thinking skills in physics, ensuring that the assessment covers the important aspects of HOTS.

Assessment instruments to measure higher-order thinking skills have been developed in various test forms. The details are shown in Table 3.

Table 3. Higher Order Thinking Skills Assessment Instruments
Form. Author. Analysis Result. Illene et al. Suprapto et al. Complex Multiple Choice. Maryani et al. Istiyono et al. Erfianti et al. Tier Diagnostic Test. Rintayati et al. Maryani et al. Essay. Merta Dhewa et al. Sumarni. Open-Ended. Romli et al.

Based on Table 3, in physics lessons, the most commonly used test formats to measure higher-order thinking skills (HOTS) are multiple-choice, reasoned multiple-choice, and essay tests. Multiple-choice tests are used because they allow for quick and objective evaluation of students' understanding of basic physics concepts.
Reasoned multiple-choice tests, on the other hand, require students not only to choose the correct answer but also to provide reasons or explanations for their choices, thereby assessing their ability to understand and apply physics concepts more deeply. Meanwhile, essays give students the opportunity to present arguments, analyses, and solutions to complex physics problems in writing, allowing for the assessment of critical thinking, creativity, and a broader understanding of concepts. The combination of these three types of tests provides a more comprehensive picture of students' mastery of physics material and their higher-order thinking skills.

HOTS questions are designed not only to assess students' ability to recall information but also to evaluate their ability to analyze and create. These questions aim to encourage students to transfer knowledge across concepts, integrate information, connect various problems, solve them, and understand information critically. Bloom's Taxonomy, refined by Anderson and Krathwohl, is one of the main references in the development of HOTS questions, dividing thinking skills into two levels: LOTS (Lower Order Thinking Skills) and HOTS. In this taxonomy, higher-order thinking skills encompass the levels of analyzing, evaluating, and creating, allowing for the assessment of students' ability not only to understand concepts but also to solve problems and think critically.

Table 4. Criteria for HOTS-Type Questions
Criteria for HOTS-Type Questions: Author
Measuring indicators of higher-order thinking skills: Ramadhan et al.; Merta Dhewa et al.
Using appropriate operational verbs: Jauhariyah et al.; Suprapto et al.
Items have contextual stimuli: Sunarti & Jauhariyah; Zaki et al.
Using Bloom's Taxonomy stages: Fanani et al.

Effective HOTS question indicators are those that encompass the abilities to analyze, evaluate, and create.
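The split of the revised taxonomy into LOTS and HOTS can be written down as a small lookup, which is sometimes convenient when checking a test blueprint. This is an illustrative sketch only: the C1 to C6 codes follow the level order of the revised taxonomy, while the `blueprint` list and the helper names are hypothetical. The level assigned to an item is assumed to come from a judgment of what the item actually demands, not from its verb alone.

```python
# Cognitive levels of the revised Bloom's Taxonomy, coded C1 to C6.
BLOOM_LEVELS = {
    "C1": "remembering",
    "C2": "understanding",
    "C3": "applying",
    "C4": "analyzing",
    "C5": "evaluating",
    "C6": "creating",
}
HOTS_LEVELS = {"C4", "C5", "C6"}


def thinking_order(level: str) -> str:
    """Return 'HOTS' for C4 to C6 and 'LOTS' for C1 to C3."""
    if level not in BLOOM_LEVELS:
        raise ValueError(f"unknown cognitive level: {level}")
    return "HOTS" if level in HOTS_LEVELS else "LOTS"


def hots_share(item_levels) -> float:
    """Fraction of items in a blueprint that target HOTS levels."""
    return sum(thinking_order(lv) == "HOTS" for lv in item_levels) / len(item_levels)


# Hypothetical blueprint: the judged cognitive level of each test item.
blueprint = ["C2", "C4", "C4", "C5", "C6", "C3", "C4", "C5"]
share = hots_share(blueprint)
```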
The use of operational verbs in formulating HOTS question indicators is crucial for determining the cognitive level being assessed. However, an operational verb (KKO) from levels C1 to C3 does not automatically make a question lower-order; for example, if a question uses a KKO that appears to be at level C2 but requires deeper analysis, it is still classified as a HOTS question. In developing HOTS questions, particularly for physics subjects, several aspects need to be considered: (1) assessment principles, including the use of stimuli, the presentation of new problems, and the differentiation of question difficulty levels; (2) the skills being measured, such as the ability to connect concepts, process information, and critically seek and use information; (3) Bloom's Taxonomy dimensions, where HOTS questions focus on categories C4 (analyzing), C5 (evaluating), and C6 (creating); and (4) the appropriate selection of operational verbs in creating questions. By paying attention to these aspects, HOTS question development can be more effective in assessing students' higher-order thinking skills.

HOTS test questions are developed not only to measure students' abilities but also to provide a deeper understanding of their knowledge of the material taught. Therefore, the selection of material in the development of HOTS questions is very important. With focused material, the process of analyzing the results of question trials becomes easier and more effective.

Table 5. Physics Materials Related to HOTS
Physics Materials Related to HOTS: Temperature and Heat; Harmonic Vibrations; Sound and Light Waves; Thermodynamics; Static Fluids; Dynamic Fluids
Author: Fanani et al.; Rifa et al.; Istikomah et al.; Akhsan et al.; Astra et al.; Jauhariyah et al.; Zahra & Suwarna; Agustihana & Suparno; Saefullah et al.; Linuwih & Safutra; Akhsan et al.; Serevina et al.; Pratiwi et al.; Sari & Suyatna; Qotrunnada & Prahani; Al Fath & Dewi
In the development of HOTS instruments, the materials commonly used include temperature and heat, vibrations, sound and light waves, static fluids, dynamic fluids, and thermodynamics. The appropriate selection of material is crucial for measuring higher-order thinking skills in a more structured way. Additionally, it allows for a more accurate evaluation of students' understanding of the specific topics being tested. This approach aligns with the need to integrate content knowledge in each subject and education level, aiming for more effective and meaningful learning.

In the development of HOTS instruments for physics lessons, several development methods and models are commonly used, including the Research and Development (R&D) model, the ADDIE development model (Analysis, Design, Development, Implementation, Evaluation), the 4-D development model (Define, Design, Develop, Disseminate), and the Borg & Gall model. The R&D model is used to produce valid and effective educational products through a series of systematic research and development stages. The ADDIE model helps in designing and implementing instruments that meet students' needs through a process of needs analysis, instrument design, question development, implementation, and result evaluation. The 4-D model focuses on defining concepts, designing questions, developing question prototypes, and disseminating instruments for trial. Meanwhile, the Borg & Gall model, as one of the R&D models, emphasizes preliminary research, planning, initial product development, field trials, revisions, and dissemination to ensure the effectiveness and efficiency of the instrument. All these models aim to produce high-quality HOTS instruments capable of measuring students' higher-order thinking skills in the context of physics learning.
The process of testing and analyzing the feasibility of HOTS instruments in physics lessons involves various stages to ensure that the instrument can accurately and reliably measure higher-order thinking skills. First, validity testing is conducted by validators, who are experts in education and physics, to assess the content, construct, and language alignment of the instrument. Next, the instrument undergoes characteristic testing to determine whether the questions formulated reflect higher-order thinking skills. Empirical validity testing is carried out to ensure that the instrument measures what it is supposed to measure, while reliability testing, using the Cronbach's alpha formula, assesses the internal consistency of the instrument. Difficulty level testing is conducted to determine whether the questions have appropriate variation in difficulty, and discrimination testing measures how well each question can differentiate between students with high and low understanding. This process also includes Rasch model analysis to evaluate how well the data fit the expected measurement model, as well as partial credit model analysis for questions with multiple correct answers. Distractor analysis is conducted to assess the effectiveness of incorrect answer choices in multiple-choice questions, and readability testing ensures that the language used in the questions is understandable by students. Variance analysis is used to assess performance differences between groups, while general criteria analysis ensures that the instrument meets the expected quality standards. Finally, classical test theory analysis is used to evaluate the quality of questions based on the traditional test theory approach.
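Three of the checks described above (Cronbach's alpha, the difficulty index, and the discrimination index) can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the procedure of any reviewed study: items are scored 0 or 1, the response matrix is toy data, and the upper and lower groups are taken as the top and bottom 27% of students by total score, a common convention that is assumed here.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a matrix with one row per student, one column per item."""
    k = len(scores[0])
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    # Population variances are used for items and totals alike; using sample
    # variances for both would give the same ratio.
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def difficulty_index(scores, item):
    """Proportion of students answering the item correctly (higher means easier)."""
    return sum(row[item] for row in scores) / len(scores)


def discrimination_index(scores, item, fraction=0.27):
    """Proportion correct in the upper group minus that in the lower group."""
    ranked = sorted(scores, key=sum, reverse=True)   # best total score first
    n = max(1, round(len(ranked) * fraction))        # size of each group
    p_upper = sum(row[item] for row in ranked[:n]) / n
    p_lower = sum(row[item] for row in ranked[-n:]) / n
    return p_upper - p_lower


# Toy responses: 6 students x 4 dichotomous items (hypothetical data).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(responses)
p_values = [difficulty_index(responses, i) for i in range(4)]
d_values = [discrimination_index(responses, i) for i in range(4)]
```

Rasch and partial credit model analyses require dedicated estimation routines and are not covered by this sketch.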
All these stages aim to ensure that the HOTS instrument in physics is a valid, reliable, and effective evaluation tool for measuring students' higher-order thinking skills.

CONCLUSION AND SUGGESTION
The development of Higher Order Thinking Skills (HOTS) assessment instruments in high school physics is essential for fostering students' abilities to analyze, evaluate, and solve complex problems. As highlighted in this literature review, the global push for 21st-century skills, reinforced by international assessments like PISA and TIMSS, has driven the creation of these instruments, which often take the form of multiple-choice questions, reasoned multiple-choice questions, and essays that incorporate HOTS indicators, operational verbs, and contextual physics problems aligned with Bloom's Taxonomy. Frequently assessed topics include thermodynamics, waves, and fluid dynamics, with instruments developed through models such as Borg and Gall's R&D, the 4D method, and the ADDIE model. Rigorous testing for validity, reliability, difficulty, and discrimination ensures these instruments are effective in measuring higher-order thinking. The implications for education are significant, as these tools offer more comprehensive assessments, driving the need for professional development for educators and urging policymakers to integrate HOTS assessments into the curriculum. Further research and technological integration are recommended to improve the implementation and effectiveness of these instruments, ultimately fostering a more equitable and engaging physics education for students.

ACKNOWLEDGMENTS