Integrated Science Education Journal
Vol. No. September 2025, pp.
ISSN: 2716-3725. DOI: 10.37251/isej.

Measuring Energy Literacy: Validation of Knowledge, Attitude, and Behavior Instruments Using the Rasch Model

Inda Hasanah Bimastari1, Muhamad Yusup1,*, Kistiono1
1Master Program of Physics Education, Universitas Sriwijaya, Sumatera Selatan, Indonesia

Article Info
Article history: Received Jul 17, 2025; Revised Jul 27, 2025; Accepted Jul 30, 2025; OnlineFirst Aug 08, 2025
Keywords: Attitude, Behavior, Energy literacy, Knowledge, Rasch model

ABSTRACT
Purpose of the study: This study aims to develop and validate a Rasch model-based assessment instrument for energy knowledge, attitude, and behavior to measure energy literacy among high school students.

Methodology: The research design followed the stages of theory-based instrument development and the Rasch modeling approach. The study was quantitative, and the instrument was developed through item preparation based on a literature review and theoretical indicators, validation by experts, and field testing with 50 participants.

Main Findings: The knowledge instrument is of high quality: as many as 9 items fit the Rasch model, with infit and outfit MNSQ values ranging from 0.84 to 1. and positive point-measure correlations. Instrument reliability shows infit and outfit results that are still within the ideal tolerance range; the person reliability values for knowledge, attitude, and behavior are, in order, 0.50, 0.76 and 0. The instrument is therefore suitable for accurate and consistent measurement.

Novelty/Originality of this study: This study contributes to the energy literacy measurement literature (knowledge, attitude, and behavior) with an in-depth evaluation of item performance and respondent ability distribution, using empirical and applicable methods for the development of evidence-based, Rasch model-based assessment instruments.
This is an open access article under the CC BY license.

Corresponding Author: Muhamad Yusup, Master Program of Physics Education, Universitas Sriwijaya, Padang Selasa Road, No. Palembang, Sumatera Selatan, 30139, Indonesia
Email: rohyanafelisa@students.

Journal homepage: http://cahaya-ic.com/index.php/ISEJ

INTRODUCTION
Energy-related global challenges, such as climate change and the need for an energy transition, together with sustainable energy issues, further highlight the urgency of energy literacy. Individuals are required to have cognitive understanding and to demonstrate responsible behavior regarding energy consumption. Energy literacy is a multidimensional construct that includes cognitive, affective, and behavioral dimensions related to energy. The knowledge dimension relates to cognitive understanding of energy concepts, energy efficiency, renewable and non-renewable resources, and the impact of energy use on the environment, while the attitude dimension reflects an individual's views or values on energy and environmental issues. The behavioral dimension reflects a person's tendencies and habits in using energy efficiently and responsibly. Meanwhile, research results show that students' energy literacy, especially in knowledge, attitudes, and behavior, is still in the low category. Increasing energy awareness and literacy among the public is considered a key pillar in the transition to a more efficient and environmentally friendly energy system.

Low energy literacy is not only a problem in Indonesia but also a global concern that affects efforts toward a sustainable energy transition. Several studies indicate that secondary school curricula have not fully integrated energy competencies in an explicit and balanced manner, leaving students with a limited understanding of energy issues. Low critical and analytical thinking skills in the context of energy pose a challenge in fostering sustainable behavior.
However, energy literacy has been shown to correlate strongly with pro-environmental attitudes and energy-saving habits. Therefore, an instrument capable of measuring the integration of cognitive, affective, and behavioral dimensions is essential as a basis for evaluating energy education programs. The Rasch model offers high psychometric validity in assessing participants' abilities and responses in an invariant and objective manner. Rasch-based instruments can be flexibly developed for various student backgrounds, ensuring measurement accuracy and freedom from contextual bias. Therefore, it is important to develop instruments that are not only statistically valid and reliable but also relevant to the current energy education context.

Various initiatives and educational programs are promoted to foster energy literacy, from the primary education level to the general public. Energy literacy, which includes knowledge, attitudes, and behaviors related to energy use, is considered an important pillar in realizing a just and sustainable energy transition. However, measuring the effectiveness of these programs requires valid and reliable measurement instruments. The Rasch model, one of the approaches in Item Response Theory (IRT), has been the model of choice for generating interval scales and ensuring measurement invariance. With Rasch analysis, items are thoroughly evaluated based on item difficulty and person ability, and tested for the extent to which the data fulfill the expected probabilistic structure. As a more sophisticated alternative to classical test theory (CTT), IRT has emerged as a more robust framework for psychometric analysis. One of the most frequently used IRT models, and one that has the advantage of easy interpretation, is the Rasch model.
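For reference, the probabilistic structure mentioned above can be written out for the dichotomous (right/wrong) case. This is the standard textbook form of the Rasch model, not a formula quoted from this study; the symbols \theta_n (ability of person n), \delta_i (difficulty of item i), and X_{ni} (scored response) are notation chosen here for illustration, and rating-scale items such as attitude statements would need a polytomous extension.

```latex
\[
P(X_{ni} = 1 \mid \theta_n, \delta_i)
  = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)},
\qquad
\ln\frac{P(X_{ni} = 1)}{P(X_{ni} = 0)} = \theta_n - \delta_i .
\]
```

Both parameters sit on the same logit scale, which is what allows persons and items to be placed on a common interval scale; when \theta_n = \delta_i the probability of success is 0.5.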
The Rasch model offers a different approach from CTT by modeling the probability of individual responses to items based on the interaction between respondent ability and item difficulty. The advantage of the Rasch model lies in its ability to identify misfit items, estimate item difficulty, and measure respondent ability on an interval scale, which classical test theory cannot do directly. The Rasch model has been widely applied in various fields, including education, psychology, and health, to ensure the quality of the instruments used.

The application of the Rasch model in various fields of learning has shown encouraging results. For example, an energy literacy instrument for prospective physics teachers was designed and validated using a four-tier approach and Rasch analysis, showing that the instrument was feasible and able to measure energy literacy accurately. In addition, a digital literacy study developed a digital literacy measurement tool with high validity and reliability through Rasch verification, ensuring that the data are unbiased and meet the assumption of a unidimensional structure. The Rasch model not only produces consistent measures but can also detect items that do not fit the model ("misfit") through statistical analysis, as well as validate the equality of item functions between subgroups. A valid measurement instrument is one that actually measures what it is supposed to measure. This method is used to establish construct validity (through unidimensionality analysis, infit/outfit, and PTMEA) and adequate reliability. Therefore, it is important to conduct an in-depth analysis of the instruments used to measure energy knowledge and behavior to ensure the validity and quality of the data. Instrument validation is an important process to ensure that the instrument measures the intended aspects precisely and consistently.
Statistical validation methods play a major role in this context, and the Rasch model approach is increasingly used in assessment instrument development because it can determine item fit and measure respondent characteristics simultaneously. Recent research shows that the Rasch model can improve measurement accuracy, identify problematic items, and ensure invariance of item functions across different cultural and demographic populations. In addition, validation through statistical analyses such as fit statistics, unidimensionality, and item bias analysis is essential to ensure that the instrument can be used widely and sustainably. However, many instruments in circulation still rely on traditional approaches such as classical test theory (CTT) item analysis, which has limitations, especially in terms of the mismatch between item characteristics and respondent abilities.

Research shows the potential of the Rasch model in analyzing instruments in the field of education. For example, in a validation study of science literacy instruments, the Rasch model was shown to detect biased items and provide more stable parameter estimates than CTT. Similarly, in the context of energy literacy, the Rasch model can provide a more accurate picture of respondents' energy knowledge and attitudes, as well as identify problematic items.

Based on this urgency, this study aims to comprehensively analyze the validation of energy knowledge, attitude, and behavior instruments using the Rasch model. This analysis includes testing the fit of the data to the model (fit analysis), unidimensionality, and identification of misfit items. The results of this study are thus expected to provide strong empirical evidence of the quality of the instrument, as well as recommendations for improving the instrument so that it measures energy literacy competencies more accurately.

In. Sci. Ed. Vol. No. September 2025: 153 - 162

RESEARCH METHOD
The instrument was developed through several stages, starting with the development of an assessment framework: the competencies and components of energy literacy were analyzed and organized. The items developed followed the framework of . The research design followed the stages of theory-based instrument development and the Rasch modeling approach, which is known to produce interval-scale measurements and maintain the principle of invariance in measurement. The instrument development procedure began with the preparation of items based on a literature review and theoretical indicators; the items were then validated by experts (expert judgment) to ensure content validity. The revised items were then tested on an initial sample of 50 high school students, selected by purposive sampling in order to obtain a diversity of responses and backgrounds.

The collected data were analyzed using the Rasch model with the help of software such as Winsteps to evaluate various aspects of item quality. The analysis included item fit statistics, consisting of infit and outfit mean square (MnSq) values with ideal criteria ranging from 0.5 to 1. Values over 2.0 indicate the presence of item misfit. In addition, item-total correlations (PTMEA Corr) were analyzed, with a minimum cutoff of 0.20 for an item to be considered as contributing well to the construct being measured. To ensure that the instrument measures one consistent construct, a principal component analysis (PCA) was carried out on the residuals, with the criterion that at least 40-50% of the variance is explained by the main dimension. The quality of the instrument was also determined through reliability analysis and the separation index. A good instrument is expected to have person and item reliabilities above 0.80, as well as a stratum value of at least 0, which indicates the instrument's ability to distinguish student ability levels well. The distribution of respondent ability and item difficulty is visualized with the Wright map (Item-Person Map) to see whether the items are evenly distributed across student ability levels.

After all analyses were conducted, items that were inappropriate or showed bias were revised or deleted. The revised instrument is then retested on a larger sample group (field test II) to confirm final validity and reliability. The final result is an instrument that is valid, reliable, unidimensional, bias-free, and ready to be used for research and learning evaluation. This approach has been widely used in the validation of science and energy literacy instruments.

RESULTS AND DISCUSSION
The assessment instrument consists of 10 knowledge items, 10 attitude items, and 9 behavior items. The average person size is.

Evidence of validity argument

Knowledge instrument items
Table 1 shows that most items fall within the ideal MNSQ range, with values between 0.84 and 1. This range indicates that most items fit the model well. However, one item, item 4, has an outfit MNSQ value of 2.13, which exceeds the upper tolerance limit. This indicates that the item does not fit the model and most likely contains noise or measurement inaccuracy, which could be due to student guessing or a mismatch between the item context and the ability being measured. Furthermore, the ZSTD values for almost all items were within ±2.0, indicating that most items did not deviate statistically far from the model expectations; the ZSTD value for item 4, however, was also high, corroborating the previous finding that this item needs to be re-examined. In addition, the point-measure correlation values for all items are positive, which means that all items make a positive contribution to the measurement of student ability.
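The infit and outfit mean squares used in the item analysis can be illustrated with a short sketch. It assumes dichotomous items and person and item estimates that are already available (in the study these come from Winsteps, whose internal routines are not reproduced here). The function names are hypothetical, and the upper bound of 1.5 for the ideal range is an assumption based on the commonly cited guideline, since only the 0.5 lower bound and the 2.0 misfit cutoff are stated explicitly above.

```python
import math


def rasch_probability(theta, b):
    """Probability of a correct (1) response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_fit(responses, thetas, b):
    """Return (infit_mnsq, outfit_mnsq) for a single item.

    responses: 0/1 scores, one per person
    thetas:    person ability estimates in logits (assumed already estimated)
    b:         item difficulty estimate in logits (assumed already estimated)
    """
    sq_resid_sum = 0.0  # sum of squared raw residuals
    variance_sum = 0.0  # sum of model variances p * (1 - p)
    z_sq_sum = 0.0      # sum of squared standardized residuals
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, b)
        w = p * (1.0 - p)
        sq = (x - p) ** 2
        sq_resid_sum += sq
        variance_sum += w
        z_sq_sum += sq / w
    infit = sq_resid_sum / variance_sum  # information-weighted mean square
    outfit = z_sq_sum / len(responses)   # unweighted mean square, outlier-sensitive
    return infit, outfit


def classify(mnsq, ideal_low=0.5, ideal_high=1.5, misfit_above=2.0):
    """Label a mean square; ideal_high=1.5 is an assumed value, not from the source."""
    if mnsq > misfit_above:
        return "misfit"
    if ideal_low <= mnsq <= ideal_high:
        return "ideal"
    return "review"


# Two persons located exactly at the item difficulty, one success and one failure:
# both mean squares equal their expected value of 1.0.
print(item_fit([1, 0], [0.0, 0.0], 0.0))
# The outfit value reported for knowledge item 4 falls in the misfit band.
print(classify(2.13))
```

Because outfit is an unweighted average of squared standardized residuals, a few unexpected responses from persons far from the item's difficulty (for example, lucky guesses) can inflate it sharply, which is one way an item can show a high outfit value while its infit stays moderate.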
However, it should be noted that the correlation value of item 4 is only 0. which is lower than that of the other items, which are generally above 0. This low correlation suggests that students' responses to item 4 are not consistent with their general ability.

Table 1. Misfit order of knowledge

Overall, eight out of ten items were classified as excellent and valid, namely items 1, 3, 5, 6, 7, 8, 9, and 10, as they met the fit criteria and had adequate correlations. One item (item 4) is recommended for revision because it shows significant misfit, while item 2 can still be used but needs further monitoring because its correlation value is at the lower limit. Thus, this instrument generally has good validity, but one item needs improvement or refinement in order to measure students' abilities more accurately and representatively.

Attitude instrument item

Table 2. Misfit order of attitude

Based on the table, the average item measure is 0.00 logit, indicating that the difficulty level of the items is around the average ability of the respondents. However, there are significant findings in the fit statistics, especially in the infit and outfit mean squares (MNSQ). The average infit MNSQ of 1.01 and outfit MNSQ of 1.14 are still within the tolerance range, but one item deviates greatly: Item10 has an infit MNSQ of 2.95 and an outfit MNSQ of 4.32, accompanied by high ZSTD values . 82 and This indicates that participants' responses to this item were highly at odds with the model predictions, which could be due to guessing, ambiguity in the statement, or unclear item context. The point-measure correlation value (PTMEASUR-AL) for Item10 is also very low, only 0. indicating that this item barely contributes to measuring the intended attitude construct. In contrast, other items such as Item9, Item6, and Item7 show high correlations as well as more ideal fit values, indicating that they consistently measure aspects of attitude according to the Rasch model. Items that have low correlation and strongly deviant fit, such as Item10, need to be evaluated and revised immediately.

Almost all items were categorized as good. However, the presence of one highly misfit item (Item10) needs serious attention. Items like this risk lowering the overall validity of the instrument and should be considered for deletion or substantial revision. This item may have a language structure or context that is incompatible with respondents' understanding, so that it does not measure the attitude construct validly.

Behavioral instrument item

Table 3. Misfit order of behavior

The Rasch analysis of the 9 behavioral items with 50 respondents found several items with fit statistics outside the ideal limits. The average infit mean square (MNSQ) value is 1.01 and the outfit MNSQ is 1. Although on average it is still within the tolerance range of 0. 5, some items deviate quite significantly, especially in the outfit MNSQ. The first item (Item1) shows an outfit MNSQ value of 5.97 and a ZSTD of 4.61, which indicates underfit, that is, a response pattern that is very inconsistent with the model predictions. Item2 and Item4 also showed outfit values above 2.0, indicating that the items may not be working as intended in measuring student behavior, which could be due to guessing, miscoding, or ambiguous item interpretation.

Overall, the behavioral instrument has good measurement quality based on the Rasch model. However, some items show high statistical misfit and could interfere with the overall validity of the instrument.
Therefore, revision and revalidation of these items are needed, especially those with outfit MNSQ > 2.0 and low correlation, in order to improve the accuracy and suitability of the instrument in measuring student behavior effectively and objectively.

Evidence for reliability argument

Reliability of knowledge instruments

Table 4. Summary statistics of knowledge

The person reliability value based on real data of 0.50 indicates that the instrument's ability to distinguish between high- and low-ability students is limited. The separation index value of only 1.12 further emphasizes that this instrument can group students into only about two ability strata. This finding is in line with the view that a separation index below 2.0 indicates limitations of the instrument in detecting variations in participants' ability and a mismatch between the items and the construct being measured, so the content and construct validity of the questions need to be reviewed. Therefore, both the number of items and the variation in item difficulty need to be increased so that the instrument can measure students' abilities more sharply.

Overall, this knowledge assessment instrument is good enough to be used in the context of Rasch-based measurement, especially because the students' fit to the model is ideal and the distribution of students' abilities is normal. However, to improve the quality of measurement, further development is needed, especially in the reliability and differentiation of the instrument, for example by increasing the number of items and widening the range of item difficulty in order to cover a more comprehensive and representative spectrum of student abilities.

Reliability of attitude instruments

Table 5. Summary statistics of attitude

In terms of reliability, the person reliability value of 0. indicates that the instrument is reliable enough to distinguish between individuals with different attitudes. The person separation value of 1. indicates that the instrument can distinguish respondents into at least two different ability levels. These results indicate that, although still in the moderate category, this attitude instrument has the potential to be developed further to achieve sharper classification.

One item deviates significantly from the model: Item10, which has an infit MNSQ value of 2.95 and an outfit MNSQ of 4.32. These values are well above the tolerance limit of 2.0 and are accompanied by high ZSTDs . 82 and 7. , indicating extreme misfit. This item is suspected of not measuring the attitude construct consistently, which could be due to ambiguity, multiple interpretations, or inappropriate context. It is recommended that items with outfit MNSQ > 2.0 be revised or deleted to improve the accuracy of the instrument.

Behavioral Reliability

Table 6. Summary statistics of behavior

The person reliability value of 0.75 and the person separation of 1.71 indicate that the instrument is good at differentiating students' abilities, although not enough to group participants into more than two ability levels in a stable manner. In the context of Rasch modeling, a separation value of at least 2.0 is recommended to classify students into three different ability levels. Although not ideal, these results suggest that with item improvements or an increase in the number of questions, the reliability and discriminating power of the instrument can be further improved. This finding is in line with research that emphasizes the importance of considering the separation value in evaluating the quality of Rasch-based educational measurement tools.
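The link between the reliability and separation values reported above can be checked with a small calculation. The sketch below uses the standard Rasch relations G = sqrt(R / (1 - R)) for the separation index and H = (4G + 1) / 3 for the number of statistically distinct strata; the function names are hypothetical, and Winsteps derives its own figures from unrounded error variances, so its output need not match these values exactly.

```python
import math


def separation_from_reliability(reliability):
    """Separation index G implied by a reliability coefficient R (0 <= R < 1)."""
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must be in [0, 1)")
    return math.sqrt(reliability / (1.0 - reliability))


def strata(separation):
    """Number of statistically distinct levels H = (4G + 1) / 3."""
    return (4.0 * separation + 1.0) / 3.0


# Behavior instrument: reported person reliability of 0.75.
g = separation_from_reliability(0.75)
print(round(g, 2))          # 1.73
print(round(strata(g), 2))  # 2.64
```

The implied separation of about 1.73 is close to the reported 1.71; a small difference of this size would be expected if the published reliability is rounded. A strata value between 2 and 3 is consistent with the reading above that the behavior instrument separates roughly two levels reliably but not three.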
It was concluded that the instrument was sufficiently reliable and feasible for measuring behavioral differences between respondents in the research sample. Participants in this study showed enough behavioral variation to be mapped by the instrument, so its use in behavioral evaluation or mapping was considered quite effective. Thus, it can be concluded that the instrument has good internal validity in distinguishing behavioral levels between individuals.

Figure 1. Standardized residual variance in eigenvalue units.

Unidimensionality is a fundamental requirement of the Rasch model. Unidimensionality analysis is conducted to ensure that the measurement scale accurately measures one domain of energy literacy. Based on the three residual variance analyses shown, it can be concluded that the instrument consistently tends to fulfill the assumption of unidimensionality according to the Rasch model. The three tables show that the proportion of raw variance explained by measures is 31.4%, 47.3%, and 49.1%, respectively, all of which are above the recommended minimum threshold of 30%, indicating that one main dimension in the instrument is strong enough to explain the data structure. Meanwhile, the unexplained variance in the 1st contrast in two of the three analyses had eigenvalues of . 4013 and 2.3051, respectively, indicating that the items have a secondary dimension with a strength equivalent to three items. The instrument nevertheless continues to measure predominantly the same construct. On examination, the items of the three aspects of the instrument can be declared to have strong unidimensional properties and to be valid, and the items belong to the same construct, namely the energy literacy domain.
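The "unexplained variance in 1st contrast" statistic can be illustrated with a minimal sketch: standardized residuals are correlated across items, and the largest eigenvalue of that correlation matrix is expressed in item units. This is an illustration only. It uses plain power iteration as a stand-in for a full principal component analysis, the function names are hypothetical, and Winsteps' own computation is not reproduced here.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either column has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def residual_correlations(residuals):
    """residuals[n][i] is the standardized residual of person n on item i."""
    columns = list(zip(*residuals))
    k = len(columns)
    return [[1.0 if i == j else pearson(columns[i], columns[j])
             for j in range(k)] for i in range(k)]


def largest_eigenvalue(matrix, iterations=500):
    """Largest eigenvalue of a symmetric PSD matrix by power iteration.

    A non-uniform start vector is used because residual contrasts often have
    loadings of mixed sign; an unlucky start orthogonal to the leading
    eigenvector would still converge to the wrong eigenvalue.
    """
    k = len(matrix)
    v = [math.sin(i + 1.0) for i in range(k)]
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
        mv = [sum(matrix[i][j] * v[j] for j in range(k)) for i in range(k)]
        eigenvalue = sum(v[i] * mv[i] for i in range(k))  # Rayleigh quotient
    return eigenvalue


# Toy residuals (4 persons x 3 items): items 1 and 2 move together,
# item 3 is unrelated, so the first contrast has the strength of two items.
toy = [[1, 1, 1], [-1, -1, 1], [1, 1, -1], [-1, -1, -1]]
print(round(largest_eigenvalue(residual_correlations(toy)), 3))  # 2.0
```

Because the eigenvalues of a correlation matrix sum to the number of items, a first-contrast eigenvalue of about 2 means that the residuals share structure with the strength of roughly two items, which is how contrast values in eigenvalue units are usually read.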
The results of this instrument validation indicate that the Rasch model is effective in identifying misfit items, while also providing guidance for revisions based on statistical analyses such as infit, outfit, ZSTD, and PTMEA Corr. A study by Linacre shows that item evaluation based on MNSQ and ZSTD can detect items with extreme misfit. These results align with previous research on validating science and environmental literacy instruments. Additionally, another study shows that Rasch analysis of pro-environmental behavior instruments enables the detection of cross-group bias. Positive correlations between items are also an important indicator of construct validity, widely used in the development of attitude and value measurement tools. Reliability findings with separation values < 2.0 also suggest expanding the range of question difficulty levels, as demonstrated in the development of a climate attitude instrument. Considering these validation results, further revisions and testing are needed to obtain an instrument that can describe energy literacy comprehensively and in depth while being applicable to broader populations and regions.

This study has a unique feature in its application of the Rasch model as a statistical approach in developing and validating an energy literacy measurement instrument that comprehensively covers the dimensions of knowledge, attitude, and behavior. Unlike most previous studies that used a classical approach (CTT), this study adopted the Rasch approach to produce objective, unidimensional, and evidence-based measurements, as well as to detect statistically misfit items. Additionally, this study specifically designed the instrument for the context of high school students in Indonesia, which has rarely been the focus of modern psychometric-based energy literacy measurement tool development.
This study opens opportunities for the development of similar instruments at different educational levels or in other literacy domains, such as digital and scientific literacy. Additionally, the developed instrument can be used in longitudinal studies to track the development of students' energy literacy over time.

CONCLUSION
The final assessment instrument, consisting of 10 knowledge items, 10 attitude items, and 9 behavior items, has been tested using the Rasch model. The test results showed that almost all items met the criteria for fit with the Rasch model and provided evidence for the validity and reliability arguments of the instrument. The Rasch model proved to be very effective in measuring participants' abilities equitably, as well as in providing statistical parameters that identify inappropriate or problematic items. In general, this instrument provides a strong basis for evaluation and research activities in the field of energy literacy and can be adopted for a wider range of populations and educational contexts. In the future, development of the instrument should continue through revision and retesting on a larger and more diverse sample, so that its validity and reliability are increasingly tested and assured.
ACKNOWLEDGEMENTS
The authors thank all respondents and stakeholders for the permission and opportunity given to carry out this research. The authors also thank all parties who contributed to the success of this research.

REFERENCES