AL-WIJDAN: Journal of Islamic Education Studies
Volume 9, Number 3, July 2024. p-ISSN: 2541-2051; online ISSN: 2541-3961
Available online at http://ejournal. id/index. php/alwijdan
Received: May 2024. Accepted: June 2024. Published: July 2024

Analysis of the Discriminating Power, Difficulty, and Distractor Effectiveness of Final Semester Assessment (PAS) Questions for Arabic Language Class Vi MTs Irsyadul Athfal

Ahmad Zaini Dahlan, Fauzie Muhammad Shidiq, Zaky Anwar, Muhammad Syihabul Ihsan Al Haqiqy, Nur Qomari
Universitas Islam Negeri Maulana Malik Ibrahim Malang (all authors)
Email: zainidahlan1998@gmail.com, fauziemuhammadshidiq@gmail., zakyanwarkarim@gmail.com, elhaqiqy123@gmail.com, qomari@uin-malang.

Abstract

This study aims to determine the quality of the final exam items for the Arabic language subject in the 2023 academic year at MTs Irsyadul Athfal Gresik in terms of discriminating power, difficulty, and distractor effectiveness. It is evaluation research with a quantitative descriptive approach. The sample consisted of 13 student answer sheets, and the data were collected through documentation. The results show that: (1) by discriminating power, 11 questions were interpreted as very bad, 21 as bad, 1 as medium, 9 as good, and 3 as very good; (2) by difficulty level, 15% of the items were very difficult, 4% difficult, 2% medium, 13% easy, and 64% very easy; (3) by distractor effectiveness, 1 item (2.22%) has very good distractors, 34 items (75.55%) have fairly good distractors or need revision, and 10 items (22.22%) must be replaced because their distractors do not function (rejected). The implication of this research is to report the results of the item analysis so that they can serve as comparison material for Arabic language test scores.

Keywords: discriminating power, difficulty, distractors.
Introduction

In general, tests play a very large role in teaching, including Arabic language teaching; the test arguably holds the most important position in it. In teaching theory, teaching can be seen as a process consisting of three main components that must be present: first, teaching objectives; second, teaching implementation; third, assessment of teaching results. Djiwandono mentions in his book that the assessment of teaching results comes last in the implementation of teaching and has a very important relationship with, and role for, the previous components, one of which is the implementation of learning. The results of teaching implementation, and whether the teaching goals have been achieved or not, can be known through assessment. This last component is therefore essential in teaching, including Arabic language teaching.

In part, assessment is carried out to determine the achievement of learning objectives or to obtain an overview of students' position in the flow of the learning process, namely what students have mastered and what they still have to strive to master.[3] In learning Arabic, four skills must be mastered by learners (listening, speaking, reading, and writing skills). Students' mastery of these four skills (maharah) is determined through assessment, referring to the competency standards in Arabic language learning that have been set.[4] Therefore, it is necessary to assess students' results at the end of the education unit. Assessment of students' final results is carried out through the end-of-semester exam.

The end-of-semester exam is an important evaluation that tests students' understanding and achievement in the material learned during a semester. In essence, it is an important moment that allows students to measure how far they have understood the subject matter and how ready they are to move on to the next stage. Thus, the end-of-semester exam becomes one of the determinants of achievement and a tool for evaluating teaching effectiveness.

MTs Irsyadul Athfal is one of the educational units in Gresik that uses end-of-semester exams to determine how far the teaching process and its targets have been achieved. The end-of-semester assessment at the Islamic junior high school Irsyadul Athfal uses questions prepared by the Subject Teacher Consultation Team (MGMP), including the Arabic learning material, which belongs to the cluster of subjects in Islamic-based education. Although the final semester exam questions are prepared by the MGMP and labeled as national standard, the items need to be reviewed to find out the level of discrimination, the level of difficulty, and the distractor quality of the items prepared for the Arabic language subject. If the items are not analyzed, the discriminating power, difficulty, and distractor quality of the questions given to students will not be known; the levels of the questions tested are not measurable and their feasibility is unknown. Therefore, to find out the level of discrimination, difficulty, and distractor effectiveness, it is necessary to study and review the question items quantitatively. Quantitative analysis of question items concerns their statistical characteristics.

According to Anastasi and Urbina, item analysis can be done qualitatively, concerning content and form, as well as quantitatively. Qualitative analysis looks more at material, construction, language, and culture, while quantitative analysis looks at the statistical characteristics of the items, including discriminating power, difficulty, and distractors. Given the problem described above, the researchers are interested in conducting an item analysis of the Arabic language subject questions for class Vi at MTs Irsyadul Athfal, which were prepared by the MGMP to the national standard.

The question that arises is whether the Arabic questions in the End-of-Semester Assessment at MTs Irsyadul Athfal meet the evaluation criteria and can be used to measure a student's level of ability. Students may be considered to have failed because they have not mastered the Arabic material taught; conversely, poor discrimination, difficulty, or distractor quality may make the questions themselves hard to understand. In addition, if item analysis is not carried out, the quality of the items tested cannot be measured and their feasibility is unclear. Therefore, this study aims to answer the question of what the quality is of the Arabic language questions used in the Final Semester Assessment (PAS) for Arabic Language Class Vi at MTs Irsyadul Athfal for the 2023/2024 academic year in terms of discriminating power, difficulty, and distractors.

The previous studies that support this research are as follows. First, Robi'atul Laili Maulidiyah et al. (2020), Development of Question Items for Speaking Skills and Writing Skills for Class X Students at State Islamic Senior High School 1 Bojonegoro; the result was two designs of question items, 25 items for speaking skills (maharah kalam) and 25 items for writing skills. Second, Analysis of the Level of Difficulty and Distinguishing Power of Radiography Level 1 Training Exam Questions; the results were 2 difficult items, 14 medium items, and 22 easy items, while 7 general exam questions and 11 specific ones had low discriminating power. Third, Ratri Laksitaning Dewi, Analysis of the level of difficulty of the final assessment questions in class V sports and physical education lessons at Pangudang State Elementary School, Purwerejo district; of a total of 35 items, 18 were in the easy category, 13 in the medium category, and 4 in the difficult category.

The implication of this research is to report the results of the item analysis so that they can serve as comparison material for Arabic language test scores.

Method

This research is evaluation research with a quantitative descriptive approach, using documentation as the data collection technique. The data were taken from the Final Semester Assessment (PAS) prepared by the Subject Teacher Consultation (MGMP), which had been administered to class Vi students of MTs Irsyadul Athfal Gresik in the odd semester of the 2023/2024 school year, with a research sample of 13 students. The data were analyzed for discriminating power, difficulty level, and distractor effectiveness with the help of IBM SPSS Statistics 25 and Excel.

Results and Discussion

Tests are an important part of the instructional decision-making process because they serve as tools that gather various kinds of information. There are therefore many benefits to developing and using high-quality tests. A good test has features that must be met; under the applicable rules, the characteristics of a quality test mainly include: (1) discriminating power, (2) level of difficulty, and (3) distractor effectiveness. Foremost among these is discriminating power, the property of a test item that shows there is a difference between more able and less able students. The higher the discriminating power of a test item, the more likely it is to distinguish more able students from less able ones.

The next characteristic of a quality test is its level of difficulty, which indicates how difficult or easy each part of the test is. Reviewing and analyzing the difficulty level of each part of the test can determine whether the test is too difficult, difficult, moderate, easy, or too easy. In addition, the difficulty level of the exam can be calculated from the average test score. A high average score indicates that the exam was easy, while a low average score indicates that the exam was difficult or very difficult.

The next feature, examining the distribution of answer choices for the Final Semester Assessment (PAS) questions for Arabic Language Class Vi at MTs Irsyadul Athfal in 2023, is intended to determine whether the available answer choices are functional or not. An answer choice (distractor) can be said to be functional if: (1) at least 5% of test takers choose it, and (2) it is chosen more by the group of students who do not understand the material.

Analysis of the End-of-Semester Assessment Questions in Terms of Discriminating Power

The ability of a question to separate high-ability learners from low-ability learners is called discriminating power.[12] The discriminating power of a question is the ability of an item to distinguish students who have mastered the material being asked from students who have not. Discriminating power, according to Aiken in Mutholib, is the ability of an item to distinguish between test takers who have learned and those who have not.

The benefits of item discriminating power are as follows. First, it uses empirical data to improve the quality of each item: the discriminating power index can be used to determine whether each item is good, should be revised, or should be rejected. Second, it determines how far each item can identify or differentiate student abilities, that is, students who have understood and students who have not understood the material taught by the teacher. If an item cannot distinguish between these two groups, the item can be suspected for several reasons, such as: the answer key for the item is incorrect, or the item has two or more correct answers.

A higher discriminating power index indicates that the question is better able to distinguish students who have understood the material from students who have not. The discriminating power index ranges from -1.00 to +1.00. The higher the discriminating power of a question, the stronger and better it is. If the discriminating power is negative (< 0), it means that more students in the lower group (students who do not understand the material) answered the item correctly than students in the upper group. On this basis, the following standard can be used to determine how well each item discriminates.

Table 1: Discrimination Index

Index (D)        Classification   Interpretation
Less than 0.20   Poor             The item has very weak discriminating power (poor) and is considered not to have good discriminating power.
0.20 - 0.40      Satisfactory     The item has sufficient discriminating power.
0.40 - 0.70      Good             The item has good discriminating power.
0.70 - 1.00      Excellent        The item has excellent discriminating power.
Negative sign    -                The item has negative discriminating power (very bad).

To find the discriminating power of multiple-choice questions, use the following formula:

$$DP = \frac{BA - BB}{\tfrac{1}{2}N} \quad \text{or} \quad DP = \frac{2\,(BA - BB)}{N}$$

where:
DP = discriminating power of the question
BA = number of correct answers in the upper group
BB = number of correct answers in the lower group
N = number of students taking the test
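As a hedged illustration of the DP formula, the following Python sketch computes the index for one item scored 1 (correct) or 0 (wrong). The function names, the rule of splitting students into upper and lower halves by total score, and leaving out the middle student when N is odd are assumptions made for this example, not details stated in the study. The classification follows Table 1, with each lower bound treated as inclusive (also an assumption, since the table's ranges share their endpoints).

```python
from typing import Sequence


def discrimination_index(item_scores: Sequence[int],
                         total_scores: Sequence[float]) -> float:
    """DP = (BA - BB) / (N / 2) for one item scored 1 (correct) or 0 (wrong)."""
    n = len(item_scores)
    if n != len(total_scores) or n < 2:
        raise ValueError("need one item score and one total score per student")
    # Rank students from highest to lowest total test score.
    ranked = sorted(range(n), key=lambda i: total_scores[i], reverse=True)
    half = n // 2  # assumption: with odd N the middle student is left out
    upper, lower = ranked[:half], ranked[n - half:]
    ba = sum(item_scores[i] for i in upper)  # correct answers, upper group
    bb = sum(item_scores[i] for i in lower)  # correct answers, lower group
    return (ba - bb) / half


def interpret_dp(dp: float) -> str:
    """Map a DP value to the Table 1 classes (lower bounds inclusive)."""
    if dp < 0:
        return "negative (very bad)"
    if dp < 0.20:
        return "poor"
    if dp < 0.40:
        return "satisfactory"
    if dp < 0.70:
        return "good"
    return "excellent"


# Hypothetical data: six students, listed from highest to lowest total score.
totals = [10, 9, 8, 3, 2, 1]
item = [1, 0, 1, 1, 0, 0]
dp = discrimination_index(item, totals)  # BA = 2, BB = 1, N / 2 = 3
print(round(dp, 2), interpret_dp(dp))  # prints: 0.33 satisfactory
```

Note that this upper-lower index is not the same statistic as a corrected item-total correlation, so values computed this way need not match figures taken from an SPSS reliability output.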
The discriminating power obtained from the item analysis of the Arabic multiple-choice questions for class Vi, odd semester of the 2023/2024 academic year, using SPSS 26 is as follows.

Table 2: Distinguishing Power (Item-Total Statistics)

Item   Corrected Item-Total Correlation   Cronbach's Alpha if Item Deleted   Interpretation
1       0.00    0.49    Bad
2      -0.12    0.52    Very Bad
3       0.19    0.48    Bad
4       0.19    0.48    Bad
5       0.64    0.43    Good
6       0.83    0.37    Very Good
7       0.83    0.37    Very Good
8       0.41    0.45    Good
9       0.00    0.49    Bad
10      0.00    0.49    Bad
11      0.83    0.37    Very Good
12      0.64    0.43    Good
13     -0.25    0.55    Very Bad
14      0.19    0.48    Bad
15      0.21    0.47    Bad
16      0.00    0.49    Bad
17      0.30    0.46    Medium
18      0.00    0.49    Bad
19      0.00    0.49    Bad
20      0.41    0.45    Good
21      0.00    0.49    Bad
22     -0.09    0.52    Very Bad
23     -0.09    0.52    Very Bad
24      0.00    0.49    Bad
25      0.00    0.49    Bad
26      0.00    0.49    Bad
27     -0.23    0.54    Very Bad
28      0.12    0.48    Bad
29     -0.77    0.59    Very Bad
30      0.00    0.49    Bad
31      0.00    0.49    Bad
32     -0.36    0.53    Very Bad
33      0.64    0.43    Good
34      0.19    0.48    Bad
35     -0.36    0.53    Very Bad
36      0.00    0.49    Bad
37      0.47    0.43    Good
38      0.64    0.43    Good
39     -0.36    0.53    Very Bad
40      0.00    0.49    Bad
41      0.47    0.43    Good
42      0.65    0.40    Good
43      0.05    0.49    Bad
44     -0.34    0.57    Very Bad
45     -0.25    0.55    Very Bad

The discriminating power of the items is given in the Corrected Item-Total Correlation column. Among the Arabic multiple-choice questions for class Vi, odd semester of the 2023/2024 academic year, 11 questions are interpreted as "very bad", meaning the question is not suitable for use because the item has negative discriminating power; 21 questions are interpreted as "bad" and are considered not to have good discriminating power; 1 question is interpreted as "medium", meaning the item has sufficient discriminating power; 9 questions are interpreted as "good", meaning the item has good discriminating power; and 3 questions are interpreted as "very good", meaning the item has very good discriminating power.

Analysis of the End-of-Semester Assessment Questions in Terms of Level of Difficulty

According to Asmawi Zainul,[14] the percentage of examinees who answer a particular question correctly is called item difficulty. The level of item difficulty is usually represented by a p-value, where a larger p-value indicates lower item difficulty; that is, the question is easier, and vice versa. To determine the difficulty of the questions in the Arabic language subject, we must consider the proportion of examinees who answer each item correctly. For the 2023 End-of-Semester Assessment, the following steps were taken:

(1) Observing and correcting the test takers' answers.
(2) Entering each test taker's answer to each item into the data view of the SPSS application, 1 for correct and 0 for wrong, and calculating the scores for each item.
(3) Calculating the average (mean). In general, this uses the formula $P = \frac{B}{JS}$, where P = difficulty index, B = number of students answering correctly, and JS = total number of students.
(4) Reading the level of difficulty of the Arabic language items of the odd-semester End-of-Semester Assessment 2023 from the SPSS Version 25 output, as stated below.

Table 3. SPSS Output Results of Item Analysis of the Arabic Language Questions, Final Assessment, Odd Semester 2023 (mean per item)

Item: Mean
1: 1.0000    2: 0.6923    3: 0.9231    4: 0.9231    5: 0.9231
6: 0.8462    7: 0.8462    8: 0.9231    9: 0.0000    10: 1.0000
11: 0.7692   12: 0.9231   13: 0.6923   14: 0.9231   15: 0.9231
16: 1.0000   17: 0.8462   18: 0.8462   19: 1.0000   20: 0.9231
21: 1.0000   22: 0.3846   23: 0.3077   24: 1.0000   25: 1.0000
26: 1.0000   27: 0.6923   28: 0.6923   29: 0.1538   30: 0.0000
31: 1.0000   32: 0.0769   33: 0.9231   34: 0.9231   35: 0.0769
36: 0.0000   37: 0.8462   38: 0.9231   39: 0.0769   40: 1.0000
41: 0.8462   42: 0.8462   43: 0.8462   44: 0.4615   45: 0.6154

From the SPSS output, the means in the table were compared against the item difficulty index (IKB), which takes values from 0.00 to 1.00, with the following criteria: up to 0.20 is very difficult; up to 0.40 difficult; up to 0.60 medium; up to 0.80 easy; and up to 1.00 very easy.[15] The researchers tabulated the resulting classification as follows.

Table 4. Consultation Results of Difficulty Index (columns: No., Results, Difficulty Level (%), Description; the per-item classification is summarized in Table 5)

From this table it can be concluded that 7 items of the Arabic language subject in the odd-semester End-of-Semester Assessment 2023 have a level of difficulty (LD) of up to 0.20 (very difficult category); 2 questions have an LD of up to 0.40 (difficult category); 1 question has an LD of up to 0.60 (medium category); 6 items have an LD of up to 0.80 (easy category); and 29 questions have an LD of up to 1.00 (very easy category). The details of the status of these items are as follows:

Table 5. Status of Question Items

Question Status   Question Item Number                                                                                                       Total
Very difficult    9, 29, 30, 32, 35, 36, 39                                                                                                  7
Difficult         22, 23                                                                                                                     2
Medium            44                                                                                                                         1
Easy              2, 11, 13, 27, 28, 45                                                                                                      6
Very easy         1, 3, 4, 5, 6, 7, 8, 10, 12, 14, 15, 16, 17, 18, 19, 20, 21, 24, 25, 26, 31, 33, 34, 37, 38, 40, 41, 42, 43                29
Total                                                                                                                                        45

This means that, proportionally, 15% of the items of the odd-semester final assessment of the Arabic language subject in 2023 are categorized as very difficult, 4% as difficult, 2% as medium, 13% as easy, and 64% as very easy. The average level of difficulty over all items of the odd-semester final assessment of the Arabic language subject in 2023 is 71%.
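The difficulty classification can be sketched in Python as follows. The item means are copied from Table 3; the function names are hypothetical, and treating a value that falls exactly on a boundary (0.20, 0.40, 0.60, 0.80) as belonging to the lower category is an assumption, since only the upper bounds of the criteria are stated.

```python
from collections import Counter

# Item means copied from Table 3 (items 1 to 45, in order).
TABLE_3_MEANS = [
    1.0000, 0.6923, 0.9231, 0.9231, 0.9231, 0.8462, 0.8462, 0.9231, 0.0000,
    1.0000, 0.7692, 0.9231, 0.6923, 0.9231, 0.9231, 1.0000, 0.8462, 0.8462,
    1.0000, 0.9231, 1.0000, 0.3846, 0.3077, 1.0000, 1.0000, 1.0000, 0.6923,
    0.6923, 0.1538, 0.0000, 1.0000, 0.0769, 0.9231, 0.9231, 0.0769, 0.0000,
    0.8462, 0.9231, 0.0769, 1.0000, 0.8462, 0.8462, 0.8462, 0.4615, 0.6154,
]


def difficulty_index(correct: int, students: int) -> float:
    """P = B / JS: share of students who answered the item correctly."""
    if students <= 0 or not 0 <= correct <= students:
        raise ValueError("need students > 0 and 0 <= correct <= students")
    return correct / students


def classify_difficulty(p: float) -> str:
    """Classify P by the upper bounds 0.20, 0.40, 0.60, 0.80 and 1.00."""
    if p <= 0.20:
        return "very difficult"
    if p <= 0.40:
        return "difficult"
    if p <= 0.60:
        return "medium"
    if p <= 0.80:
        return "easy"
    return "very easy"


# With 13 students, 9 correct answers give P = 9/13.
print(round(difficulty_index(9, 13), 4))  # prints: 0.6923

# Count the items per category and collect the item numbers per category.
tally = Counter(classify_difficulty(p) for p in TABLE_3_MEANS)
items_by_status = {}
for number, p in enumerate(TABLE_3_MEANS, start=1):
    items_by_status.setdefault(classify_difficulty(p), []).append(number)
print(dict(tally))
```

Applied to the Table 3 means, the tally gives 7 very difficult, 2 difficult, 1 medium, 6 easy, and 29 very easy items, which agrees with the counts reported for Table 5.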
Given an average difficulty level of 71%, the Arabic language questions of the odd-semester PAS in 2023 can be said to have an easy level of difficulty and cannot be used as a standardized test. A good question is neither too easy nor too difficult, according to Suharsimi Arikunto. Questions that are too easy will not make students try harder to solve them, while questions that are too difficult will make students despair and lose the will to try again because the questions are out of their reach.

Analysis of the Odd-Semester Final Assessment Questions in Terms of Distractors

One way to find out how effective a distractor is, is to look at the pattern of student answer distribution. The pattern is obtained by counting how many test takers choose each option or choose none at all. A distractor is considered effective if at least 5% of all test takers have chosen it. The more test takers who select a distractor, the better the distractor works. Learners are counted as omitting if they ignore all options (select none). A test is said to be good if omissions amount to no more than 10% of the learners.

An option is said to be good if it has distracting power, that is:
(1) at least 5% of the learners choose it;
(2) the group of less able students chooses it more than the group of more able students.

Formula: $D = \frac{A}{N} \times 100\%$

Description:
D: distractor level (%)
A: number of students who choose that option
N: total number of students

The criteria are as follows:
- If D ≥ 5%, the option is accepted because it is good.
- If 0 < D < 5%, the option is revised/rewritten because it is less good.
- If D = 0, the option is rejected because it is not good.
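The formula and criteria can be illustrated with a small Python sketch. The function names and the sample answers are hypothetical, and `None` is used here, as an assumption, to mark a student who selected no option. The sketch covers only the 5% rule and the omission rate; comparing how often the upper and lower groups choose a distractor is not included.

```python
from collections import Counter
from typing import Dict, Optional, Sequence


def distractor_levels(answers: Sequence[Optional[str]],
                      options: Sequence[str],
                      key: str) -> Dict[str, float]:
    """D = A / N * 100 for every wrong option; None marks an omitted answer."""
    n = len(answers)
    if n == 0:
        raise ValueError("no answers given")
    counts = Counter(answers)
    # Skip the answer key: only the wrong options are distractors.
    return {opt: 100 * counts[opt] / n for opt in options if opt != key}


def judge_distractor(d: float) -> str:
    """Apply the criteria: D >= 5 accepted, 0 < D < 5 revised, D = 0 rejected."""
    if d >= 5:
        return "accepted"
    if d > 0:
        return "revised"
    return "rejected"


def omission_rate(answers: Sequence[Optional[str]]) -> float:
    """Percentage of students who selected no option at all."""
    if not answers:
        raise ValueError("no answers given")
    return 100 * sum(a is None for a in answers) / len(answers)


# Hypothetical item with key "A", answered by 40 students.
answers = ["A"] * 32 + ["B"] * 7 + ["C"] * 1
levels = distractor_levels(answers, options=["A", "B", "C", "D"], key="A")
verdicts = {opt: judge_distractor(d) for opt, d in levels.items()}
print(levels)    # prints: {'B': 17.5, 'C': 2.5, 'D': 0.0}
print(verdicts)  # prints: {'B': 'accepted', 'C': 'revised', 'D': 'rejected'}
```

With only 13 test takers, a single student choosing an option already gives D = 1/13, about 7.7%, so at the level of individual options the "revised" band (0 < D < 5%) can only occur with larger groups such as the 40 students assumed here.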
The distractor levels of the answer options for each item are read as follows:

Table 6. Distractor level (distractor level in percent for the answer options of items 1 to 45; cell values not recoverable)

There should be a difference in the frequency of answers between upper-group students and lower-group students. Based on the analysis of distractor effectiveness that has been carried out, most of the distractors in the Odd-Semester Final Examination at MTs Irsyadul Athfal Jatirembe Benjeng Gresik do not function properly: only 1 item (2.22%) has very good distractors, 34 items (75.55%) have fairly good distractors or need revision, and 10 items (22.22%) must be replaced because the distractors do not work (rejected).

A distractor is said to be functional if it is chosen by at least 5% of the students who take the test and is chosen by students who have not mastered the test material. It can be said to be misleading if it is chosen by most students.[21] Of the 45 items tested, there were 4 such items, namely numbers 9, 35, 36, and 39.

Conclusion

The analysis of discriminating power for the multiple-choice questions of the End-of-Semester Assessment of the Arabic language subject, class Vi, odd semester, at MTs Irsyadul Athfal in the 2023/2024 academic year found 11 questions interpreted as "very bad", meaning the question is very unfit for use because the item has negative discriminating power; 21 questions interpreted as "bad", which are considered not to have good discriminating power; 1 question interpreted as "medium"; 9 questions interpreted as "good"; and 3 questions interpreted as "very good".

In terms of difficulty, 15% of the Arabic language items of the Odd-Semester Final Assessment in 2023 are categorized as very difficult, 4% as difficult, 2% as medium, 13% as easy, and 64% as very easy. The average level of difficulty over all Arabic language items of the Odd-Semester Final Assessment in 2023 is 71%. Thus the assessment can be said to have an easy level of difficulty, in the sense that it cannot be used as a standard test.

The distractors do not function properly: only 1 item (2.22%) has very good distractors, 34 items (75.55%) have fairly good distractors or need revision, and 10 items (22.22%) must be replaced because the distractors do not work (rejected).

References