MATAI: International Journal of Language Education
website: https://ojs3. id/index. php/matail
Volume . No. Pp. 1-13, accepted on 16 November 2021
e-ISSN.

An Analysis of EFL Teacher-Made Tests: Content Knowledge, Cognitive, and Authentic Evidences

Rice Pesiwarissa1*, Pattimura University, Indonesia, e-mail: ricepesiwarissa2014@gmail.
Karolis Anaktototy2, Pattimura University, Indonesia, e-mail: wakwyoya@gmail.
Hendrik Jacob Maruanaya3, Pattimura University, Indonesia, e-mail: hjmaruanaya@gmail.

Abstract
Teacher-made tests have become common practice in schools for assessing students' mastery of content knowledge and skills. When developing such tests, teachers frequently prefer the multiple-choice format. This study examines the content knowledge, cognitive, and authentic evidence of a teacher-made multiple-choice test in an EFL learning context at a junior high school. The source of data was the English midterm test document for grade 7, which consists of 25 multiple-choice items. The analysis showed that the test items functioned primarily at the content knowledge of linguistic competence (52%), discourse competence (24%), and interactional competence (24%). The cognitive functioning levels were C1 (4%), C2 (32%), C3 (32%), and C4 (32%). In terms of authenticity, however, 84% of the items were considered less authentic, with the result of the analysis showing the mean score of the raters is 2. Given these findings, the teacher needs to develop more authentic tasks in the test.

Keywords: teacher-made test, EFL learning, content knowledge, cognitive, authentic evidence

DOI: https://doi. org/10. 30598/matail.

INTRODUCTION

The teacher plays an important role in helping students master English well. Therefore, teachers must be competent in designing, implementing, and evaluating learning.
At the end of the learning process, the teacher must measure how far students have absorbed the material that has been taught by giving a test. The test is designed and used to examine or assess the learner's acquired knowledge and skills. According to Brown, a test is a method of measuring a person's ability or knowledge in a given area. In other words, a test is a method of measuring mastery of the materials that have been given. The teacher or some other institution may set and grade the tests. Based on the test-maker, there are two types of tests: standardized tests and teacher-made tests (Arikunto). A teacher-made test is a test designed by the teacher to measure the students' acquired knowledge and skills. Teacher-made tests may also be employed as a tool for formative evaluation. The teacher creates the test to determine the students' achievement and competency in a certain area. For that reason, teachers need to make good quality tests, so that the tests can measure the students' achievements accurately.

BACKGROUND

To develop a good quality test, teachers must be competent in test design. According to Brown, there are five criteria for judging the quality of a test: practicality, reliability, validity, authenticity, and washback. However, it remains open to question whether a test made by the teacher is of good quality, because the teacher rarely tries out and analyzes the test before giving it to the students. Mardapi, in Widoyoko, stated that there are six steps to develop a test: (1) create a table of specifications, (2) create the stem items of the test, (3) analyze the test items, (4) do the tryout, (5) analyze the items, and (6) revise. Given these steps, the teacher should try out and analyze the test in order to know its quality. By analyzing the test, the teacher will know which items can be used and which must be revised.
Furthermore, by analyzing the test, the teacher will obtain the information needed to determine the students' progress. Well-constructed tests give students the opportunity to assess their knowledge, and with immediate and constructive feedback the learners can improve their performance. In analyzing a test, validity and reliability are the two major criteria that strongly determine its quality. However, authenticity is also an important criterion that has to be considered. Citing Bachman and Palmer, Brown defines authenticity as "the degree of correspondence of the characteristics of a given language test task to the features of a target language task". According to Brown, many test items fail to replicate real-world tasks because they are contrived or unnatural in their attempt to target a grammatical form or lexical item. Furthermore, when discussing authenticity, it is important to define construct validity and content validity, because authenticity as a well-operationalized construct is needed to achieve sufficient content validity. Content validity in turn helps to ensure that language tests accurately assess communicative language skills and the cognitive level required by the curriculum. These criteria must be considered in analyzing the test so that good quality tests can be obtained. Therefore, this study aims to examine the content knowledge, cognitive, and authentic evidence of teacher-made multiple-choice tests in an EFL learning context at a junior high school.

LITERATURE REVIEW

Content Knowledge

The content or language teaching area in an EFL learning context can be reflected in communicative competence. According to Celce-Murcia, there are six communicative competences: sociocultural competence, discourse competence, linguistic competence, formulaic competence, interactional competence, and strategic competence.
Sociocultural competence reflects the speaker's perception of how to behave and express messages appropriately within the overall social and cultural context of communication, in line with the pragmatic factors relevant to language variation.

Discourse competence involves the selection, sequencing, and arrangement of words, structures, sentences, and utterances to achieve a unified spoken or written text. The content areas that contribute to discourse competence are cohesion, deixis, coherence, and generic structure.

Linguistic competence involves the essential elements of communication and includes four types of knowledge: morphological knowledge, which covers parts of speech, grammatical inflection, and productive derivational processes; lexical knowledge of both content words (nouns, verbs, adjectives) and function words (pronouns, determiners, prepositions, and verbal auxiliaries); phonological knowledge for pronunciation; and syntactic knowledge, which covers constituent/phrase structure, word order, basic sentence types, modification, coordination, subordination, and embedding.

Formulaic competence involves the fixed and prefabricated parts of language that speakers use often in everyday interactions. These include fixed phrases and formulaic chunks (of course, how do you do, etc.), collocations (verb-object such as spend money, adverb-adjective such as mutually intelligible, and adjective-noun such as tall building), idioms, and lexical frames.

Interactional competence involves at least three sub-components: actional competence, which is knowledge of how to perform common speech acts and speech act sets in the target language; conversational competence, which includes knowing how to initiate and end conversations, establish and change topics, and gain, hold, and yield the floor; and nonverbal/paralinguistic competence, which involves body language, non-verbal turn-taking signals, gestures, eye contact, backchannel behaviors, haptic behavior, proxemics, and non-linguistic utterances.
Strategic competence refers to the mastery of communication strategies.

Cognitive Level

According to the updated version of Bloom's Taxonomy, there are six levels of cognitive learning (Anderson). Each level has its own conceptual framework. Remembering (C1): retrieving, recalling, or recognizing important information from long-term memory. Understanding (C2): conveying understanding through one or more types of explanation. Applying (C3): applying knowledge or a skill in a new situation. Analyzing (C4): dividing a substance into its basic pieces and identifying how the parts relate to one another and/or to an overall structure or purpose. Evaluating (C5): making decisions on the basis of criteria and standards. Creating (C6): combining parts to create a new logical or effective entity, or reorganizing components to create a new pattern or structure.

Authenticity

Bachman and Palmer, in Brown, define authenticity as the degree of correspondence of the characteristics of a given language test task to the features of a target language task, and then suggest an agenda for identifying those target language tasks and for transforming them into valid test items. According to Brown, authenticity may be present in a test in the following ways: (1) the language in the test is as natural as possible; (2) items are contextualized rather than isolated; (3) topics are meaningful (relevant, interesting) for the learner; (4) some thematic organization to items is provided, such as through a story line or episode; and (5) tasks represent, or closely approximate, real-world tasks.

METHODOLOGY

The researcher employed a descriptive quantitative research design. Ary et al. state that quantitative research uses objective measurement to gather numeric data that are used to answer questions or test predetermined hypotheses. This study was conducted at SMP Negeri 9 Ambon, which is located on Jalan Wolter Monginsidi, Lateri, Ambon.
The sample of this study was a test document: the teacher-made English midterm test for grade VII in the second semester of the 2020/2021 academic year. The procedure of data collection involved instrument development and validation before the research took place. Two major techniques of data collection were used in this study, namely test administration and document analysis. The test document was analyzed to obtain data on the authenticity of the test and on the EFL content knowledge and cognitive level of the multiple-choice test made by the teacher. In collecting the data, the researcher used documents and a checklist table as research instruments.

To analyze the content of the test, the researcher constructed an indicator checklist by referring to the content and standard competency of EFL teaching and learning for SMP grade VII based on Curriculum 2013, and to the concept of content knowledge categorized into communicative competence by Celce-Murcia. The analysis of cognitive level was based on Bloom's revised taxonomy (Anderson). To analyze the authenticity of the test, the researcher constructed an authenticity checklist table developed from the criteria set by Brown. Three raters assessed the authenticity of the test items. The first rater was a senior lecturer in the English Study Program who teaches English Language Testing. The second rater was an English supervisor of Dinas Pendidikan Kota Ambon, and the third rater was the researcher.

The collected data on the content knowledge and cognitive level of the test were analyzed quantitatively. The researcher calculated the frequency and found the percentage of the test items by using the percentage formula.
$$\% = \frac{f}{N} \times 100$$

Where:
% = percent
f = frequency
N = total number of test items

The collected data on the authenticity of the test were analyzed by comparing the scores from the three raters. To determine the extent of the authenticity of the test and the level of agreement among the raters, the researcher used the Pearson Product Moment with the assistance of the SPSS application to correlate the data. The size of the relationship among the raters' scores is expressed as a number called the correlation coefficient (R), which lies between -1 and 1. The closer the coefficient is to 1, the stronger the positive relationship. The closer it is to -1, the stronger the relationship, but in the negative direction. A correlation coefficient of zero means there is no relationship at all among the raters' scores. Napitulu et al. provide the interpretation of the correlation coefficient in the table below:

Table 1. Interpretation of Correlation Coefficient

Coefficient Interval      Correlation
0.00 to below 0.20        Very Weak
0.20 to below 0.40        Weak
0.40 to below 0.60        Medium
0.60 to below 0.80        Strong
0.80 to 1.00              Very Strong

FINDINGS

The content of the test developed by the teacher of SMP Negeri 9 Ambon was analyzed based on the conceptual framework of communicative competence (Celce-Murcia). Six elements represent language communicative competence: sociocultural competence, discourse competence, linguistic competence, formulaic competence, interactional competence, and strategic competence. The analysis of the test content found that the test represents discourse, linguistic, and interactional competence, while sociocultural, formulaic, and strategic competence were not included in the test items. Each competence tested in the midterm test is described below.

Discourse Competence

Discourse competence relates to 'the selection, sequencing, and arrangement of words, structure, sentences, and utterances to achieve a unified spoken or written text' (Celce-Murcia).
From the analysis of the test content, it was found that 6 questions were constructed to test this knowledge.

Table 2. The Result of Discourse Competence Analysis

Element of Discourse Competence   Midterm Test Item No.   Total
Cohesion                          -                       -
Deixis                            -                       -
Coherence                         8, 17, 18, 19           4 items
Generic structure                 9, 16                   2 items
Total = 6

The data show that, of the 25 multiple-choice test items, six dealt with coherence and generic structure as parts of discourse competence. Knowledge of coherence accounts for 4 items, while generic structure accounts for only 2 items. Applying the percentage formula, coherence makes up 16% of the test and generic structure 8%.

Linguistic Competence

Linguistic competence relates to syntactic (sentence patterns), morphological, lexical, and phonological knowledge. From the analysis of the test content, it was found that 13 questions were constructed to test this knowledge.

Table 3. The Result of Linguistic Competence Analysis

Element of Linguistic Competence   Midterm Test Item No.                 Total
Syntactic                          1, 2, 3, 4, 10, 11, 12, 13, 14, 15    10 items
Morphological                      -                                     -
Lexical                            22, 23, 24                            3 items
Phonological                       -                                     -
Total = 13

The data show that 13 test items dealt with linguistic competence. Syntactic knowledge accounts for 10 items (40%) and lexical knowledge for 3 items (12%), while morphological and phonological knowledge have no place in the test.

Interactional Competence

Interactional competence deals with actional competence and conversational competence. Actional competence relates to knowledge of language functions, which cover interpersonal exchange, information, opinions, feelings, suasion, problems, and future scenarios, while conversational competence relates to knowledge of extending the conversation. Based on the content of the test, it was found that 6 questions were developed to assess students' knowledge of interactional competence.

Table 4. The Result of Interactional Competence Analysis

Element of Interactional Competence   Midterm Test Item No.    Total
Actional Competence                   5, 6, 7, 20, 21, 25      6 items
Conversational Competence             -                        -
Total = 6

The data show that 6 items relate to actional competence, covering asking for and giving factual information and interpersonal exchange, while conversational competence has no place in the test. Actional competence makes up 24% of the test. The following table shows all the competence elements tested in the midterm test constructed by the teacher.

Table 5. Content of the Test in Terms of the Communicative Competence Framework

No.   Content Knowledge   Total Test Items   Percentage
1     Sociocultural       0                  0%
2     Discourse           6                  24%
3     Linguistic          13                 52%
4     Interactional       6                  24%
5     Formulaic           0                  0%
6     Strategic           0                  0%

Linguistic competence accounts for 13 items (52%) and represents the majority of the content knowledge, followed by discourse competence with 6 items (24%) and interactional competence with 6 items (24%). Sociocultural competence, formulaic competence, and strategic competence were not found in the content of the test.

The analysis of the content of the test items in terms of cognitive level was based on Bloom's revised taxonomy (Anderson), which covers the lowest to the highest order thinking skills. The result of the cognitive level analysis can be seen in the table below.

Table 6. The Result of Cognitive Level Found in the Test

Cognitive Level   Total     Percentage
C1                1 item    4%
C2                8 items   32%
C3                8 items   32%
C4                8 items   32%

Detail of Basic Competence
Remembering the structure of the descriptive text.
Understanding the structure of the descriptive text by giving information related to the description of people.
Understanding the language feature of descriptive text by giving information related to animal descriptions.
Understanding the social function of descriptive text by giving information related to the description of objects.
Understanding the structure of descriptive text by giving information related to the description of objects.
Identifying the language feature of transactional texts that involve the act of giving and asking for information related to the character of objects.
Identifying the language feature of transactional texts that involve the act of giving and asking for information related to the character of people.
Applying the language feature of descriptive text by giving information related to people.
Applying descriptive text structure by giving information related to a description.
Applying the language feature of descriptive text by giving information related to animals.
Applying the structure of the descriptive text by giving information related to animals.
Analyzing the structure of the descriptive text by giving information related to the description of people.
Analyzing the structure of transactional texts involving the act of giving and asking for information related to the character of people.
Analyzing the structure of transactional texts involving the act of giving and asking for information related to the character of objects.
Comparing the language feature of descriptive texts by giving information related to animals.
Comparing the structure of the descriptive text by giving information related to the description of objects.
Comparing the structure of descriptive texts by giving information related to people.

In terms of the intended cognitive level present in the developed test, cognitive level 2 (C2), level 3 (C3), and level 4 (C4) were dominant. Each accounts for 8 items (32%), while cognitive level 1 (C1) accounts for only one item (4%).
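As an illustration only, the percentage formula from the methodology can be applied to the item counts reported above for each cognitive level. The sketch below is not the procedure used in the study; the names `counts`, `percentage`, and `distribution` are hypothetical and chosen for this example.

```python
# Item counts per cognitive level, as reported in the findings
# (C2, C3, C4: 8 items each; C1: 1 item).
counts = {"C1": 1, "C2": 8, "C3": 8, "C4": 8}


def percentage(frequency: int, total: int) -> float:
    """Percentage formula from the methodology: % = f / N x 100."""
    if total <= 0:
        raise ValueError("total must be positive")
    # Multiply before dividing so whole-number percentages stay exact.
    return frequency * 100 / total


# N is the total number of test items (25 in the midterm test).
total_items = sum(counts.values())

# Share of the test occupied by each cognitive level.
distribution = {level: percentage(f, total_items) for level, f in counts.items()}

for level, share in distribution.items():
    print(f"{level}: {counts[level]} item(s), {share:.0f}%")
```

The same helper can be reused for the content knowledge counts; for example, `percentage(13, 25)` gives the share of the 13 linguistic competence items.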
The authenticity of the test was measured based on Brown's theory of authenticity in language tests, which covers natural language use, contextualized rather than isolated items, relevancy, thematic organization, and real-world-like tasks. A three-point Likert scale, with 3 indicating 'agree', 2 indicating 'slightly agree', and 1 indicating 'disagree', was used to assess the authenticity of the developed test. Three raters, including the researcher, assessed the authenticity of the test. The first rater was a senior lecturer from the English Study Program who teaches English Language Testing. The second rater was an English supervisor of Dinas Pendidikan Kota Ambon, and the third rater was the researcher. The five authenticity criteria in the table are coded: (1) natural language use, (2) contextualization, (3) relevancy, (4) thematic organization, and (5) real-world task. The result of the authenticity analysis can be seen in the table below:

Table 7. The Result of Authenticity Analysis 1
[Table: scores given to each item by Rater 1, Rater 2, and Rater 3 on criteria 1 to 5, with each rater's average per item]

In terms of the authenticity of the developed test items, the result of the analysis shows that the three raters gave different scores on the authenticity criteria in the test. The data show that the majority of the test items fulfill the relevancy criterion, which accounts for 21 items (84%); 18 items meet the contextualization criterion (72%); and 16 items fulfill the thematic organization criterion (64%).
On the other hand, only 8 items meet the criterion of representing a real-world task (32%), while the criterion of natural language use accounts for only 6 items (24%). Overall, the analysis of the authenticity can be seen in the table below:

Table 8. The Result of Authenticity Analysis 2
[Table: score given to each item by Rater 1, Rater 2, and Rater 3, with the average score per item]

The data show that only 4 items are considered to meet the authenticity criteria, shown through natural language, contextualization, meaningful topics, thematic organization, and tasks that represent, or closely approximate, real-world tasks. The items assessed as meeting the five criteria are items number 20, 21, 22, and 23. Therefore, those 4 items (16%) can be declared authentic. On the other hand, 21 items (84%) are considered less authentic by the three raters. Although the raters differ in the results of their analysis, their scores do not differ significantly, as shown in the table below.

Table 9. Descriptive Statistic
[Table columns: Range, Min., Max., Mean]

The minimum score given by the raters was 2.10, the maximum score was 3.00, and the mean was 2. Furthermore, the results of the analysis by the three raters were correlated using the SPSS application to measure the level of agreement among the three raters on the authenticity of the test. The correlation results are shown in the following table:

Table 10. The Results of Authenticity Correlation Analysis

Pearson Correlation   Rater_1   Rater_2   Rater_3
Rater_1               1         0.974     0.794
Rater_2               0.974     1         0.836
Rater_3               0.794     0.836     1
**. Correlation is significant at the 0.01 level.

The correlation coefficient between rater 1 and rater 2 is 0.974, so the level of correlation is very strong. The correlation coefficient between rater 1 and rater 3 is 0.794, so the level of correlation is strong. The correlation coefficient between rater 2 and rater 3 is 0.836, which is categorized as very strong.

DISCUSSION

Content knowledge and language skills are two elements that cannot be separated from language learning. These two elements support one another in making a proficient language user. Content knowledge such as discourse, linguistic, and interactional knowledge plays, to a certain degree, an important role in reading, speaking, and writing. From the analysis, it is evident that content knowledge such as discourse, linguistic, and interactional knowledge is integrated in speaking, reading, and writing. In discourse competence, the students closely engage with how to construct a written or spoken text. They need to have the relevant schemata for selecting, sequencing, and arranging words, phrases, clauses, sentences, and utterances in order to create unified information to convey in the context of communication. Discourse knowledge that deals with coherence is integrated into students' writing, speaking, and reading skills. Cohesion and coherence are also essential for students to interpret a text well. This is in line with a study conducted by Pan, which found that in many international tests such as IELTS and TOEFL, coherence is an important marking criterion for writing (written or spoken text); therefore, students are required to write English compositions coherently. Linguistic knowledge is also related to students' writing skills, as stated in Menggo et al.: English writing skill encourages students to employ their understanding of micro linguistics, i.e., morphology, syntax, lexicon, and semantics, that has already been learned in English class. Actional competence, in turn, is integrated in speaking and writing skills, since language is an effective means of expressing ideas and feelings and of asking for and giving information, in both spoken and written form, to communicate or interact with other people (Anggraini). Moreover, it gives students a way to communicate and helps them socialize in society.
In terms of students' cognitive ability, the multiple-choice items developed by the teacher measure cognitive levels C1, C2, C3, and C4, while no item measures the students' higher-order thinking skills at C5 and C6. It is difficult for the teacher to assess the students' higher-order thinking skills, which cover students' creativity. Carneson et al. argued that creativity cannot easily be tested by using multiple-choice items. Discursive questions, such as the "essay-type" question, are ideal for testing it. However, Scully argued that multiple-choice items have the capacity to assess higher-order thinking skills. Therefore, teachers need to learn the strategies for constructing multiple-choice items that assess students' higher-order thinking.

Based on the result of the authenticity analysis, this research found that the four multiple-choice items considered authentic meet the five criteria proposed by Brown. The researcher recognized that most of the test tasks had problems fulfilling the naturalness of the language used in the test instructions, stems, and answer options. Although the language test was not designed to measure specific grammatical or lexical issues, the teacher should minimize linguistic errors in order to provide a highly authentic reading test. To minimize test takers' difficulty in comprehending the test instructions, there should be no linguistic faults in the test tasks, such as typographical errors or problems with lexis, word order, grammar (syntactic concerns), and diction. It was also found that most test texts have problems fulfilling the naturalness of the language used in the test passages and real-world representativeness. Even though the topics of the passages were reasonable and based on real-world contexts, almost all of the passages included in the reading exam failed to portray the real-world context.
Authentic reading passages are derived from real-world sources; however, the teacher who developed the English test items did not mention the sources from which the passages were taken.

CONCLUSION AND SUGGESTION

Based on the results of the research, the researcher concludes that discourse competence, linguistic competence, and interactional competence are the three main areas of content knowledge tested for grade seven students in junior high school, in accordance with the junior high school curriculum. In terms of measuring the students' cognitive level, the test items measure only lower to middle level thinking skills, while higher-order thinking skills are not found in the test items. Therefore, teachers need to learn the strategies for constructing multiple-choice items that assess students' higher-order thinking, which covers analytical, critical, and creative thinking. Concerning the authenticity of the test, most of the problems that appear in the test items relate to the naturalness of the language used in the test passages and to real-world representativeness. Therefore, teachers need to avoid typographical mistakes, lexical problems, and reading passages from unknown sources in order to avoid confusing test takers about the test tasks.

REFERENCES