Integrated Science Education Journal Vol. No. January 2025, pp. ISSN: 2716-3725. DOI: 10.37251/isej.

Development and Validation of a Taxonomy for Specific Questions Based on Deficiencies in Logical Reasoning

Li Qiu1, Fumihito Ikeda2,*, Naoko Yamashita1
1Department of Natural History Science, Graduate School of Science, Hokkaido University, Hokkaido, Japan
2Faculty of Liberal Arts, Science and Global Education, Osaka Metropolitan University, Osaka, Japan

Article Info
Article history: Received Aug 12, 2024; Revised Oct 21, 2024; Accepted Dec 23, 2024; OnlineFirst Jan 15, 2025
Keywords: Logical Reasoning; Research Questions; Specific Questions; Taxonomy

ABSTRACT
Purpose of the study: This study aims to develop a taxonomy of specific questions based on deficiencies in three types of logical reasoning: inductive, deductive, and hypothetical reasoning. The study also seeks to validate the quality of this taxonomy.

Methodology: This study is a developmental research project that used a "three-level model" combining deductive approaches with empirical data analysis to develop and validate a taxonomy of questions. A convenience sampling method was employed, whereby 57 questions were selected from 1,164 posed by graduate students at a university in Japan. The question data were categorized by two raters. Descriptive statistics and the kappa coefficient were used to verify the quality of the categorization.

Main Findings: The taxonomy of specific questions comprised nine categories, with three categories assigned to each type of logical reasoning. The comprehensiveness assessment showed that all questions were assigned to one category, with at least one question included in each category. The results for exclusivity and objectivity revealed high kappa coefficients, indicating a high degree of agreement between raters.
However, there was some confusion among question categories within the same reasoning method during categorization.

Novelty/Originality of this study: This study develops a taxonomy of specific questions based on deficiencies in logical reasoning. The framework facilitates the design of strategies that enable learners to generate specific questions. The proposed method of assigning questions enables an objective assessment of learners' proficiency in formulating specific questions and is expected to guide the support process more effectively.

This is an open access article under the CC BY license.

Corresponding Author: Fumihito Ikeda, Faculty of Liberal Arts, Science and Global Education, Osaka Metropolitan University, Osaka 558-8585, Japan. Email: fumike@omu.

Journal homepage: http://cahaya-ic.com/index.php/ISEJ

INTRODUCTION
Research questions are an indispensable element of scientific inquiry. Qiu et al. highlighted the importance of logic in science, suggesting that questions probing deficiencies in the three types of logical reasoning (inductive, deductive, and hypothetical) can potentially evolve into research questions, thereby contributing to advances in science and technology. Thus, the ability to ask such questions, referred to as "questioning ability", is crucial in formulating research questions. To assess this ability, a method has been developed that classifies questions based on three common elements of logical reasoning: premises, rules, and results. Questions containing only one of these elements are considered ambiguous, whereas those that specify two elements and inquire about the remaining element are categorized as first-level questions probing the reasoning itself. Questions that include all three elements and further explore the relationships between them are classified as second-level specific questions.
An analysis of questions posed by graduate students in a course revealed a tendency to ask ambiguous or first-level inductive questions: only 57 of 1,164 questions reached the level of second-level specific questions, suggesting that learners do not sufficiently understand how to question deficiencies in logical reasoning.

A taxonomy is a classification system that groups similar objects within a specific domain based on different characteristics, providing a set of decision rules. When question categories are presented, learners can understand the types of questions they are expected to generate and grasp the function of each category, which facilitates question generation. Additionally, in developing learners' abilities and performance, it is crucial to employ objective methods and rules. This approach enables accurate assessment of learners' skills and facilitates the provision of effective feedback, thereby enhancing learning outcomes. Therefore, to enable learners to generate specific questions probing logical reasoning deficiencies, it is necessary to first provide a taxonomy that includes categories of specific questions and an objective method for assigning individual questions to these categories.

Previous research on question taxonomies has been primarily based on cognitive levels or interrogatives. For example, Marbach-Ad and Sokolove categorized students' questions in science classes, according to the level of thinking required, into questions based on basic misconceptions, questions about concepts and facts, and questions seeking information extension, and proposed classification criteria for each. Similarly, Chin categorized questions based on the cognitive processes required for completing tasks such as decision-making, distinguishing explanatory questions that utilize causal thinking, hypothetical questions that test hypotheses, and evaluative questions that involve comparisons, and provided prompts for each type of question.
With a taxonomy based on cognitive levels, the generated questions integrate new information with existing knowledge and reconstruct learners' schemas, thereby enhancing understanding and learning or supporting the progression of specific tasks. Meanwhile, journalistic questions, commonly categorized as the 5W and 1H (Who, What, Where, When, Why, and How), are well known to organize information effectively and to fully express ideas on specific topics in writing.

Learners do not sufficiently understand how to formulate questions that probe deficiencies in logical reasoning, which are crucial in scientific inquiry. Despite this, the development of a taxonomy designed to aid in understanding these questions has not yet progressed. Thus, this study aims to develop and validate a taxonomy of specific questions based on deficiencies in the three types of logical reasoning. The research questions are as follows: (1) What are the categories and assignment methods of specific questions? (2) To what extent is the quality of the taxonomy of specific questions maintained?

RESEARCH METHOD
Developmental research is the systematic study of designing, developing, and evaluating instructional programs, processes, and products to ensure they are internally consistent and effective. This study is a developmental research project that develops and validates a new method for categorizing questions.

To develop a taxonomy of specific questions, a three-level model comprising conceptual, empirical, and indicator levels was adopted. This model is recognized as a general and useful method for taxonomy development. According to Emamjome et al., the research begins by deductively constructing the taxonomy based on theoretical foundations (conceptual approach). This is followed by an examination of empirical cases (conceptual to empirical) to assess its alignment with the conceptualization. Following this three-level model, the research process was structured into two stages.
In the first stage, a conceptual approach was employed to deductively develop a taxonomy of specific questions. This stage involved constructing specific question categories and developing a question assignment methodology. In the second stage, the quality of the taxonomy was quantitatively assessed using a corpus of questions from a course.

Constructing Specific Question Categories
Specific question categories were constructed from three perspectives on questioning the logic of inference: Perspective 1 probes the set of propositions that form the starting point of the inference; Perspective 2 questions the validity of the connections between the propositions; and Perspective 3 queries the process of inferring a conclusion from these sets of propositions.

Inductive reasoning, which involves deriving a rule from multiple premises and results, consists of stages such as the acquisition of propositions and hypothesis derivation through generalization from the collected propositions. These stages have inherent limitations. In the proposition acquisition phase, premature generalizations based on a few or biased instances are common. During the hypothesis formation phase, there is a tendency to derive regularities between repeating or temporally adjacent events by assuming the Principle of Uniformity of Nature (similar conditions will lead to similar results), which may overlook undiscovered conditional propositions. Furthermore, the ease of inducing regularity from similarities among instances can lead to the pitfall of Occam's Razor (the simpler theory is better), which might ignore the differences between individual instances or conditions. Based on these limitations, three categories of specific questions are proposed (Figure 1):
A1. "Questions Probing Premature Generalization," questioning the possibility of generalizing based on a limited number of instances or biased examples;
A2. "Questions Probing Fluctuations in Uniformity," questioning the robustness of the relationship between premises and results; and
A3. "Questions Probing Occam's Razor," questioning the validity of the generalization process based on premises and results.

Figure 1. Question categories related to deficiencies in inductive reasoning

Deductive reasoning involves applying existing rules to premises to predict a new result. Like inductive reasoning, deductive reasoning comprises the stages of proposition acquisition and result prediction, with similar limitations. Therefore, three categories are proposed (Figure 2):
B1. "Questions Probing Premature Prediction," questioning the issue of predicting a result too quickly by ignoring implicit premises and rules;
B2. "Questions Probing Fluctuations in Logic," questioning the robustness of the relationship between individual premises and rules; and
B3. "Questions Probing Occam's Razor," questioning the validity of predicting a result from premises and rules.

Figure 2. Question categories related to deficiencies in deductive reasoning

Hypothetical reasoning involves applying rules to observed results to infer a premise retroactively; thus, the question categories for hypothetical reasoning are similar to those for deductive reasoning. Consequently, three categories are established (Figure 3):
C1. "Questions Probing Premature Hypothesization," questioning the issue of quickly inferring a premise with insufficient rules applied to observed results;
C2. "Questions Probing Fluctuations in Logic," questioning the robustness of the relationship between the results and individual rules; and
C3. "Questions Probing Occam's Razor," questioning the validity of the process of inferring a premise based on results and rules.

In. Sci. Ed. Vol. No. January 2025: 6 - 14
Figure 3. Question categories related to deficiencies in hypothetical reasoning

Developing a Question Assignment Methodology
A method was devised to assign questions to the aforementioned categories. First, the specific elements of premise, rule, and result identified in the question text are used to determine which type of logical reasoning the question pertains to. Next, the remaining elements of premise, rule, and result in the question text are further specified, and the type of deficiency being questioned is determined based on the following criteria, so that each question is assigned to the appropriate category.

Questions Probing Deficiencies in Inductive Reasoning
A1 Questions Probing Premature Generalization: addressing the sufficiency and comprehensiveness of premises and results needed for rule derivation.
A2 Questions Probing Fluctuations in Uniformity: addressing deficiencies in premises affecting results.
A3 Questions Probing Occam's Razor: addressing the validity and distortion of information removal during the process of abstracting and deriving a rule from multiple premises and results.

Questions Probing Deficiencies in Deductive Reasoning
B1 Questions Probing Premature Prediction: addressing the sufficiency of premises and rules needed for result prediction.
B2 Questions Probing Fluctuations in Logic: addressing the logical validity of rules applied to individual premises.
B3 Questions Probing Occam's Razor: addressing the validity and distortion of information removal during the process of predicting a result from premises and rules.

Questions Probing Deficiencies in Hypothetical Reasoning
C1 Questions Probing Premature Hypothesization: addressing the sufficiency of results and rules needed for premise inference.
C2 Questions Probing Fluctuations in Logic: addressing the logical validity of individual rules applied to results.
C3 Questions Probing Occam's Razor: addressing the validity and distortion of information removal during the process of inferring a premise from results and rules.

Data Collection and Analysis
Based on the criteria established in previous studies, the quality indicators for the taxonomy of specific questions were defined as follows:
Comprehensiveness: Every question of interest fits into one of the categories, and each category contains at least one question.
Exclusivity and Objectivity: Each question fits into only one category, with high agreement in category assignments among different raters.

To validate the quality of the taxonomy of specific questions, this study adopted a convenience sampling method, considering the feasibility of the research and the optimal use of limited resources. Convenience sampling, an effective method under resource-limited conditions, involves selecting samples from an accessible population. As mentioned earlier, the study employed the "self-describing question approach." A total of 1,164 questions were collected from 24 graduate students who attended the "Socio-constructivism for Science and Technology I" course at H University in Japan from April to May 2020. Of these 1,164 questions, the 57 that were identified as specific were selected for this study. The course covered seven themes: the importance and curiosity of questions, human perceptual characteristics, visual characteristics, characteristics of consciousness and attention, the importance of better questions, the definition of science, and science from the perspective of social constructionism. Each theme was conducted over 90 minutes.
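The nine categories and the two quality indicators defined above can be written down as a small data structure. The following Python sketch is illustrative only and is not the procedure used in the study: the mapping from the inferred element to a reasoning type is our reading of the descriptions of inductive, deductive, and hypothetical reasoning, the function names are hypothetical, and the sample assignments are made up. Each question maps to a list of codes so that a rater can record no category, one category, or several.

```python
from collections import Counter

# Element inferred by each reasoning type (an assumption based on the text:
# induction derives a rule, deduction predicts a result, and hypothetical
# reasoning infers a premise).
REASONING_BY_INFERRED_ELEMENT = {
    "rule": "inductive",
    "result": "deductive",
    "premise": "hypothetical",
}

# Category code -> (reasoning type, category name), following the taxonomy.
TAXONOMY = {
    "A1": ("inductive", "Premature Generalization"),
    "A2": ("inductive", "Fluctuations in Uniformity"),
    "A3": ("inductive", "Occam's Razor"),
    "B1": ("deductive", "Premature Prediction"),
    "B2": ("deductive", "Fluctuations in Logic"),
    "B3": ("deductive", "Occam's Razor"),
    "C1": ("hypothetical", "Premature Hypothesization"),
    "C2": ("hypothetical", "Fluctuations in Logic"),
    "C3": ("hypothetical", "Occam's Razor"),
}


def candidate_categories(inferred_element):
    """Step 1: narrow the nine categories to the three of one reasoning type."""
    reasoning = REASONING_BY_INFERRED_ELEMENT[inferred_element]
    return [code for code, (kind, _) in TAXONOMY.items() if kind == reasoning]


def quality_report(assignments):
    """Check comprehensiveness and exclusivity for one rater.

    `assignments` maps a question id to the list of category codes chosen
    by the rater (an empty list means that no category fit).
    """
    counts = Counter(code for codes in assignments.values() for code in codes)
    return {
        "unassigned_questions": sorted(
            q for q, codes in assignments.items() if not codes
        ),
        "empty_categories": [code for code in TAXONOMY if counts[code] == 0],
        "multi_assigned_questions": sorted(
            q for q, codes in assignments.items() if len(codes) > 1
        ),
    }


# Hypothetical toy data, not the 57 questions analyzed in the study.
toy_assignments = {"q1": ["A1"], "q2": ["B2", "B3"], "q3": [], "q4": ["C1"]}
report = quality_report(toy_assignments)
print(candidate_categories("result"))
print(report)
```

Under these assumptions, a taxonomy would satisfy the comprehensiveness indicator when both `unassigned_questions` and `empty_categories` are empty, and the exclusivity indicator when `multi_assigned_questions` is empty.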
The task of assigning categories to the selected specific questions was performed independently by the first author and another rater from the same laboratory, both of whom were course participants. It is often beneficial to use a second rater from the same field to check the initial quality of categories. Therefore, raters belonging to the same community were used, and their results were compared. Furthermore, to validate the quality of the taxonomy more objectively, raters were instructed to assign a question to multiple categories if they deemed it to fit more than one.

To verify the comprehensiveness of the categories, the frequency of questions assigned to each category was calculated. Additionally, to assess the exclusivity of the categories and the objectivity of the assignment method, the results of the category assignments were analyzed using the R language, and Cohen's kappa coefficient was calculated. This metric is more informative than a simple agreement rate because it accounts for agreement occurring by chance. A kappa coefficient of 0.61 or higher is considered to indicate high exclusivity and objectivity of the categories and assignment methods.

RESULTS AND DISCUSSION
Categories of Specific Questions
Nine specific question categories were finalized, belonging to three types of logical reasoning: inductive, deductive, and hypothetical. For inductive reasoning, the categories were "A1 Questions Probing Premature Generalization," "A2 Questions Probing Fluctuations in Uniformity," and "A3 Questions Probing Occam's Razor." For deductive reasoning, the categories were "B1 Questions Probing Premature Prediction," "B2 Questions Probing Fluctuations in Logic," and "B3 Questions Probing Occam's Razor." For hypothetical reasoning, the categories were "C1 Questions Probing Premature Hypothesization," "C2 Questions Probing Fluctuations in Logic," and "C3 Questions Probing Occam's Razor."

Quality of the Taxonomy of Specific Questions
The results of the category assignments made by the two raters for the 57 specific questions are presented in Table 1. This table shows that both raters assigned all questions to the nine categories, with at least one question included in each category.

Table 1. Results of category assignments between raters
Question Category    Rater1    Rater2    Total

The agreement rate for category assignments between the two raters was 84%. Based on this, the interrater reproducibility, represented by Cohen's kappa coefficient, was calculated to be 0.82, with a 95% confidence interval ranging from 0.77 to 0. The counts of agreement in category assignments between the raters are shown in Table 2. Specific examples of questions with consistent classification are provided below:

A1 Questions Probing Premature Generalization: From Ai's experiments, it was concluded that chimpanzees cannot connect more than two things. However, can other chimpanzees of the same species as Ai-chan not necessarily learn to move a stepping stool to the appropriate place?
A3 Questions Probing Occam's Razor: I think there are individual differences in how much experience affects the formation of gestalt. Can we ignore these differences?
B2 Questions Probing Fluctuations in Logic: The section on metaphors mentions the term "gender." In the story of Momotaro, it is indeed written in an era where there were strong norms about men working outside and women inside. While it may be unavoidable considering the time it was written, should we also consider the scarcity of female students in modern studies as part of this gender bias?

Table 2. Number of agreements on category assignments between raters
Question Category    Number    Total

The combinations of category-assignment disagreements between the raters are shown in Table 3.
From this table, it is evident that all disagreements occurred within the same reasoning method, with no disagreements across different reasoning methods. Specific examples of these disagreement combinations are provided below:

A2 / A3: We learned that xanthophyll can reduce chromatic aberration, but is there no need to consider the differences in xanthophyll content?
B2 / B3: Observations and experiments are crucial elements of science, and it could be argued that alchemy is also scientific if one considers the repeated observations and experiments during its trial and development processes. However, can these observation and experimentation processes be equated with scientific processes?
C1 / C2: It is difficult for a system engineer to ask questions from the perspective of a help desk due to the use of communication methods where the interlocutor's face is not visible. However, wouldn't it also be challenging to ask questions empathetically even when the interlocutor's face is visible if there is a lack of empathy?

Table 3. Combinations and number of disagreements in category assignments
Category Combination    Number
A1/A2
A1/A3
A2/A3
B1/B2
B2/B3
C1/C2
C2/C3
Total

The specific questions were categorized into nine categories, with three categories each for inductive, deductive, and hypothetical reasoning. The results of this study indicate that the taxonomy of specific questions is comprehensive, as evidenced by the complete assignment of all questions to the nine classification categories listed in Table 2. Additionally, the high kappa coefficient suggests a high degree of agreement among different raters regarding the assignment of questions to categories, implying robust exclusivity of the categories and objectivity of the assignment method. The categories of specific questions, classified based on deficiencies in three types of logical reasoning, therefore appear to be comprehensive and exclusive, adhering to the principles of classification.
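Cohen's kappa corrects the observed agreement rate p_o for the agreement p_e expected by chance from each rater's category frequencies: kappa = (p_o - p_e) / (1 - p_e). The sketch below is a minimal Python illustration of that formula and of the within-method disagreement check, assuming exactly one category per question per rater. The labels are made-up toy data, not the study's ratings, and the function names are hypothetical; the study itself performed the analysis in R.

```python
from collections import Counter

HIGH_AGREEMENT = 0.61  # threshold for high agreement stated in the text


def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters who each give one label per item."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater1)
    # Observed agreement: share of items with identical labels.
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement: product of the raters' marginal label proportions.
    m1, m2 = Counter(rater1), Counter(rater2)
    p_e = sum(m1[label] * m2[label] for label in m1) / (n * n)
    if p_e == 1.0:
        # Degenerate case: both raters used one identical label throughout.
        # Returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


def disagreement_pairs(rater1, rater2):
    """Count disagreeing label pairs such as ('A2', 'A3'), ignoring order."""
    return Counter(
        tuple(sorted((a, b))) for a, b in zip(rater1, rater2) if a != b
    )


def within_same_reasoning(pairs):
    """True if every disagreement stays inside one reasoning type (A, B, or C)."""
    return all(a[0] == b[0] for a, b in pairs)


# Hypothetical labels for ten questions.
r1 = ["A1", "A1", "A2", "A3", "B1", "B2", "B2", "C1", "C2", "C3"]
r2 = ["A1", "A1", "A3", "A3", "B1", "B2", "B3", "C1", "C2", "C3"]

kappa = cohens_kappa(r1, r2)
pairs = disagreement_pairs(r1, r2)
print(round(kappa, 3), kappa >= HIGH_AGREEMENT)
print(dict(pairs), within_same_reasoning(pairs))
```

For the toy data, the raters agree on 8 of 10 questions (p_o = 0.80) and the marginal frequencies give p_e = 0.12, so kappa = 0.68 / 0.88, or about 0.77. Both disagreements (A2/A3 and B2/B3) fall within a single reasoning type, which is the pattern reported for Table 3.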
Furthermore, the finding of high objectivity in the assignment method aligns with the results of Qiu et al., suggesting that consistent outcomes can be achieved with the question assignment method even among different raters, provided there is a certain level of knowledge and common understanding regarding the subject matter and the classification system of specific questions. Moreover, the overall high quality of the taxonomy suggests that it can be applied to classes aimed at cultivating fundamental inquiry skills, such as setting research questions, as well as to inquiry-based lessons in related fields.

However, as indicated by the inconsistencies in Table 3, the occurrence of disagreements within the same reasoning method suggests potential confusion among question categories within the same method. This confusion is likely due to a lack of specificity in the assignment criteria. For instance, the example question judged as B2/B3 concerns deductive reasoning, and its "however..." part identifies premises and rules. Interpreted according to the criterion for B2, "the logical validity of rules applied to premises," it queries the appropriateness of applying scientific rules to the premises of observations and experiments during alchemical development. Conversely, according to the criterion for B3, "the validity of information deletion in the predictive process," it could also be seen as questioning the legitimacy of deriving results by overlooking the differences between observations and experiments in alchemy and in science.

The purpose of using a taxonomy significantly influences its construction, and previous studies have developed question taxonomies based on distinct objectives. For example, Ikeda
defined "the ability to ask questions" as comprising two capabilities: the ability to notice implicit information and the ability to express such perceptions without bias in an adequate question form. On this foundation, he constructed a method for classifying questions based on three ways in which information can be obscured when verbalized. The first type is "omission," where unnoticed information is simply ignored. The second type, "distortion," occurs when unnoticed information is misrepresented by the way it is articulated. The third type is "generalization," where unnoticed information is treated as identical to noticed information. Prayoga introduced question categories using interrogatives. While questions in these categories are useful for extracting information and understanding knowledge, they appear to be limited in opening up new possibilities or exhibiting creativity, which implies that progressing from them to research questions is challenging.

The current study developed a quality taxonomy of specific questions based on deficiencies in three types of logical reasoning to support the cultivation of the ability to question logical inconsistencies, which is essential for setting research questions. Using this framework, educators can design strategies to prompt learners to generate specific questions probing logical reasoning deficiencies. When designing strategies, it is crucial to clearly present the categories of specific questions, their definitions, and detailed examples suited to the learners' characteristics. Furthermore, the results of this study give educators a method to objectively evaluate learners' specific questions in the actual support process, objectively assess learners' proficiency in posing specific questions, and guide the support process more effectively.
This is expected to enhance learners' questioning ability and foster the formation of research questions.

Although the quality of the specific question taxonomy was confirmed using questions generated by course participants, three main challenges arose. First, the question sample was limited to a single course, making generalization difficult. Second, the evaluations were confined to raters within the same community, with no assessment of usability by users. Third, not all questions were consistently assigned, and confusion among categories within the same reasoning method was observed.

Future research should apply the taxonomy to courses across various scientific disciplines, collect questions probing logical reasoning deficiencies, and further validate the quality of the taxonomy through assignments made by subject teachers to evaluate its usefulness. Assessments of the taxonomy's utility should consider attributes such as teachers' experience and should collect feedback from diverse teachers to identify potential improvements, thus meeting a broader range of user needs. Additionally, the assignment criteria for question categories within the same type of logical reasoning should be clarified to achieve more objective assessments. For instance, adding information to the criteria that emphasizes the sequential relationship between these question categories could make the categories easier to distinguish when assigning questions.

CONCLUSION
This study developed and validated a taxonomy of specific questions targeting deficiencies in three types of logical reasoning: inductive, deductive, and hypothetical. The taxonomy consists of nine categories, with three dedicated to each type of reasoning, providing a structured framework to support the cultivation of questioning skills essential for formulating research questions in scientific inquiry.
The findings demonstrate the high quality of the taxonomy, as it effectively addresses gaps in reasoning and aligns with educational objectives to enhance inquiry-based learning. The validated taxonomy has significant implications for education, particularly in fostering students' critical thinking and inquiry skills. By providing a systematic approach to identifying and addressing deficiencies in logical reasoning, the taxonomy can serve as a valuable tool for teachers to guide students in developing high-quality research questions. This framework is especially relevant in inquiry-based learning environments, where the ability to question effectively is crucial.

Further research could explore the practical application of this taxonomy in diverse educational contexts, including varying age groups, disciplines, and cultural settings. Additionally, qualitative studies could examine how students and educators interact with the taxonomy in real time, highlighting challenges and potential improvements. Incorporating the taxonomy into teacher training programs could also enhance educators' ability to facilitate inquiry-based lessons. Overall, the taxonomy offers a promising foundation for advancing scientific inquiry skills and promoting deeper engagement with the reasoning process in education.

ACKNOWLEDGEMENTS
This study was supported by JSPS KAKENHI Grant Number JP20K20420.

REFERENCES