International Journal of Management Science and Information Technology IJMSIT
E-ISSN: 2774-5694
P-ISSN: 2776-7388
Volume 6 .
January-June 2026, 366-376 DOI: https://doi.org/10.35870/ijmsit.
The Use of the K-Means Algorithm as a Method for Grouping Major Interests of Class X Students of SMK Satrya Budi 1 Perdagangan
Handina Ananda Putri 1*, Herman Saputra 2, Yori Apridonal M 3
1*,2,3 Information Systems Study Program, Faculty of Computer Science, Universitas Royal, Asahan Regency, North Sumatra, Indonesia
Email: handinaanandaputri07@gmail.com 1*, hermansaputra4@gmail.com 2, yori.apridonal@gmail.
Abstract
Article history: Received April 6, 2026; Revised April 16, 2026; Accepted April 18, 2026
In the modern era, education is one of the most important aspects in supporting the development of quality human resources. This study aims to apply the K-Means algorithm as a systematic, data-based method for grouping the major interests of class X students of SMK Satrya Budi 1 Perdagangan, and to determine the level of compatibility between student interests and the available majors based on the grouping results. The study uses a quantitative approach supported by data analysis with the K-Means algorithm. This method was selected because the research needs to produce an objective grouping of class X students into majors based on numerical data and relevant attributes. The results show that K-Means clustering provides a more objective and systematic approach to determining majors. Grouping is done by processing several assessment criteria: academic grades, aptitude tests, interest tests, entrance exams, and basic skills. In the final grouping of 30 students, the system divides students into four major groups: Motorcycle Engineering and Business, Automotive Light Vehicle Engineering, Heavy Equipment Engineering, and Industrial Chemistry.
Keywords: K-Means Algorithm, Data Mining, Data Clustering, Major Interests, Vocational High School Students.
INTRODUCTION
In the modern era, education is a crucial aspect in supporting the development of quality human resources. Education is not only aimed at increasing academic knowledge, but also at equipping students with skills and competencies that meet the needs of the workforce (Darsono & Andrianti, 2.
As a medium for developing students' skills based on their interests and talents, vocational high school (SMK) education plays a strategic role in preparing students to enter the workforce and continue their education to a higher level (Dhamawan.
, 2.
At the vocational high school level, one of the important decisions students must make is choosing a major or program of expertise that aligns with their individual interests, talents, and potential.
This choice of major not only determines the academic path students will take, but also has a significant impact on their readiness to face the world of work and their ability to develop competencies relevant to current industry needs (Nasution & Saragih, 2.
Satrya Budi 1 Private Vocational School in Perdagangan is one of the most popular vocational high schools in the Perdagangan sub-district. With its strategic location and supportive educational environment, it is one of the schools the surrounding community usually chooses for vocational education. As one of the most popular schools, it has a very large number of students. This is partly because the school provides four majors/expertise programs: Automotive Light Vehicle Engineering (TKRO), Motorcycle Engineering and Business (TBSM), Heavy Equipment Engineering (TAB), and Industrial Chemistry (KIN). With these choices of majors, the school has around 756 students in total. For the 2025-2026 academic year, it recorded 290 new students, consisting of 235 male students and 55 female students. Even though the school has a very large number of students, many of them still make the wrong choice when determining their major (Anwar et al., 2.
Although the school has been established for a long time, several problems remain, including how to place students in majors or expertise programs that suit their interests, talents, and potential.
Determining the right major affects not only students' learning motivation and academic achievement, but also their readiness to face the professional world after graduation (Permatasari & Tundjungsari, 2.
So far, the selection of majors at Satrya Budi 1 Private Vocational School is still done manually and subjectively: teachers or the school usually rely on direct observation and simple interviews to find out their students' interests (Saputra et al., 2.
In addition to the manual and subjective nature of the major selection process, there are also various other problems, such as identifying and accurately categorizing students' interests and potential.
This problem stems from the limitations of the assessment methods used to determine the suitability of students' interests to the available majors.
In the context of SMK Satrya Budi 1 Perdagangan, most of the interest identification process still relies on the perceptions of guidance and counseling (BK) teachers and the results of simple interviews without being supported by comprehensive data analysis.
As a result, many students do not fully understand their potential, leading to a mismatch between their personal interests and the chosen major (Rofiq & Qoiriah, 2.
Another problem arises from the students themselves.
Many tenth-grade students lack a clear understanding of their interests and potential.
They tend to choose majors based on the influence of peers, parents, or perceived job prospects, without considering whether the major suits their own abilities.
As a result, after some time in their chosen major, some students begin to lose motivation to study because they feel the field is not a good fit for them (Asmana et al., 2.
In one case, a student was registered in the Automotive Light Vehicle Engineering (TKRO) major when he first entered grade 10.
The decision to choose this major did not come entirely from his personal interests, but rather due to the encouragement of his parents and the influence of his friends, most of whom also chose the same major.
Initially, the student tried to adapt to learning activities that focused heavily on the automotive field, such as vehicle engine maintenance, car electrical systems and workshop practice.
However, as time went by, the student began to feel a mismatch between his personal interests and the chosen major.
He felt less interested in workshop activities and engine analysis and also had difficulty understanding technical concepts related to light vehicle machining.
His learning motivation also decreased, which was marked by a decrease in practical and theoretical grades.
After an evaluation by the homeroom teacher and guidance counselor, it was discovered that the student actually had a stronger interest in the field of heavy equipment mechanization.
Based on the description of problems and case studies that have occurred, this is where the role of technology can provide significant solutions.
With advances in information technology and the application of data-based methods, schools now have the opportunity to utilize technology as a tool in managing and analyzing student data more objectively (Ratih Yulia Hayuningtyas, 2.
Data regarding academic grades, talent assessment test results, interest assessment test results, entrance exam test results and basic skills assessment test results can be collected, processed and analyzed using various data analysis methods.
The result is more accurate information that can be used as a basis for decision making in placing students in appropriate majors (Rusdianto et al.
, 2.
One technology that can be applied is the clustering algorithm, specifically the K-Means algorithm.
The K-Means algorithm is a popular method in the fields of data mining and machine learning, especially for grouping or clustering tasks.
This algorithm works on the principle of dividing a set of data into several groups (clusters) based on the similarity of certain attributes.
Each group has a centroid, a center point that represents the average characteristics of the group members.
Data with the highest similarity are placed in the same group, while differences between groups are kept significant (Rosadi et al.
By using the K-Means Algorithm method as a grouping method, the school can see the overall pattern of students' interests and talents, which can be used as a basis for recommendations for appropriate majors.
This approach is much more systematic than manual methods which are often subjective and only rely on teacher intuition or the counselor's personal experience.
On the other hand, by applying the K-Means algorithm, it is hoped that it can provide solutions to problems in determining students' interests and majors more accurately and objectively (Fahmi et al.
, 2.
Given the problems that occur in the field, such as difficulty in determining appropriate majors and the lack of systematic analysis tools, the authors chose the research title "The Use of the K-Means Algorithm as a Method for Grouping Major Interests of Class X Students of SMK Satrya Budi 1 Perdagangan".
RESEARCH METHOD
The method used in this study is a quantitative approach supported by data analysis using the K-Means algorithm.
This method was selected because the research needs to produce an objective grouping of 10th grade students into majors based on numerical data and relevant attributes.
The research can thus produce major-grouping decisions that are measurable, logical, and in accordance with the characteristics of each student.
A quantitative approach is used because the entire analysis involves processing numerical data such as academic grades, interest test scores, aptitude test scores, entrance test scores, and basic skills assessments.
This approach yields more structured results that can be analyzed mathematically (Misbakhul Anam et al., 2.
The research framework consists of the following stages.
Identification of Problems
This is the initial stage of the research, in which the researcher determines the problem to be studied. At SMK Swasta Satrya Budi 1 Perdagangan, the researcher observed that the selection of student majors is still done manually, through direct observation and interviews conducted by teachers or counselors with students.
Although the school provides guidance and counseling services, the process does not utilize student data (Alawiyah et al., 2.
Data Collection
Data were collected through interviews, observation, and questionnaires.
In the interviews, the researcher asked about the main problems currently faced by the school. In the observation, the researcher directly observed how the process of determining majors had been running and how the guidance and counseling system had been implemented by the school's guidance and counseling department. For the questionnaire, the researcher distributed a set of questions through Google Forms to 10th grade students from each major (Amelia et al., 2.
Research Data Set
The dataset in this study was compiled from student grade data covering several important assessment aspects.
As shown in Table 1, each student record has five attributes: academic grades, talents, interests, entrance exams, and skills.
These attributes are used as the main variables in the grouping process because they reflect students' abilities and their tendencies of interest toward certain majors.
The data cover 30 students with varying grades, thus providing a fairly representative picture.
The data were then processed using the K-Means algorithm to form several groups (clusters) of students with similar characteristics.
The grouping results are expected to help the school determine the appropriate major for each student more objectively, based on patterns in their grades rather than on subjective assessment alone (Nurarofah et al., 2.
Table 1. Cluster Count Value Data
Columns: Full name, Academic, Talent, Interest, Entrance examination, Skills
Full names: Alfino Siregar; Aditia Saputra; Adjie al fikri; Aga Tri Wibowo; Agung Kurniawan; Ahmad zain abiyyu nst; Akbar Maulana Makmur Sirait; Akhri Rizal Rizky; Ananda Aditya Pratama; Arief Ramadhan; Arya Lesmana; Azril Arizki; Baimsyah Ramadani; Bayu Gustiawan; Chairul Fiqri Coal; Danu Kurniawan; Dedek Pratama; Denni Admaja; Dhafin Septian Algadi; Dimas Arya Singgih; Dimas Panjaitan; Dio alfansya; Dirga Febriansyah Hutagaol; Duan parizi; Fadly Syahputra; Fahri Fatah Khiyar; Fery Febriansyah; Fitra Ramadhan; Gabriel Jericho Simatupang; Galu Setiawan
Research Variables
The research variables in this study consist of several indicators used to represent student characteristics. The main variables used are academic grades, talents, interests, entrance exams, and skills.
Academic grades reflect students' abilities in formal learning aspects, while talents describe the natural potential possessed by students.
Interest variables indicate students' tendency to be interested in certain fields, which is an important factor in determining majors.
Entrance exam scores are used as an initial indicator of students' basic abilities when accepted into school.
Meanwhile, skills indicate students' practical abilities in applying knowledge.
All of these variables are numeric and are used as the basis for the clustering process using the K-Means algorithm to group students into several categories of appropriate major interests based on similarities in their characteristics (Amanda et al.
, 2.
Analysis Stages
The analysis begins with the collection of student data, which includes academic grades, talents, interests, entrance exams, and skills.
Data preprocessing is then carried out, such as checking data completeness and normalizing values so that each variable has a balanced scale.
After that, the number of clusters (K) that will be used as the basis for grouping is determined.
The core process applies the K-Means algorithm: determine the initial centroids, calculate the distance from each data point to each centroid, then assign each data point to the cluster with the closest centroid.
These steps are repeated until the centroid positions are stable.
The final stage is the interpretation of the cluster results to determine the characteristics of each student group.
The results of this analysis are used as a basis for recommending majors that suit the students' potential and interests (Lillah et al., 2.
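The stages above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `euclidean` and `kmeans`, the four score rows, and the choice of the first and third rows as initial centroids are all assumptions made for the example.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length score vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, centroids, max_iter=100):
    """Assign each point to its nearest centroid, then update centroids until stable."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda j: euclidean(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid unchanged
        if new_centroids == centroids:  # centroids stable: stop iterating
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical scores (academic, talent, interest, entrance exam, skills).
students = [
    [80, 75, 85, 70, 78],
    [82, 77, 83, 72, 80],
    [60, 65, 55, 62, 58],
    [58, 63, 57, 60, 61],
]
labels, centroids = kmeans(students, centroids=[students[0], students[2]])
print(labels)
```

With these made-up rows, the first two students end up in one cluster and the last two in the other; real data would first pass through the completeness check and normalization described above.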
Utilization of the K-Means Clustering
The K-Means algorithm is one of the most widely used unsupervised learning techniques for data grouping, or clustering.
Clustering is one of the main techniques in data mining and machine learning; it groups a set of data into several clusters based on the level of similarity in characteristics between the data (Dhewayani et al., 2.
The following is a description of the method:
1. Determine the number of clusters.
2. Determine the centroid values. The initial centroids are chosen randomly; in later iterations, each centroid is recalculated as the mean of the data currently assigned to its cluster.
3. Calculate the distance between each centroid and each object.
4. Group the objects: each object becomes a member of the cluster whose centroid is at the minimum distance.
5. Return to step 2 and repeat the process until the centroid values remain constant and cluster members do not move to other clusters (Dhewayani et al., 2.
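For reference, the two calculations used in these steps are commonly written as follows. This is the standard textbook form, given under the assumption that the study uses Euclidean distance over its five attributes; the symbols $x_i$ (a student's attribute vector), $c_j$ (the centroid of cluster $j$), $S_j$ (the set of students currently in cluster $j$) and $m$ (the number of attributes) are introduced here for illustration.

```latex
% Distance between student i and centroid j over m attributes
d(x_i, c_j) = \sqrt{\sum_{a=1}^{m} \left( x_{ia} - c_{ja} \right)^2}

% Centroid update: mean of the members of cluster j
c_j = \frac{1}{|S_j|} \sum_{x_i \in S_j} x_i
```

Each student is assigned to the cluster $j$ that minimizes $d(x_i, c_j)$, and the centroids are then recomputed with the second formula.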
RESULTS AND DISCUSSION
This section presents the results of the data processing carried out using the K-Means algorithm to group the major interests of class X students at SMK Satrya Budi 1 Perdagangan.
It shows how the data collected from students can be analyzed to produce more structured and easy-to-understand information.
Through this data processing, the researchers attempt to obtain an overview of the tendencies of student interests in the majors available at the school (Candra et al., 2.
The following is an explanation of the steps for completing the K-Means method.
Data Processing
Data processing in this study was carried out systematically to produce data ready for analysis.
The initial stage began with collecting student grades, covering academic aspects, talents, interests, entrance exams, and skills.
After that, a data selection and cleaning process was carried out to ensure there were no blanks, duplicates, or inconsistencies.
Next, the data was normalized so that all variables were on a comparable scale, preventing any single attribute from being more dominant in the distance calculation.
The cleaned data was then processed using the K-Means algorithm by determining the desired number of clusters.
This process involves calculating distances between data and repeatedly updating centroids until stable clusters are obtained.
The final result of the data processing is a grouping of students with similar characteristics as a basis for determining major interests.
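The cleaning and normalization described above might look like the following sketch. The study does not state which normalization it uses, so min-max scaling is an assumption here, and the helper names `clean` and `min_max_normalize` and the sample rows are hypothetical.

```python
def clean(rows):
    """Drop rows with missing values and exact duplicates (first occurrence is kept)."""
    seen, cleaned = set(), []
    for row in rows:
        if any(v is None for v in row):
            continue  # blank value: skip the record
        key = tuple(row)
        if key in seen:
            continue  # duplicate record: skip it
        seen.add(key)
        cleaned.append(list(row))
    return cleaned

def min_max_normalize(rows):
    """Rescale every column to [0, 1]; a constant column is mapped to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

# Hypothetical records: academic, talent, interest, entrance exam, skills.
raw = [
    [80, 70, 90, 75, 85],
    [60, 90, 70, 65, 85],
    [60, 90, 70, 65, 85],    # duplicate of the previous row
    [70, None, 80, 70, 85],  # missing talent score
]
scaled = min_max_normalize(clean(raw))
print(scaled)
```

After scaling, no attribute can dominate the distance calculation simply because it is recorded on a wider numeric range.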
Utilization of the K-Means Clustering
To improve the accuracy of grouping student major interests, an approach capable of processing data objectively and systematically is required.
One method that can be used is the K-Means Clustering algorithm, a technique for grouping data based on similarities in certain characteristics.
By using relevant data such as academic grades, aptitude tests, interest tests, entrance exams, and basic skills assessments, the K-Means algorithm is able to divide students into several groups that have similar characteristics.
Its use is described below.
The first step is to determine the number of clusters (K).
At this stage, researchers determine how many groups will be formed according to the analysis needs.
The second step is determining the initial centroid.
The centroid is the center point of each cluster.
In the initial stage, centroids are usually determined randomly from the available data or based on specific values considered representative of each cluster.
Table 1. Initial Centroid Value Data
The third step is to calculate the distance between each data point and the centroid.
The system calculates the distance between each data point and all existing centroids.
This distance calculation determines the proximity of the data to each cluster.
Generally, the distance used is the Euclidean distance.
Table 2. Centroid C1 Value Data
The fourth step is iteration.
The process of calculating distances, grouping data, and updating centroids will be repeated until the centroid positions no longer change or the changes are very small.
This indicates that the clustering process has reached convergence.
Table 3. Iteration Value Data
Sq1 to Sq5 are the squared differences between each student's five attribute values and centroid C1; Amount is their sum and C1 (Root) is the square root of that sum.
No.  Sq1      Sq2      Sq3      Sq4      Sq5      Amount    C1 (Root)
1    100.000  144.000  121.000  81.000   100.000  546.000   23.367
2    289.000  100.000  196.000  225.000  64.000   874.000   29.563
3    16.000   25.000   64.000   49.000   25.000   179.000   13.379
4    484.000  256.000  361.000  324.000  256.000  1681.000  41.000
5    49.000   64.000   49.000   100.000  9.000    271.000   16.462
6    4.000    4.000    16.000   16.000   16.000   56.000    7.483
7    196.000  196.000  144.000  169.000  144.000  849.000   29.138
8    81.000   81.000   81.000   81.000   81.000   405.000   20.125
9    25.000   16.000   9.000    64.000   4.000    118.000   10.863
10   169.000  169.000  256.000  225.000  169.000  988.000   31.432
11   0.000    0.000    0.000    0.000    0.000    0.000     0.000
12   324.000  289.000  225.000  441.000  196.000  1475.000  38.406
13   144.000  64.000   100.000  196.000  36.000   540.000   23.238
14   256.000  225.000  289.000  256.000  121.000  1147.000  33.867
15   64.000   49.000   81.000   64.000   64.000   322.000   17.944
16   441.000  400.000  361.000  400.000  225.000  1827.000  42.743
17   225.000  144.000  144.000  289.000  81.000   883.000   29.715
18   100.000  100.000  100.000  100.000  49.000   449.000   21.190
19   9.000    9.000    9.000    9.000    9.000    45.000    6.708
20   36.000   25.000   49.000   36.000   16.000   162.000   12.728
21   361.000  324.000  289.000  484.000  196.000  1654.000  40.669
22   196.000  121.000  196.000  169.000  121.000  803.000   28.337
23   49.000   36.000   64.000   49.000   49.000   247.000   15.716
24   121.000  64.000   121.000  100.000  64.000   470.000   21.679
25   16.000   9.000    4.000    9.000    9.000    47.000    6.856
26   64.000   25.000   25.000   100.000  4.000    218.000   14.765
27   169.000  144.000  121.000  256.000  64.000   754.000   27.459
28   81.000   81.000   81.000   81.000   81.000   405.000   20.125
29   1.000    1.000    1.000    1.000    1.000    5.000     2.236
30   400.000  289.000  400.000  361.000  196.000  1646.000  40.571
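The Amount and C1 (Root) values can be checked by hand. The short sketch below takes the first student's five squared differences from the table above (100, 144, 121, 81 and 100), sums them, and takes the square root; the variable names are illustrative only.

```python
import math

# First student's squared differences from centroid C1, one per attribute.
squared_diffs = [100, 144, 121, 81, 100]

amount = sum(squared_diffs)   # the "Amount" value for this student
distance = math.sqrt(amount)  # the "C1 (Root)" value: Euclidean distance to C1

print(amount, round(distance, 3))  # prints: 546 23.367
```

The result matches the listed sum of 546 and root of 23.367, which confirms that the root column is the Euclidean distance from each student to centroid C1.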
The fifth step continues the iteration.
The distances from each data point to all of the centroids are calculated, each data point is assigned to its nearest centroid, and the centroids are updated. This is repeated until the centroid positions no longer change or the changes are very small, which indicates that the clustering process has reached convergence.
Table 4. Iteration Value Data
No.  C1      C2      C3      C4
1    23.367  73.163  51.980  55.408
2    29.563  68.214  44.560  52.681
3    13.379  80.811  63.865  63.335
4    41.000  63.032  32.458  52.385
5    16.462  78.180  60.581  60.368
6    7.483   85.902  71.962  70.033
7    29.138  69.220  45.137  52.644
8    20.125  75.345  55.668  57.307
9    10.863  82.774  67.981  65.862
10   31.432  68.334  41.843  52.371
11   0.000   93.032  83.039  80.946
12   38.406  64.758  34.807  51.908
13   23.238  72.751  51.796  55.154
14   33.867  66.751  39.524  52.001
15   17.944  76.989  58.231  59.077
16   42.743  63.212  30.506  52.762
17   29.715  68.607  44.269  52.400
18   21.190  74.510  54.452  56.663
19   6.708   86.604  73.481  71.297
20   12.728  81.226  65.167  64.216
21   40.669  64.027  32.423  52.358
22   28.337  69.572  45.634  53.091
23   15.716  78.758  61.107  61.058
24   21.679  73.997  53.611  56.353
25   6.856   86.417  73.563  71.120
26   14.765  79.314  62.976  61.854
27   27.459  70.176  46.878  53.085
28   20.125  75.345  55.668  57.307
29   2.236   90.841  79.810  77.567
30   40.571  63.803  32.567  52.367
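Once the distances are known, the grouping step reduces to picking the smallest distance per student. The sketch below assumes that the four values listed per student in Table 4 are the distances to centroids C1 to C4, in that order, and uses the values of the 1st, 4th and 11th students; the function name `nearest_cluster` is hypothetical.

```python
# Distances to C1..C4 for three students, keyed by their position in the table.
# Assumption: the four values per student are ordered C1, C2, C3, C4.
distances = {
    1: [23.367, 73.163, 51.980, 55.408],
    4: [41.000, 63.032, 32.458, 52.385],
    11: [0.000, 93.032, 83.039, 80.946],
}

def nearest_cluster(row):
    """Return the 1-based index of the centroid with the minimum distance."""
    return min(range(len(row)), key=row.__getitem__) + 1

assignments = {no: nearest_cluster(row) for no, row in distances.items()}
print(assignments)  # prints: {1: 1, 4: 3, 11: 1}
```

Under that assumption, the 4th student is closer to C3 than to C1, while the 11th student has distance 0 to C1 and therefore coincides with that centroid.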
Grouping Results
This section begins with a general overview of the analysis conducted using the K-Means algorithm.
This stage is important because it serves as a bridge between the data processing and the interpretation of the results obtained.
Through an average-based approach, previously heterogeneous student data is processed into more structured groups, thus providing a clearer and more objective picture of the pattern of major interests.
This process not only considers the numerical aspect but also attempts to represent the tendencies of students' academic abilities as reflected in their average scores.
The grouping results are outlined in Table 5.
Table 5. Data Grouping of Majors Based on Average Grades
No.  Student Name                 Suitable Major
1    Alfino Siregar               Heavy Equipment Engineering
2    Aditia Saputra               Motorcycle Engineering and Business
3    Adjie Al Fikri               Industrial Chemistry
4    Aga Tri Wibowo               Heavy Equipment Engineering
5    Agung Kurniawan              Motorcycle Engineering and Business
6    Ahmad Zain Abiyyun NST       Industrial Chemistry
7    Akbar Maulana Makmur Sirait  Heavy Equipment Engineering
8    Akhri Rizal Rizky            Motorcycle Engineering and Business
9    Ananda Aditya Pratama        Industrial Chemistry
10   Andika Pratama               Heavy Equipment Engineering
11   Angga Saputra                Heavy Equipment Engineering
12   Ardiansyah                   Motorcycle Engineering and Business
13   Arif Rahman                  Industrial Chemistry
14   Arifin                       Motorcycle Engineering and Business
15   Bagas Pratama                Industrial Chemistry
16   Bima Sakti                   Heavy Equipment Engineering
17   Daffa Ramadhan               Industrial Chemistry
18   Dimas Saputra                Heavy Equipment Engineering
19   Egi Pratama                  Motorcycle Engineering and Business
20   Fajar Nugroho                Heavy Equipment Engineering
21   Fikri Maulana                Heavy Equipment Engineering
22   Gilang Ramadhan              Industrial Chemistry
23   Hafiz Alfarizi               Motorcycle Engineering and Business
24   Ihsan Maulana                Heavy Equipment Engineering
25   Ilham Prasetya               Industrial Chemistry
26   Irvan Hidayat                Industrial Chemistry
27   Kurniawan                    Industrial Chemistry
28   Rizki Pratama                Motorcycle Engineering and Business
29   Yusuf                        Automotive Light Vehicle Engineering
30   Nanda Saputra                Automotive Light Vehicle Engineering
Evaluation of Grouping Through the Elbow Method
In the clustering evaluation stage of this study, the Elbow method was used to determine the most suitable number of clusters (K) for the K-Means algorithm.
This evaluation is important to ensure that the grouping of interests of class X students of SMK Satrya Budi 1 Perdagangan truly represents naturally formed data patterns.
Based on the processed student data, which covers the major choices Heavy Equipment Engineering, Motorcycle Engineering and Business, Automotive Light Vehicle Engineering, and Industrial Chemistry, the Elbow method helps identify the point where increasing the number of clusters no longer provides a significant decrease in error values.
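A minimal sketch of the Elbow evaluation is shown below. It runs K-Means for several values of K and records the sum of squared errors (SSE) for each. The nine two-dimensional points are made up for illustration and are not the study's data, and the evenly spaced, deterministic choice of starting centroids is an assumption made so that the example is reproducible.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_sse(points, k, max_iter=100):
    """Run K-Means with k clusters and return the sum of squared errors."""
    n = len(points)
    # Assumption: deterministic start using evenly spaced data points.
    centroids = [list(points[i * n // k]) for i in range(k)]
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append([sum(c) / len(members) for c in zip(*members)])
            else:
                updated.append(centroids[j])  # empty cluster keeps its centroid
        if updated == centroids:
            break
        centroids = updated
    # SSE: squared distance from every point to its nearest final centroid.
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

# Hypothetical data with three visible groups.
points = [
    (1, 1), (1, 2), (2, 1),
    (8, 8), (8, 9), (9, 8),
    (1, 8), (2, 9), (1, 9),
]
sse = {k: kmeans_sse(points, k) for k in range(1, 5)}
for k, value in sse.items():
    print(k, round(value, 3))
```

On this toy data the SSE drops sharply from K=1 to K=3 and has little room left to fall afterwards, so the bend (elbow) sits at K=3; the same reading applied to the student data would indicate the K after which extra clusters stop reducing the error noticeably.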
System Implementation
The system implementation in this study was carried out by building an application capable of automatically processing student data.
The system is designed to accept input in the form of academic grades, talents, interests, entrance exams, and skills, which are then stored in a database.
Next, the system runs a calculation process using the K-Means algorithm to group students into several clusters based on similar characteristics. The grouping results are displayed in an easy-to-understand format, allowing schools to see recommendations for appropriate majors for each student.
Furthermore, the system also supports the testing process by entering new student data to obtain classification results quickly.
With this implementation, the process of determining majors becomes more effective, objective, and structured compared to manual methods.
Login Page View
The login page is the website page that users access to enter the main page of the website. Here the user carries out the account verification process in order to log in.
Figure 1. Login Page Display
K-Means Clustering Calculation Process Data Detail Page View
The K-Means Clustering Calculation Process Data Detail Page is a website page that users can access to view the results of the K-Means Clustering calculation that has been run on the system based on the assessment that has been carried out.
Figure 2. Detail Page View of K-Means Clustering Calculation Process Data
CONCLUSION
Based on the results of research on the use of the K-Means algorithm as a method for grouping the major interests of class X students of SMK Satrya Budi 1 Perdagangan, it can be concluded that this method provides a more objective and systematic approach to determining majors.
Grouping is done by processing several assessment criteria: academic grades, aptitude tests, interest tests, entrance exams, and basic skills.
Through an iterative process of determining the centroids and calculating the distance between the data and each centroid, the system successfully forms student clusters based on the similarity of their scores, thus producing more measurable major recommendations than manual methods.
In the final grouping of 30 students, the system divides students into four major groups: Motorcycle Engineering and Business, Automotive Light Vehicle Engineering, Heavy Equipment Engineering, and Industrial Chemistry.
The largest number of students are in the Industrial Chemistry major, which indicates that many students tend toward analysis and industrial processing abilities.
Meanwhile, the Motorcycle Engineering and Business and Automotive Light Vehicle Engineering majors are filled by students whose scores are more dominant in automotive technical abilities, and the Heavy Equipment Engineering major is filled by students with strong technical abilities and readiness for field work.
REFERENCES