SINERGI Vol. No. October 2023: 433-442
http://publikasi. id/index. php/sinergi
http://doi. org/10. 22441/sinergi.

Service quality dealer identification: the optimization of K-Means clustering

Yolanda Enza Wella1, Okfalisa Okfalisa1,*, Fitri Insani1, Faisal Saeed2, Ab Razak Che Hussin3
1 Informatics Engineering Department, Universitas Islam Negeri Sultan Syarif Kasim Riau, Indonesia
2 School of Computing and Digital Technology, Birmingham City University, United Kingdom
3 Information Systems Department, Universiti Teknologi Malaysia, Malaysia

Abstract
Service quality and customer satisfaction directly influence company branding, reputation and customer loyalty. As a liaison between producers and consumers, dealers must preserve valuable consumer relationships to increase customer satisfaction. A lack of standardization regarding service quality emerges as an issue for the company's service excellence. Therefore, identifying and grouping service quality performance becomes a valuable contribution to decision-making for controlling and enhancing the company's intention. This study applies the K-Means algorithm, optimizing the number of clusters, to identify dealer service quality. Hence, the ultimate service quality formation will be [...] The analysis found three dealer categories: Cluster One, with 125 dealers grouped as good; Cluster Two, with 30 dealers grouped as very good; and Cluster Three, with 38 dealers grouped as not good performance. To evaluate the efficacy of the optimum k value, several testing approaches are conducted and compared, whereby the Calinski-Harabasz index, the Elbow method, the Silhouette Score, and the Davies-Bouldin Index (DBI) all point to k=3. As a result, the optimum number of clusters is determined to be three.
These three clusters effectively identify the dealers' service quality level and give the company guidelines for corrective actions and improvements in customer service quality, in place of the standardized normal distribution grouping calculation.

Keywords: Algorithm Optimization; Calinski-Harabasz; Davies-Bouldin Index (DBI); Elbow Method; K-Means Clustering; Service Quality Identification; Silhouette Score.

Article History: Received: May 25, 2023; Revised: June 30, 2023; Accepted: July 19, 2023; Published: October 2, 2023

Corresponding Author: Okfalisa Okfalisa, Informatics Engineering Department, Universitas Islam Negeri Sultan Syarif Kasim Riau, Indonesia. Email: okfalisa@gmail.

This is an open access article under the CC BY-SA license.

INTRODUCTION
Service quality is briefly defined as how well a service meets customer expectations. Service quality also concerns the usage value that customers experience, which advises service managers to focus deeply on continuous quality management by facilitating user value creation. Service quality gives the company suggestions for improving its services and increasing customer satisfaction. Company quality is closely related to customer satisfaction. To evaluate customer satisfaction, the company must have a performance evaluation system and quality tools for performance measurement. These tools provide quality analysis measurement and company performance. Thus, customer satisfaction, company reputation and trust, as the key antecedents of customer loyalty and purchase intention, can be controlled and enhanced. Customer loyalty was the most important positive consequence of customer satisfaction. The above findings reveal a significant positive relationship between customer trust and loyalty. Meanwhile, customer satisfaction and perception of service quality highly contribute to indistinguishable loyalty predictors, directly influencing perceived value, corporate image,
company reputation, and customer loyalty. Besides, customer satisfaction had a significant and sizeably positive effect on return intention.

Dealers are companies or individuals engaged in the liaison between producers and consumers. In this study, the intended dealer is limited to a place that sells vehicles and provides consumer services such as maintenance and the supply of vehicle products. To be sustained, the dealers need to maintain relationships with consumers; thus, it impacts the customer satisfaction demand. The main dealer controlling the service and maintenance system always scrutinizes whether each branch dealer has met the consumer service standards and qualities. Therefore, this study tries to identify the service quality in maintaining customer satisfaction as one of the company services, to ensure that consumers obtain similar treatment at every dealer without any differences.

Clustering, classification and prediction in various disciplines have developed and shown significant contributions to fields of knowledge, including medicine, finance, quality management, industry, technology, and even molecular biology and bioinformatics. K-Means is one of the most widely used algorithms for solving clustering identification problems. K-Means, an iterative clustering analysis algorithm, is effective, although the number of clusters is difficult to determine and the result is sensitive to the initialization of the cluster centers. K-Means is a simple clustering algorithm that is easy to implement, relatively fast and efficient. Regarding the centroid and connectivity models, Karthikeyan et al. found that K-Means performed better than agglomerative hierarchical clustering in terms of execution time and memory for both small and large datasets. Due to the low complexity of K-Means, Wiharto and Esti
identified faster computation compared to Fuzzy C-Means (FCM), especially for the image detection process. Zhou and Yang concluded that for data sets with significantly uneven distributions, K-Means clustering is a better choice than FCM clustering. Unfortunately, K-Means is a local optimization technique with several weaknesses: it is sensitive to the selection of the initial cluster midpoints, which can lead to high errors and poor clustering results.

Previous studies have examined how to optimize the number of clusters in K-Means and thus improve clustering performance. A noise-based K-Means clustering algorithm has been proposed to address two limitations of K-Means clustering: the difficulty of determining the number of clusters and the sensitivity to the initialization of the cluster centers. Adopting the Elbow testing approach with K-Means yields a subjective estimate of the optimal number of clusters in a dataset. Whether the elbow position on the line graph is clear or unclear determines whether the optimal number of clusters is estimated with high probability or the method fails to work properly. Besides the Elbow testing approach, the Sum of Squared Errors (SSE), runtime, silhouette coefficient (SC), Calinski-Harabasz index (CHI), and Davies-Bouldin index (DBI) are commonly applied to evaluate the performance of K-Means clustering. For example, Chang et al. analyzed university students' behavior based on a fusion K-Means clustering algorithm. They assessed the effect and operating efficiency of K-Means clustering through the running time and the values of SC, CHI, DBI, and SSE. Moreover, the Davies-Bouldin index has successfully identified the optimal number of clusters in measuring the performance of automatic data clustering using a hybrid chaos optimization algorithm. Uday et al. reveal that the K-Means clustering algorithm can be significantly improved by using a better initialization technique such as the Davies-Bouldin Index (DBI). To support this, Yuan
applied the K-Means algorithm to global earthquake catalogs and earthquake magnitude prediction. The sum of squared errors, Davies-Bouldin index, Calinski-Harabasz index, and silhouette coefficient are applied to determine the number of clusters. Huiling successfully determined the optimal number of clusters with an improved k value based on the Elbow Rule. Viloria and Lezama enhanced the efficacy of the Automatic Clustering using Differential Evolution (ACDE) approach using the U Control Chart (UCC). The results show that the proposed method yields excellent performance compared to prior research for most datasets, with the optimal cluster number and the lowest DBI and Cosine Similarity (CS) measure. Thus, this approach can determine the k activation threshold in ACDE, which leads to an effective determination of the cluster number for K-Means clustering. Jun et al. applied the elbow method, which is generally used to determine the best k value. The relationship curve between SSE and k is angled, and the value of k that corresponds to this angle becomes the actual cluster number of the data. These studies motivate the application of K-Means to the dealer service identification case with an in-depth evaluation using Calinski-Harabasz, Elbow, Silhouette Score, and DBI in order to enhance the efficacy and performance of K-Means clustering.

Yolanda et al. Service quality dealer identification: the optimization of K-Means clustering. p-ISSN: 1410-2331 e-ISSN: 2460-1217

METHOD
Data Collection
Data was collected through thorough literature studies and several interviews at the dealers' Network Operational Standards section with a Network Caring Development Officer and the head of the assessors' team. The interviewees agree that identifying service quality significantly improves the company's quality and brand vision.
Therefore, starting in 2019, the company has been conducting the performance measurement qualification using separate service sections:
1. Call Survey (CS). The assessment uses telephone calls to appraise the service providers compared to actual service users.
2. Mystery Call (MC). It assesses the customer service staff's work and services, from answering the customer call to ending it, viz. dealer operation, hospitality, product knowledge, and selling skills.
3. Mystery Shoppers (MS). Mystery shoppers, also called anonymous, silent, or secret shoppers, visit service points or shops, pretend to be regular customers, observe the service delivery process, and, immediately after the service interaction, record their observations on various aspects of the service experience in a detailed questionnaire. The aspects include exterior, interior, material promotions, parking area, attribute interior, unit display, riding test unit, negotiation and dealing, customer facilities, ending buying, payment, customer services uniform, security, greeter, sales counter, booking services, service advisor area, front desk area, waiting room, mechanic area, final confirmation, front desk officer, admin Customer Relationship Management (CRM), service adviser, cashier, mechanics, and parts counter.
4. Result of the audit of Network Operations Standards (NOS). This assessment technique evaluates and monitors all aspects of customer service quality and is carried out by a team of assessors.

The service assessment above is then manually calculated using the company standard. Hence, it is transformed into a five-grade service quality classification using the standard normal distribution scale, viz. Bronze . 9%). Silver . 9%). Gold A . 9%), Gold B . 9%), and Platinum . -100%). The measurement of dealer service quality is then performed and mapped per service quality assessment section, with no summary calculation across all sections. For example, Dealer 1 is measured as Platinum for Call Survey (CS),
Gold B for Mystery Shoppers (MS), Gold A for Mystery Call (MC), and Silver for NOS. Herein, we summarized the above assessment results and transformed them into one dataset to identify the quality of dealer service in the data mining platform with selected attributes, including dealer name and the CS, MC, MS, and NOS values from semesters 1 to 4. The Knowledge Discovery in Databases (KDD) stages are conducted, including pre-processing and transformation. As a result, several necessary features with various data types are obtained from the interviews and service assessments, and the data are cleaned from 205 to 193 records. This data is then ready to be analyzed with K-Means clustering. The stages of the K-Means clustering process are depicted in Figure 1.

K-Means Clustering
The K-Means algorithm is a simple iterative clustering algorithm that uses a distance metric and the k classes specified for the data set to calculate the average distance, with each class described by its center of mass (centroid). However, K-Means has several drawbacks, especially for categorical data, whereby the most frequent value represents the k-mode centroid and needs to be initially determined by the user. Determining the optimal number of clusters is challenging, especially for data sets with little prior knowledge. A reasonable percentage of partition clustering algorithms require the number of clusters as an input parameter before training and start from a random initialization to find the optimized grouping results. Besides, the algorithm is highly sensitive to outliers. The K-Means steps are as follows.
1. Determining the value of k.
2. Calculating the distance from the data to the centroid using the Euclidean distance formula, where m indexes the features:
$$d(x_i, \mu_j) = \sqrt{\sum_{m} (x_{im} - \mu_{jm})^2}$$
3. Grouping the data based on minimum distance.
4. Updating the centroid value with the following formula.
$$C_k = \frac{1}{n_k} \sum d_i$$
Where:
n_k = the amount of data in the cluster
d_i = the values of the data included in the cluster

The iteration is repeated until the clusters no longer change. Next, the K-Means result is evaluated using several testing approaches, including the Elbow method, the Silhouette method, the Calinski-Harabasz Index, and the Davies-Bouldin Index (DBI).

Elbow Method
The Elbow Method is one of the most commonly used methods to determine the optimal number of clusters by identifying the elbow point on the curve visualization. Herein, the SSE for each number of clusters is calculated and compared to optimize the number of clusters. The elbow method reads the best k value from the graph at the position of the elbow along the SSE curve. The best k-cluster result becomes the basis for data grouping. The smaller the SSE value and the sharper the angle of the graph, the better the cluster results.
$$SSE = \sum_{i=1}^{n} (d_i)^2$$
Where:
d = distance between data and cluster center

Silhouette Method
The Silhouette approach calculates the silhouette coefficient by combining separation and cohesion. A higher silhouette coefficient indicates a better clustering.
$$SI = \frac{1}{n} \sum_{i=1}^{n} \frac{b(i) - a(i)}{\max\{a(i), b(i)\}}$$
Where:
a(i) = the average distance of sample i to other samples in the cluster
b(i) = the minimum average distance from sample i to the samples of another cluster

Calinski-Harabasz Index (CHI)
The Calinski-Harabasz Index (CHI) evaluates cluster validity based on the Inter-Cluster Sum of Squares (BSS) and Within-groups Sum of Squares (WSS) calculations. CHI measures the separation based on the distance between the centroids and the compactness based on the sum of the distances between each data point and its centroid. A compact and well-separated cluster configuration is expected to have high inter-cluster variance and relatively low intra-cluster variance.
$$CH(K) = \frac{B(K)\,(N - K)}{W(K)\,(K - 1)}$$
Where:
$$B(K) = \sum_{k=1}^{K} a_k \lVert \bar{x}_k - \bar{x} \rVert^2$$
$$W(K) = \sum_{k=1}^{K} \sum_{C(j)=k} \lVert x_j - \bar{x}_k \rVert^2$$
Where K is the number of clusters, N is the number of data, a_k is the number of data in cluster k, B(K) is the inter-cluster divergence, also called inter-cluster covariance, and W(K) is the intra-cluster divergence, also called intra-cluster covariance.

Davies-Bouldin Index (DBI)
One of the clustering validation techniques is the Davies-Bouldin Index. This technique identifies the cohesion matrix (closeness of one group) and the separation matrix (differences between groups). The smaller the resulting Davies-Bouldin Index (DBI) value, the better.
$$DBI = \frac{1}{K} \sum_{i=1}^{K} \max_{i \neq j} R_{i,j}$$
Where:
K = existing clusters
R_{i,j} = ratio between clusters i and j
max = find the largest ratio between clusters

Figure 1. Flow Process K-Means Mining

RESULTS AND DISCUSSION
K-Means Clustering
Following the pre-processing stages in K-Means clustering, the 205 data are reduced to 193 raw data; thus, they were filtered into nine features with values calculated per service assessment, including CS1 (First semester Call Survey), CS2 (Second semester Call Survey), MC1 (First quarter Mystery Call), MC2 (Second quarter Mystery Call), MC3 (Third quarter Mystery Call), MC4 (Fourth quarter Mystery Call), MS1 (First quarter Mystery Shopping), MS2 (Second quarter Mystery Shopping), and NOS (Result audit of NOS). A detailed sample of raw data is shown in Table 1. As a result of the K-Means algorithm tracking process and calculation, Table 2 shows the clustering identification process from k=1 to 10.

Evaluation of Optimum K-Values
Applying the equations for Elbow, Silhouette, and CHI, the index values are described in Table 3.

Table 1.
The Sample of Raw Data Set Values
Columns: Dealer Name, CS1, CS2, MC1, MC2, MC3, MC4, MS1, MS2, NOS
Rows: Dealer 1, Dealer 2, Dealer 3, Dealer 4, Dealer 5, Dealer 191, Dealer 192, Dealer 193

Table 2. Cluster Identification Based on K-Value
Columns: Dealer Name
Rows: Dealer 1, Dealer 2, Dealer 3, Dealer 4, Dealer 5, Dealer, Dealer, Dealer

Table 3. The Comparison of k Evaluation
Number of Clusters   SSE      CHI   Silhouette coefficient   DBI
2                    145.12   -     0.325                    1.405
3                    100.04   -     0.366                    1.038
4                    76.44    -     0.229                    1.459
5                    65.88    -     0.244                    1.320
6                    58.73    -     0.265                    1.180
7                    51.74    -     0.270                    1.211
8                    47.09    -     0.285                    1.302
9                    44.45    -     0.240                    1.351
10                   41.89    -     0.214                    1.478

Table 3 shows that the elbow of the SSE graph is most pronounced at the third cluster, with an SSE value of 100.04. The CHI reaches its optimum value of 85.35 at the third cluster. Meanwhile, the Silhouette coefficient is optimal at k=3 with 0.366. These results are also supported by the DBI, whose best value occurs at k=3 with 1.038. In a nutshell, these evaluations confirm the optimum number of clusters at k=3.

Clustering Result
Proceeding with the K-Means calculation at the optimum k value of 3, Table 4 and Figure 2 are obtained. Herein, dealers' data are grouped into three clusters: cluster 1 with 38 data, and clusters 2 and 3 with 125 and 30 data, respectively. The cluster mapping analysis for clusters 1, 2 and 3 is explained in Table 5, whereby cluster 1 is good, cluster 2 is very good, and cluster 3 is not good performance. The mean analysis of the dealers' performance per variable is depicted in Figure 3.

Figure 2. K-Means Clustering Map for k=3

Table 4. Cluster Identification for k=3
Columns: Dealer Name, CS1, CS2, MC1, MC2, MC3, MC4, MS1, MS2, NOS, CLUSTER
Rows: Dealer 1, Dealer 2, Dealer 3, Dealer 4, Dealer 5, Dealer 191, Dealer 192, Dealer 193
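The comparison summarized in Table 3 can be illustrated with a small sketch. It assumes plain K-Means with Euclidean distance and, purely for reproducibility, seeds the centroids with the first k rows (the paper does not state its initialization). The function names and the toy data are hypothetical and are not the dealer dataset; the index definitions follow the SSE, silhouette, CHI, and DBI formulas given in the Method section.

```python
import math

def dist(p, q):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def kmeans(data, k, max_iter=100):
    """Plain K-Means. Assumption: the first k rows seed the centroids."""
    centers = [list(p) for p in data[:k]]
    labels = None
    for _ in range(max_iter):
        new = [min(range(k), key=lambda j: dist(p, centers[j])) for p in data]
        if new == labels:  # assignments unchanged: converged
            break
        labels = new
        for j in range(k):
            members = [p for p, l in zip(data, labels) if l == j]
            if members:  # an empty cluster keeps its previous centre
                centers[j] = centroid(members)
    return labels

def groups(data, labels):
    """Map each cluster label to (centroid, member points)."""
    out = {}
    for c in sorted(set(labels)):
        members = [p for p, l in zip(data, labels) if l == c]
        out[c] = (centroid(members), members)
    return out

def sse(data, labels):
    """Sum of squared distances of every point to its cluster centroid."""
    return sum(dist(p, c) ** 2 for c, ms in groups(data, labels).values() for p in ms)

def silhouette(data, labels):
    """Mean silhouette coefficient; a singleton cluster contributes 0."""
    g, total = groups(data, labels), 0.0
    for p, l in zip(data, labels):
        own = g[l][1]
        if len(own) == 1:
            continue
        a = sum(dist(p, q) for q in own) / (len(own) - 1)  # dist(p, p) is 0
        b = min(sum(dist(p, q) for q in ms) / len(ms)
                for c, (_, ms) in g.items() if c != l)
        total += (b - a) / max(a, b)
    return total / len(data)

def calinski_harabasz(data, labels):
    """Between-cluster over within-cluster dispersion, scaled by (N-K)/(K-1)."""
    g, overall = groups(data, labels), centroid(data)
    between = sum(len(ms) * dist(c, overall) ** 2 for c, ms in g.values())
    return between * (len(data) - len(g)) / (sse(data, labels) * (len(g) - 1))

def davies_bouldin(data, labels):
    """Mean over clusters of the worst (scatter_i + scatter_j) / centre distance."""
    g = groups(data, labels)
    scatter = {c: sum(dist(p, ctr) for p in ms) / len(ms) for c, (ctr, ms) in g.items()}
    worst = [max((scatter[i] + scatter[j]) / dist(g[i][0], g[j][0])
                 for j in g if j != i) for i in g]
    return sum(worst) / len(g)

def evaluate(data, ks):
    """Return {k: {index name: value}} for every candidate k."""
    table = {}
    for k in ks:
        labels = kmeans(data, k)
        table[k] = {"sse": sse(data, labels),
                    "chi": calinski_harabasz(data, labels),
                    "silhouette": silhouette(data, labels),
                    "dbi": davies_bouldin(data, labels)}
    return table

# Hypothetical toy data: three tight groups of 2-D points (not the dealer data).
A = [(1, 1), (1, 2), (2, 1), (2, 2)]
B = [(8, 8), (8, 9), (9, 8), (9, 9)]
C = [(1, 8), (1, 9), (2, 8), (2, 9)]
DATA = [list(p) for trio in zip(A, B, C) for p in trio]  # interleave the groups

SCORES = evaluate(DATA, range(2, 5))
for k, row in SCORES.items():
    print(k, {name: round(value, 3) for name, value in row.items()})
```

On this toy data the silhouette and CHI peak and the DBI bottoms out at the same k, which mirrors how the four indices are read together in Table 3. SSE keeps decreasing as k grows, so it is read by its elbow rather than by its minimum.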
Figure 3. Dealer Performance Analysis per Variables (variables: CS1, CS2, MC1, MC2, MC3, MC4, MS1, MS2, NOS)

Table 5. Cluster Mapping Analysis for k=3
Columns: Dealer Name, CS1, CS2, MC1, MC2, MC3, MC4, MS1, MS2, NOS, CLUSTER
Rows: Dealer 1, Dealer 2, Dealer 3, Dealer 4, Dealer 5, Dealer 191, Dealer 192, Dealer 193

Figure 3 shows that, in general, the performance of CS1 and CS2 falls in Clusters 1 and 2 (good and very good performance). This indicates that the customers are satisfied with the service providers' assistance with products, features, promotions, communities, vehicle service explanations, and flexibility. Meanwhile, the performance of MC, MS, and NOS is still in the balance between good/very good and not good. This indicates that the dealers should pay more attention to service quality in dealer operation, hospitality, product knowledge, and selling skills. Herein, this study presents the dealers' clustering analysis by considering the CS1, CS2, MC, MS, and NOS performance variables as the analysis methods of service quality measurement at points of sale or customer service. Besides, the study found that the MS (Mystery Shoppers) and NOS (Network Operations Standards) audits produce similar cluster results, as presented in Figure 3. This bears out the statement that MS is one of the most widely used tools to monitor the quality of service and personal selling, and an effective way of testing service provision and predicting customer satisfaction and sales performance. The report that the MS technique gives organizations on the communicative competencies of call-takers is based on a seriously constrained notion of their regular activity on the phone and the range of contingencies they must deal with. Meanwhile, compared to the previous company measurement with its five-grade result calculation, this K-Means classification with three optimum classes offers highly valuable data measurement per dealer's service assessment variable and general performance.
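The per-variable cluster means discussed above could be computed along the following lines. This is only a sketch: the scores are hypothetical rather than values from the paper, the function names are illustrative, and naming the clusters by ascending overall mean is an assumption, since the paper does not state how the good, very good, and not good labels were assigned.

```python
from statistics import mean

FEATURES = ["CS1", "CS2", "MC1", "MC2", "MC3", "MC4", "MS1", "MS2", "NOS"]

def cluster_profiles(rows, labels):
    """Mean of every feature per cluster: {cluster: {feature: mean}}."""
    profiles = {}
    for c in sorted(set(labels)):
        members = [r for r, l in zip(rows, labels) if l == c]
        profiles[c] = {f: mean(r[f] for r in members) for f in FEATURES}
    return profiles

def name_clusters(profiles, names=("not good", "good", "very good")):
    """Assumption: names follow the ascending overall mean of each cluster."""
    order = sorted(profiles, key=lambda c: mean(list(profiles[c].values())))
    return dict(zip(order, names))

# Hypothetical scores for four dealers (not values from the paper).
ROWS = [dict.fromkeys(FEATURES, v) for v in (80, 95, 60, 84)]
LABELS = [1, 2, 3, 1]  # cluster labels as a K-Means run might return them

PROFILES = cluster_profiles(ROWS, LABELS)
NAMES = name_clusters(PROFILES)
for c in PROFILES:
    print(c, NAMES[c], PROFILES[c])
```

A profile of this kind keeps the per-variable view that the five-grade scale reports section by section, while adding one overall group per dealer.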
Thus, the quality measurement can be enhanced.

CONCLUSION
This research has successfully identified the dealer service quality. Thus, the company can maintain customer satisfaction by intensifying the dealer service performance. The optimum number of quality performance clusters is proposed by adopting the K-Means algorithm with four evaluation techniques: the Elbow Method, the Silhouette Score, the Davies-Bouldin Index (DBI), and the Calinski-Harabasz index. As a result, k=3 gives well-separated and precise distribution shapes in grouping the dealer service quality performance into cluster one for good, and clusters two and three for very good and not good performance, respectively. This dealer grouping proposes a novel and auspicious service quality performance schema instead of the mathematical normal distribution standard. Besides the dealer service quality clustering, this result also shows that Mystery Shoppers (MS) is an effective tool for monitoring and testing service provision and predicting customer satisfaction and sales performance. Therefore, these results can be used as a guideline for the management level in improving performance measurement tools and evaluation schema, dealer performance and sales service quality.

ACKNOWLEDGMENT
This research was supported by Universitas Islam Negeri Sultan Syarif Kasim Riau, Pekanbaru, Indonesia, and by Honda Motorbike Dealer at Pekanbaru, Indonesia, for the data. We also thank our colleagues Lestari Handayani and Yusra from Universitas Islam Negeri Sultan Syarif Kasim Riau, who provided insights and expertise that greatly assisted and enriched this research.

REFERENCES