IPTEK The Journal of Technology and Science, 35, 2024. /pISSN: 2088-2033, 0853-4. DOI: 10.12962/j20882033.
Received 21 Dec, 2023. Revised 3 Jan, 2024. Accepted 13 Jan, 2024.

ORIGINAL RESEARCH

PATIENT SEGMENTATION BASED ON CUSTOMER LIFETIME VALUE ANALYSIS USING RECENCY, FREQUENCY, MONETARY AND INTERPURCHASE TIME MODELS

Mohamad Shodikin1 | Chastine Fatichah*1 | Nungky Taniasari2

1 Informatics Department, Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia
2 Faculty of Health Sciences, Universitas Anwar Medika, Sidoarjo, Indonesia

Correspondence: *Chastine Fatichah, Dept of Informatics, Institut Teknologi Sepuluh Nopember, Surabaya, Indonesia. Email: chastine@if.
Present Address: Gedung Teknik Informatika, Jl. Teknik Kimia, Kampus ITS Sukolilo, Surabaya 60111, Indonesia

Abstract
Hospitals strategically provide quality health services to the surrounding community. One of the strategic roles is realized in inpatient and outpatient services. Based on the background above, the author examines patient segmentation based on Customer Lifetime Value (CLV) analysis using the RFMT model. The research stages include calculating the RFMT score, clustering using the K-Means and DBSCAN algorithms, and patient segmentation based on CLV analysis. The dataset in this study was obtained from inpatient and outpatient visits at a hospital from January to December 2022. The research results show four segments of outpatients and four segments of inpatients based on CLV values: Champions, Loyal Customers, Potential Loyalists, and Lost Customers. Inpatient segmentation includes the Champions category, 473 patients belonging to 14 clusters (15.56%). The Loyal Customers category is 1,727 patients who are members of 31 clusters (34.44%). The Potential Loyalists category is 3,874 patients who are members of 31 clusters (34.44%). The Lost Customers category was 18,516 patients from 14 clusters (15.56%). Outpatient segmentation includes the Champions category, 3,512
patients belonging to 10 clusters (10.10%). The Loyal Customers category is 5,661 patients who are members of 24 clusters (24.24%). The Potential Loyalists category is 11,070 patients who are members of 34 clusters (34.34%). In the Lost Customers category, 23,678 patients belong to 31 clusters (31.31%).

KEYWORDS: Fuzzy AHP, CLV, Clusterization, Patient Segmentation, RFMT

Shodikin ET AL.

INTRODUCTION
A hospital is an organization that has professional and organized medical personnel and permanent medical facilities providing medical services, continuous nursing care, diagnosis, and treatment of diseases suffered by patients. The hospital has a mission to provide comprehensive health services (preventive, promotive, curative, and rehabilitative) that are of good quality and affordable, in order to improve public health. Hospitals have a strategic role in providing quality health services for the surrounding community. One of these strategic roles is realized in inpatient and outpatient services. Understanding the characteristics of inpatients and outpatients based on demographic status, financial status, preferences for types of services, and so on is necessary to determine the right marketing strategy to maintain patient visits. This is known as customer segmentation. Customer segmentation can be done with demographic, geographic, behavioral, and psychological data on customers. Elrod et al. analyzed several demographic variables (e.g., age, gender, income) to identify important customers. Customer value for a company or organization is determined by the Customer Lifetime Value (CLV) during the customer life cycle. CLV values help companies and organizations allocate the limited resources available to their customers by categorizing them and assigning a certain weight to each customer. Calculating the appropriate CLV value can help organizations categorize and classify their customers based on CLV ranking.
CLV ranking is evaluated with one of the popular models, the RFM model, which emphasizes the customers who benefit the company. Chen et al. examined customer segmentation based on the RFM model in the online retail industry. The resulting customer groups can be used to predict which customers have the highest purchasing preferences. Few previous studies have used the history of inpatient and outpatient medical services for patient segmentation. This research therefore proposes patient segmentation based on Customer Lifetime Value (CLV) analysis using the Recency, Frequency, Monetary, and Interpurchase Time (RFMT) model, and is expected to fill this gap.

PREVIOUS RESEARCHES
Customer lifetime value (CLV) is the value consumers feel when the costs are paid to obtain a product or service, where, in the end, consumers get an overall evaluation of the product or service's effectiveness. CLV refers to the present value of all benefits created by consumers for the company in the entire process of maintaining a relationship with the company. The future cash flow brought by consumers to the company and the maintenance level of consumer groups can be calculated according to the company's past transactions. In non-contractual relationships, consumers can conduct transactions with several companies simultaneously, and the costs to consumers of switching between several companies are very low. Currently, consumer lifetime value calculations focus on estimating the probability or frequency of a consumer's future repeat purchases based on recency. Customers can be divided into four categories based on the CLV value, ordered by the level of customer loyalty given by the CLV index calculation: Champions, Loyal Customers, Potential Loyalists, and Lost Customers. A model called Recency, Frequency, and Monetary (RFM) analysis was proposed by Hughes to analyze customer behavior.
Over the past few decades, the model has become a common paradigm for behavioral segmentation. The RFM model can characterize customer purchasing behavior by investigating the elapsed time from the last purchase (Recency), the number of purchases (Frequency), and the total expenditure (Monetary) of each customer in a certain period. The customer clustering model with RFM does not focus on attracting new customers but rather on identifying the best customers for marketing performance targets. Recency is the number of days since the last transaction. Frequency is the total number of transactions. Monetary is the total value of transactions during a certain period. To improve accuracy and obtain more information, Zhou et al. in 2021 added a new variable, interpurchase time (T), to capture the time between transactions since the patient's first transaction. Several new approaches have been extended from the RFM model. Yeh et al. extended the RFM model by adding "time since first purchase" (T) and "probability of churn" (C). Yoseph et al. introduced the rate of change in purchases (C) to indicate the quantity and sign of changes in customer purchasing behavior. Interpurchase Time (T), defined as the interval between two consecutive purchases by a customer in the same store or on the same website, has been frequently used to analyze shopping behavior since the mid-1960s. Vakratsas et al. used T to study the relationship between online shopping regularity and ... Similarly, Meyer Waarden introduced T to investigate the impact of loyalty programs on customer purchasing. Guo developed a multi-category inter-purchase timing model to predict purchase likelihood for individual customers. Also, Junpeng Guo et al.
utilized a multi-category inter-purchase time model to increase the effectiveness of product ...

A low R-value indicates that consumers have consumed the company's products or services more recently, so consumers are more likely to be reached by the company's marketing information. Thus, such consumers are more likely to make repeat purchases quickly, and companies can obtain greater revenues by investing fewer costs in these consumers. Meanwhile, if the R-value is large, consumers have not consumed the company's products or services for a long time. The F value refers to the frequency with which customers purchase a company's products or services in a statistical period. Generally, the greater the customer's F value, the more frequently the customer purchases the company's products or services, the higher the customer loyalty, and the greater the customer value. Customer purchase frequency is often considered in combination with purchase recency. When a customer's purchasing frequency is high, the closer the purchase time is to now, the more likely the customer will continue to consume the company's products. When the time of purchase is far from now, such customers were once valuable for some time, but now they are likely to be lost. The M value refers to the total customer consumption of a company's products or services in a statistical period. In general, the higher the number of purchases of a company's products or services by customers, the lower the amount of consumption of alternative products or services, the more loyal the customer, and the higher the value. Researchers believe that RFM reflects customers' historical consumption behavior well, and a CLV prediction model can be formed based on the three variables R, F, and M. The RFMT model assessment process is carried out for RFMT analysis using the quantile method.
Each customer is given a quantile score, with the highest frequency quantile scored 5 and the other quantiles scored 4, 3, 2, and 1 in rank order. The same process is applied to monetary. The opposite is true for recency and interpurchase time: the smallest recency and interpurchase time are given a score of 5, while the other recency and interpurchase time quantiles are scored 4, 3, 2, and 1, respectively. After the RFMT score calculation, a clustering algorithm can group the RFMT data into clusters. Clustering is the process of grouping a set of data objects with similar attribute values into one group or cluster, where each cluster formed is dissimilar to the other clusters. The purpose of clustering is to organize data into classes so that there is high intra-class similarity and low inter-class similarity.

The Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm can find high-density core samples and expand clusters from these samples. The two main parameters of the algorithm are the minimum number of samples and epsilon. The first parameter determines the minimum number of points classified as a core sample. This parameter defines the noise tolerance level of the algorithm. The DBSCAN algorithm can find clusters of any shape and effectively identify existing noise points. Meanwhile, K-Means is an algorithm that is easy to use to cluster large amounts of data and outlier data very quickly based on the level of similarity and dissimilarity of objects. K-Means clustering can progressively improve the quality of clustering. From an optimization perspective, the main goal of clustering is to maximize homogeneity within a cluster and heterogeneity between different clusters. In research by Rimelda et al.
comparing the DBSCAN and K-Means algorithms for grouping countries in the world regarding recovery rates in handling the Covid-19 pandemic, the conclusion is that the K-Means algorithm is superior to DBSCAN, with the best Silhouette Index value of 0.6902 at k = 8. In research conducted by Mustakim et al. on the application of the DBSCAN algorithm to clustering Twitter text quotes for the Pekanbaru City Election, a combination of Eps = 0.1 and MinPts = 10 gave the best Silhouette Index value.

METHOD
The process diagram used to segment patients can be seen in Figure 1. As Figure 1 shows, the research started from collecting data on outpatient and inpatient visits. The RFMT model assessment process is carried out for RFMT analysis using the quantile method. After calculating the RFMT score, the RFMT data are clustered with the DBSCAN and K-Means algorithms. Next, the quality of the DBSCAN and K-Means results is compared using the Silhouette Index (SI) method to select the best clustering algorithm. After forming clusters on the inpatient and outpatient datasets, a data normalization process was carried out on the Recency (R), Frequency (F), Monetary (M), and Interpurchase Time (T) attributes using the Min-Max method for the CLV analysis process. Determination of CLV in this study uses RFMT values normalized with the Min-Max method and the weighted RFMT method.

FIGURE 1 Research method

Datasets
This study uses two historical datasets of patient visits: inpatient visits and outpatient visits. The dataset was obtained from the SIMRS of Private Hospital X from January to December 2022. Table 1 shows that the outpatient dataset consists of 10 features and 185,196 visit records. The outpatient dataset shows every patient who visited the hospital to access outpatient clinic services.
These are identified based on patient ID, visit ID, visit date, guarantor, billing, gender, patient age, and medical diagnosis. Age category data based on the Indonesian Ministry of Health is divided into six groups, namely toddlers (up to 5 years), children (up to 11 years), teenagers (up to 25 years), adults (up to 45 years), elderly (up to 65 years), and seniors (above 65 years). The patient's medical diagnosis data uses International Classification of Diseases (ICD) codes.

TABLE 1 The features on outpatient services datasets.

| Feature | Description |
|---|---|
| Patient ID | A unique number that each patient has |
| Visit ID | Number of patient visits to the hospital |
| Date of visit | Date of patient visit to hospital |
| Destined clinic | The destination clinic service location |
| Service guarantor | How to pay or guarantee patient services |
| Service bill | Total service charge bill |
| Gender | Patient gender |
| Age group | Patient age group categories (toddlers, children, teenagers, adults, elderly, seniors) |
| Primary medical diagnosis | ICD10 primary medical diagnosis of the patient |
| Secondary medical diagnosis | ICD10 secondary medical diagnosis of the patient |

TABLE 2 The features on inpatient services datasets.

| Feature | Description |
|---|---|
| Patient ID | A unique number that each patient has |
| Visit ID | Number of patient visits to the hospital |
| Date of visit | Date of patient visit to hospital |
| Service guarantor | How to pay or guarantee patient services |
| Patient guarantor class | The class of service rights that the patient has at the time of service |
| Service class | Class of service received by the patient during service |
| Service bill | Total service charge bill |
| Gender | Patient gender |
| Age group | Patient age group categories (toddlers, children, teenagers, adults, elderly, seniors) |
| Primary medical diagnosis | ICD10 primary medical diagnosis of the patient |
| Secondary medical diagnosis | ICD10 secondary medical diagnosis of the patient |
| Length of hospitalization | The patient's length of stay in hospital |

Table 2 shows that the inpatient dataset consists of 12 features and 29,687 visit records. The inpatient dataset shows every patient who visited the hospital to access inpatient services, identified based on patient ID, visit ID, visit date, guarantor, bill, guarantor class, service class, gender, patient age, length of stay, and medical diagnosis. The same Indonesian Ministry of Health age categories and ICD diagnosis codes are used as for the outpatient dataset.

In summary, the outpatient dataset consists of 10 features and 185,196 visit records. Outpatients who had visited more than once numbered 25,101, or 57% of the total outpatients. The inpatient dataset consists of 12 features and 29,687 visit records. Inpatients who had visited more than once numbered 3,287, or 13% of the total inpatients.

RFMT Analysis
The patient medical record number attribute in the outpatient and inpatient datasets is used to determine the RFMT values per patient. The time interval from the last patient visit to the data collection date is calculated as the Recency (R) value. The number of patient visits is calculated as the Frequency (F) value. All service bills from the same patient are accumulated as the Monetary (M) value. The fourth variable,
Interpurchase Time (T), measures the average time between consecutive visits. Suppose the customer's first and last purchase dates are denoted T1 and Tn, respectively. In that case, the customer's holistic shopping cycle (L) can be approximated by the months between T1 and Tn, and thus the value of T (in months) can be calculated using equation 1.

$$T = \frac{L}{F - 1} = \frac{T_n - T_1}{F - 1} \tag{1}$$

Only patients who made at least two purchases (F >= 2) in a given period were considered to calculate T. After preprocessing the RFMT data, the dataset included 185,196 outpatient visits involving 43,921 patients and 29,529 inpatient visits involving 24,590 patients. Tables 3 and 4 show the RFMT values for outpatients and inpatients.

TABLE 3 The RFMT value for outpatient (columns: Value, Mean, Minimum, Maximum).

TABLE 4 The RFMT value for inpatient (columns: Value, Mean, Minimum, Maximum).

TABLE 5 Inpatient RFMT scoring rules

| Category | Score | R (Days) | F (visits) | M (million IDR) | T (Days) |
|---|---|---|---|---|---|
| Very low | 1 | > 210 | 1 | < 0.6 | 0 or > 210 |
| Low | 2 | 150 – 210 | 2 | 0.6 – 3 | 150 – 210 |
| Medium | 3 | 90 – 150 | 3 | 3 – 6 | 90 – 150 |
| High | 4 | 60 – 90 | 4 | 6 – 9 | 60 – 90 |
| Very high | 5 | < 60 | > 4 | > 9 | < 60 |

Table 3 covers the yearly outpatient visits of 43,921 patients. The average Recency (R) value for outpatients is 137 days, with a minimum of 1 day and a maximum of 365 days. The Frequency (F) value averaged 5 visits in the research period, with a minimum of 1 visit and a maximum of 97 visits a year. Based on the transaction value (M), the average outpatient spends IDR 1,234,810 per year. The minimum amount is IDR 7,000, and the maximum is IDR 27,636,522. The average interval between visits was 12 days. The largest interval between visits was 177 days.
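The R, F, M, and T definitions above can be illustrated with a short sketch. This is a minimal illustration and not the authors' implementation: the function name `rfmt_per_patient`, the record layout `(patient_id, visit_date, bill)`, and the sample visits are hypothetical. It also assumes that T is measured in days (the unit used in the RFMT tables, whereas equation 1 is stated in months) and that patients with a single visit receive T = 0, which matches the "0" entry of the lowest T category in the scoring rules.

```python
from collections import defaultdict
from datetime import date


def rfmt_per_patient(visits, reference_date):
    """Compute R, F, M, T per patient.

    visits: iterable of (patient_id, visit_date, bill) tuples (hypothetical layout).
    reference_date: the data collection date used to measure recency.
    """
    grouped = defaultdict(list)
    for patient_id, visit_date, bill in visits:
        grouped[patient_id].append((visit_date, bill))

    result = {}
    for patient_id, rows in grouped.items():
        dates = sorted(d for d, _ in rows)
        first, last = dates[0], dates[-1]
        frequency = len(rows)                      # F: number of visits
        recency = (reference_date - last).days     # R: days since last visit
        monetary = sum(b for _, b in rows)         # M: accumulated bills
        # T = (Tn - T1) / (F - 1), defined only for F >= 2.
        # Assumption: single-visit patients get T = 0.
        if frequency >= 2:
            interpurchase = (last - first).days / (frequency - 1)
        else:
            interpurchase = 0.0
        result[patient_id] = {
            "R": recency, "F": frequency, "M": monetary, "T": interpurchase,
        }
    return result


# Hypothetical sample data, for illustration only.
visits = [
    ("P1", date(2022, 1, 10), 150_000),
    ("P1", date(2022, 3, 11), 200_000),
    ("P1", date(2022, 5, 10), 100_000),
    ("P2", date(2022, 12, 1), 500_000),
]
rfmt = rfmt_per_patient(visits, reference_date=date(2022, 12, 31))
print(rfmt["P1"])  # {'R': 235, 'F': 3, 'M': 450000, 'T': 60.0}
```

For patient P1, the first and last visits are 120 days apart and F = 3, so T = 120 / (3 - 1) = 60 days.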
Table 4 covers the yearly inpatient visits of 24,590 patients. The average Recency (R) value for inpatients is 168 days, with a minimum of 1 day and a maximum of 365 days. The Frequency (F) value averages one hospitalization per year, with a minimum of 1 visit and a maximum of 21 visits a year. Based on the transaction value (M), the average cost for inpatients is IDR 6,931,461 per year. The minimum amount is IDR 544,502, and the maximum is IDR 207,421,522. The average interval between visits was 6 days. The largest interval between visits was 179 days.

To analyze clusters, we need to categorize the parameters R, F, M, and T into five categories (Very Low, Low, Medium, High, and Very High) with quintile scores of 1 to 5. Based on the Focus Group Discussion (FGD) results with Hospital Management, we agreed to categorize the RFMT parameters as shown in Tables 5 and 6.

According to Table 5, the inpatient category is very low if the patient's last visit was more than 210 days ago, the frequency of visits is only 1 in one period, the patient's bill value is less than IDR 600,000, and the interval between visits is zero or greater than 210 days. The category is low if the patient's last visit was 150 – 210 days ago, the frequency of visits is 2 in one period, the bill value is between IDR 600,000.00 and IDR 3,000,000.00, and the interval between visits is 150 – 210 days. The category is medium if the patient's last visit was 90 – 150 days ago, the frequency of visits is 3 in one period, the bill value is between IDR 3,000,000.00 and IDR 6,000,000.00, and the interval between visits is 90 – 150 days.
The category is high if the patient's last visit was 60 – 90 days ago, the frequency of visits is 4 in one period, the bill value is between IDR 6,000,000.00 and IDR 9,000,000.00, and the interval between visits is 60 – 90 days. The category is very high if the patient's last visit was less than 60 days ago, the frequency of visits is more than 4 in one period, the bill value is more than IDR 9,000,000.00, and the interval between visits is less than 60 days.

TABLE 6 Outpatient RFMT scoring rules

| Category | Score | R (Days) | F (visits) | M (IDR) | T (Days) |
|---|---|---|---|---|---|
| Very low | 1 | > 180 | 1 | < 7,000 | 0 or > 170 |
| Low | 2 | 90 – 180 | 2 – 5 | 7,000 – 100,000 | 90 – 170 |
| Medium | 3 | 30 – 90 | 5 – 10 | 100,000 – 400,000 | 30 – 90 |
| High | 4 | 15 – 30 | 10 – 15 | 400,000 – 1,200,000 | 15 – 30 |
| Very high | 5 | < 15 | > 15 | > 1,200,000 | < 15 |

According to Table 6, the outpatient category is very low if the patient's last visit was more than 180 days ago, the frequency of visits is only 1 in one period, the patient's bill value is less than IDR 7,000.00, and the interval between visits is zero or greater than 170 days. The category is low if the patient's last visit was 90 – 180 days ago, the frequency of visits is 2 – 5 in one period, the bill value is between IDR 7,000.00 and IDR 100,000.00, and the interval between visits is 90 – 170 days. The category is medium if the patient's last visit was 30 – 90 days ago, the frequency of visits is 5 – 10 in one period, the bill value is between IDR 100,000.00 and IDR 400,000.00, and the interval between visits is 30 – 90 days. The category is high if the patient's last visit was 15 – 30 days ago, the frequency of visits is 10 – 15 in one period, the bill value is between IDR 400,000.00 and IDR 1,200,000.00, and the interval between visits is 15 – 30 days.
The category is very high if the patient's last visit was less than 15 days ago, the frequency of visits is more than 15 in one period, the bill value is more than IDR 1,200,000.00, and the interval between visits is less than 15 days.

Clustering
Next, the data clustering process was carried out using the DBSCAN and K-Means algorithms on the dataset of RFMT quintile scores. In the clustering experiment, we used different epsilon (Eps) and minimum number of samples (MinPts) parameter values in DBSCAN and different K parameter values in K-Means. In this study, the Silhouette Index (SI) was used to evaluate the quality of the clustering results.

Density-Based Spatial Clustering of Applications with Noise (DBSCAN)
The algorithm can find high-density core samples and expand clusters from them. Two main parameters of the algorithm determine clusters: MinPts and Eps. The MinPts parameter determines the minimum number of points that can be classified as a core sample, and the Eps parameter defines the algorithm's noise tolerance level (Pietrzykowski). The steps are as follows:
1. Determine the values of the MinPts and Eps parameters.
2. Randomly determine the starting point p.
3. Calculate Eps, that is, all distances from points that are density-reachable from p, using the Euclidean distance formula in equation 2.

$$d_{ij} = \sqrt{\sum_{a=1}^{p} \left(x_{ia} - x_{ja}\right)^2} \tag{2}$$

   where $x_{ia}$ is variable $a$ of object $i$ ($i = 1, \dots, n$; $a = 1, \dots, p$) and $d_{ij}$ is the Euclidean distance value.
4. A cluster is formed when the number of points within Eps is greater than MinPts, and point p is then a core point.
5. Repeat steps 3 – 4 until the process has been carried out at all points.
6. If p is a border point and no points are density-reachable from p, the process continues to another point.

K-Means
K-Means is a data grouping method that can group existing data into two or more groups.
This method groups data so that data with similar characteristics are placed in the same group and data with different characteristics are placed in different groups (Wang et al.). According to Tan, the stages of the K-Means algorithm are as follows:
1. Determine the k value, that is, the number of clusters.
2. Determine the initial value of the centroid, or cluster center point, randomly. The next stages use equation 3.

$$V_{ij} = \frac{1}{N_i} \sum_{k=1}^{N_i} X_{kj} \tag{3}$$

   With:
   $V_{ij}$ = cluster centroid of cluster $i$ for the variable $j$
   $N_i$ = the amount of data in cluster $i$
   $i, k$ = index of the cluster
   $j$ = index of the variable
   $X_{kj}$ = the value of data item $k$ in the cluster for the variable $j$
3. Calculate the distance between the centroid point and the point of each object using the Euclidean distance in equation 4.

$$D_e = \sqrt{(x_i - s_i)^2 + (y_i - t_i)^2} \tag{4}$$

   With:
   $D_e$ = Euclidean distance
   $i$ = amount of data
   $(x, y)$ = data coordinates
   $(s, t)$ = centroid coordinates
4. Group the data to form clusters, assigning each object to the centroid at the minimum distance from it.
5. Update the centroid value of each cluster.
6. Repeat steps 2 to 5 until the centroid values no longer change.

Silhouette Index (SI)
One method for measuring the optimality of clustering results is the Silhouette Index method. The Silhouette Index is a validation measure based on internal criteria (Nahdliyah et al.): the placement of objects in each cluster is evaluated by comparing the average distance of objects within the same cluster and to different clusters (Wang et al.). The Silhouette Index value can be calculated using equation 5.

$$SI = \frac{1}{N} \sum_{i=1}^{N} \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}} \tag{5}$$

With:
$a(i)$ = the average distance of sample $i$ to the other samples in its cluster
$b(i)$ = the minimum distance from sample $i$ to another cluster
$N$ = the number of samples

Min-Max Normalization
After forming clusters on the inpatient and outpatient datasets, the data normalization process is carried out on the Recency (R), Frequency (F), Monetary (M), and Interpurchase Time (T) attributes using the Min-Max method for the CLV value calculation. Min-Max Normalization is a data processing technique used to rescale the value of a variable to a certain range. Min-Max adjusts to the specified limits while preserving the relationships in the original data (Patro et al.). This normalization technique transforms numerical attributes to a smaller scale, such as 0 to 1 (Junaedi et al.), so that the lowest value limit is 0 and the highest is 1. Min-Max normalization can be calculated using equation 6 (Khajvand et al.).

$$X' = \frac{x - \min_a}{\max_a - \min_a}\,(\mathrm{newmax} - \mathrm{newmin}) + \mathrm{newmin} \tag{6}$$

With:
$X'$ = normalized value
$x$ = original value to be normalized
$\min_a$ = lowest value of each variable
$\max_a$ = highest value of each variable
newmin = minimum of the target range of $X'$, which has a value of 0
newmax = maximum of the target range of $X'$, which has a value of 1

Segmentation Based on CLV Analysis
The RFMT-based CLV method provides a reliable basis for measuring customer lifetime value and understanding market segmentation with various recency, frequency, and monetary values. There are many assessment methods for these three variables. In the first study, Arthur Hughes (the founder of the RFM model) considered a scoring method that separated customers into five equal groups. According to him, the three variables are given the same weight to calculate the composite score (Hughes). Meanwhile, Stone
states that different businesses have different characteristics and assign different variable weights to the RFM measure according to the nature of the business. For example, he proposed the order F, R, and M to analyze the value offered by customers who have used credit cards, because frequency of use is more important in such business cases. The CLV application in this research uses RFMT values normalized using the Min-Max method. RFMT is weighted based on relevant company-level assessments through an analytic hierarchy process (AHP) using the Fuzzy AHP technique. Fuzzy AHP is an analysis method developed from traditional AHP. Although AHP is commonly used to handle qualitative and quantitative criteria in MCDM, Fuzzy AHP is considered better at describing vague decisions than traditional AHP (Kong). The RFMT weights calculated using the Fuzzy AHP technique are shown in Table 7.

TABLE 7 The RFMT weight using Fuzzy AHP technique (rows: Outpatient, Inpatient).

Table 7 shows that the Fuzzy AHP calculation yields different RFMT weights for outpatients and inpatients. For outpatients, the weights rank from highest to lowest as F, R, T, and M. For inpatients, they rank as M, F, R, and T. The weighted RFMT method is used to calculate the CLV for each cluster, using equation 7.

$$CLV_{c_i} = NR_{c_i} \times WR_{c_i} + NF_{c_i} \times WF_{c_i} + NM_{c_i} \times WM_{c_i} + NT_{c_i} \times WT_{c_i} \tag{7}$$

With:
$NR_{c_i}$ = normalized Recency value of cluster $c_i$
$WR_{c_i}$ = weight of Recency
$NF_{c_i}$ = normalized Frequency value of cluster $c_i$
$WF_{c_i}$ = weight of Frequency
$NM_{c_i}$ = normalized Monetary value of cluster $c_i$
$WM_{c_i}$ = weight of Monetary
$NT_{c_i}$ = normalized Interpurchase Time value of cluster $c_i$
$WT_{c_i}$ = weight of Interpurchase Time

In descending order of CLV index, the customer segments are Champions, Loyal Customers, Potential Loyalists, and Lost Customers. Based on the results of the Focus Group Discussion (FGD) with Hospital Management, the segmentation levels for the CLV value were agreed as shown in Table 8.

TABLE 8 The patient segmentation on CLV value.

| Segment | CLV value |
|---|---|
| Champions | >= 0.… |
| Loyal Customers | 0.60 – 0.… |
| Potential Loyalists | 0.40 – 0.… |
| Lost Customers | < 0.… |

According to Table 8, a patient cluster is assigned to the Champions category if its CLV value is >= 0.…, to the Loyal Customers category if its CLV value is between 0.60 and 0.…, to the Potential Loyalists category if its CLV value is between 0.40 and 0.…, and to the Lost Customers category if its CLV value is less than 0.….

RESULTS AND DISCUSSION
Understanding the characteristics of each cluster group can help hospital management implement more specific marketing strategies in the outpatient and inpatient segments to achieve better patient retention. The distribution of RFMT on a discrete scale can be seen in Figure 2 and Figure 3. Figure 2 shows the distribution for outpatients: the largest R-score group is 15,000 patients in category 1 (very low), the largest F-score group is 25,000 patients in category 1 (very low), the largest M-score group is 18,000 patients in category 3 (medium), and the largest T-score group is 19,000 patients in category 1 (very low). The RFMT variables show an uneven distribution. Patients with the same scores had similar outpatient or inpatient visit behavior.
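The Min-Max normalization (equation 6), the weighted RFMT sum (equation 7), and the threshold-based segmentation of Table 8 described in the Method section can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. The inputs are assumed to be cluster-level mean RFMT scores on the 1 to 5 scale, where a higher score is better for all four attributes. The weights are hypothetical placeholders, since the Fuzzy AHP values of Table 7 are not legible in the source. Of the cutoffs, only the 0.60 and 0.40 lower bounds are legible in Table 8; the 0.80 Champions cutoff is an assumption made for illustration.

```python
def min_max(values, new_min=0.0, new_max=1.0):
    """Min-Max normalization (equation 6) of a list of numbers."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case: all values identical, map everything to new_min.
        return [new_min for _ in values]
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def cluster_clv(cluster_scores, weights):
    """Weighted RFMT sum (equation 7) for each cluster.

    cluster_scores: list of dicts with keys "R", "F", "M", "T"
                    (assumed cluster-level mean scores, higher is better).
    weights: dict with keys "R", "F", "M", "T" (hypothetical values here).
    """
    keys = ("R", "F", "M", "T")
    normalized = {k: min_max([c[k] for c in cluster_scores]) for k in keys}
    return [
        sum(normalized[k][i] * weights[k] for k in keys)
        for i in range(len(cluster_scores))
    ]


def segment(clv, champions_cutoff=0.80, loyal_cutoff=0.60, potential_cutoff=0.40):
    """Map a CLV value to a segment.

    champions_cutoff=0.80 is an assumption; 0.60 and 0.40 are the
    lower bounds legible in Table 8.
    """
    if clv >= champions_cutoff:
        return "Champions"
    if clv >= loyal_cutoff:
        return "Loyal Customers"
    if clv >= potential_cutoff:
        return "Potential Loyalists"
    return "Lost Customers"


# Hypothetical cluster scores and weights, for illustration only.
clusters = [
    {"R": 5, "F": 5, "M": 5, "T": 5},
    {"R": 3, "F": 3, "M": 3, "T": 3},
    {"R": 1, "F": 1, "M": 1, "T": 1},
]
weights = {"R": 0.3, "F": 0.4, "M": 0.2, "T": 0.1}

clv_values = cluster_clv(clusters, weights)
segments = [segment(v) for v in clv_values]
print(segments)
```

Because the weights sum to 1 and each normalized attribute lies in [0, 1], the resulting CLV also lies in [0, 1], which is what makes fixed cutoffs such as those in Table 8 applicable.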
Clustering Inpatient RFMT Dataset
The Elbow method results for the inpatient dataset in Figure 4 show that the best Eps parameter value for the inpatient dataset is 0.8. The Elbow method results in Figure 5 show that the best k parameter value for the inpatient dataset is 10. The quality of the inpatient RFMT clustering results from the DBSCAN and K-Means algorithms was tested using Silhouette Index (SI) values. Clustering with the K-Means algorithm gives a best SI value of 0.58 at an optimal k value of 10. Clustering with the DBSCAN algorithm gives a best SI value of 0.99 with an Eps value of 0.8 and a MinPts value of 8.

Clustering the Outpatient RFMT Dataset
The Elbow method results for the outpatient dataset in Figure 6 show that the best Eps parameter value for the outpatient dataset is 0.9. The Elbow method results in Figure 7 show that the best k parameter value for the outpatient dataset is 7. The quality of the outpatient RFMT clustering results from the DBSCAN and K-Means algorithms was tested using Silhouette Index (SI) values. Clustering with the K-Means algorithm gives a best SI value of 0.47 at an optimal k value of 7. Clustering with the DBSCAN algorithm gives a best SI value of 0.91 with an Eps value of 0.9 and a MinPts value of 75.

FIGURE 2 The distribution of outpatient RFMT: (a) R rank, (b) F rank, (c) M rank, and (d) T rank.

TABLE 9 The RFMT value based on segmentation inpatients (rows: Champions, Loyal Customers, Potential Loyalists, Lost Customers; columns: RFMT value, Mean, Min, Max).

Patient Segmentation
After forming clusters on the inpatient and outpatient datasets, the data normalization process was carried out on the Recency (R), Frequency (F),
Monetary (M), and Interpurchase Time (T) attributes using the min-max method, and the CLV value was calculated with the weighted RFMT method, whose weights were obtained using the Fuzzy AHP technique. Patients are divided into four segments according to the CLV value of their cluster: 1. Champions, 2. Loyal Customers, 3. Potential Loyalists, and 4. Lost Customers. RFMT values by segment are presented in Table 9 for inpatients and Table 10 for outpatients.

FIGURE 3 The distribution of inpatient RFMT: (a) R score, (b) F score, (c) M score, and (d) T score.

FIGURE 4 Silhouette Index for finding optimal eps and minPts values for the DBSCAN algorithm

FIGURE 5 Silhouette Index for finding the most optimal k value for the K-Means algorithm

FIGURE 6 Silhouette Index for finding optimal eps and minPts values for the DBSCAN algorithm

Based on Table 9, the inpatient segments can be characterized as follows. In the Champions segment, the average inpatient transaction value is more than IDR 9,000,000, the last visit to the hospital was 90 days ago, the frequency of visits is 4 times in one period, and the interval between visits is 60-90 days. In the Loyal Customers segment, the average inpatient transaction value is more than IDR 6,000,000, the last visit to the hospital was 90 days ago, the frequency of visits is 3 times in one period, and the interval between visits is 60-90 days. In the Potential Loyalists segment, the average inpatient transaction value is IDR 3,000,000-IDR 6,000,000, the last visit to the hospital was 90 days ago, the frequency of visits is 1 time in one period, and the interval between visits is 90-150 days. In the Lost Customers segment, the average inpatient transaction value is IDR 3,000,000-IDR 6,000,000, the last visit to the hospital was 150-210 days ago, and the frequency of visits is 1 time in one period or the interval between visits is more than 210 days.
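The normalization and weighting steps described for patient segmentation can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: each cluster is represented by R, F, M, and T scores already oriented so that a higher score is better, the weights are hypothetical stand-ins for the Fuzzy AHP weights (which are not given in this section), and the names `min_max` and `weighted_clv` are illustrative.

```python
def min_max(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def weighted_clv(clusters, weights):
    """Return one CLV value per cluster.

    clusters: list of dicts with keys "R", "F", "M", "T" (higher is better).
    weights:  dict with the same keys; assumed to sum to 1.
    """
    keys = ("R", "F", "M", "T")
    # Normalize each attribute across all clusters.
    normalized = {k: min_max([c[k] for c in clusters]) for k in keys}
    # CLV of cluster i is the weighted sum of its normalized attributes.
    return [
        sum(weights[k] * normalized[k][i] for k in keys)
        for i in range(len(clusters))
    ]


if __name__ == "__main__":
    # Hypothetical cluster scores and weights, for illustration only.
    clusters = [
        {"R": 5, "F": 4, "M": 5, "T": 4},
        {"R": 1, "F": 1, "M": 2, "T": 1},
        {"R": 3, "F": 2, "M": 3, "T": 2},
    ]
    weights = {"R": 0.2, "F": 0.3, "M": 0.4, "T": 0.1}
    for cluster, clv in zip(clusters, weighted_clv(clusters, weights)):
        print(cluster, round(clv, 4))
```

Because every normalized attribute lies in [0, 1] and the weights sum to 1, the resulting CLV values also lie in [0, 1], which is what allows them to be compared against fixed thresholds.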
Based on Table 10, the outpatient segments can be characterized as follows. In the Champions segment, the average outpatient transaction value is IDR 1,000,000, the last visit to the hospital was 15-30 days ago, the frequency of visits is 10-15 times in one period, and the interval between visits is 15-30 days. In the Loyal Customers segment, the average outpatient transaction value is IDR 1,000,000, the last visit to the hospital was 30-90 days ago, the frequency of visits is 5-10 times in one period, and the interval between visits is 15-30 days. In the Potential Loyalists segment, the average outpatient transaction value is IDR 400,000-IDR 1,000,000, the last visit to the hospital was 90-180 days ago, the frequency of visits is 2-5 times in one period, and the interval between visits is 15-30 days. In the Lost Customers segment, the average outpatient transaction value is IDR 400,000-IDR 1,000,000, the last visit to the hospital was 90-180 days ago, and the frequency of visits is 2-5 times in one period or the interval between visits is more than 15-30 days.

FIGURE 7 Silhouette Index for finding the most optimal k value for the K-Means algorithm

TABLE 10 The RFMT value based on outpatient segmentation. Header: Segmentation (Champions, Loyal Customers, Potential Loyalists, Lost Customers); Value (Mean, Min, Max); RFMT value.

TABLE 11 Segmentation of each inpatient cluster. Header: Segmentation (Champions, Loyal Customers, Potential Loyalists, Lost Customers); Count; CLV value (Mean, Min, Max).

The results show four inpatient segments: the Champions category contains 473 patients in 14 clusters, the Loyal Customers category contains 1,727 patients in 31 clusters, the Potential Loyalists category contains 3,874 patients in 31 clusters, and the Lost Customers category contains 18,516 patients in 14 clusters.

TABLE 12 The segmentation of each outpatient cluster.
Header: Segmentation (Champions, Loyal Customers, Potential Loyalists, Lost Customers); Count; CLV value (Mean, Min, Max).

The results show four outpatient segments: the Champions category contains 3,512 patients in 10 clusters, the Loyal Customers category contains 5,661 patients in 24 clusters, the Potential Loyalists category contains 11,070 patients in 34 clusters, and the Lost Customers category contains 23,678 patients in 31 clusters.

CONCLUSION

This study used RFMT values, min-max normalization, and the weighted RFMT method to determine CLV. The results show that outpatients and inpatients can be grouped into segments of similar behavior according to each patient's RFMT values, and that the RFMT model can characterize patient visit behavior by also considering the time elapsed between visits. The RFMT model does not focus on attracting new patients but on identifying the best patients in order to improve marketing performance. This research identifies four outpatient segments and four inpatient segments based on CLV values: Champions, Loyal Customers, Potential Loyalists, and Lost Customers. The inpatient segmentation comprises 473 patients in 14 clusters (15.56% of clusters) in the Champions category, 1,727 patients in 31 clusters (34.44%) in the Loyal Customers category, 3,874 patients in 31 clusters (34.44%) in the Potential Loyalists category, and 18,516 patients in 14 clusters (15.56%) in the Lost Customers category. The outpatient segmentation comprises 3,512 patients in 10 clusters (10.10% of clusters) in the Champions category, 5,661 patients in 24 clusters (24.24%) in the Loyal Customers category, 11,070 patients in 34 clusters (34.34%) in the Potential Loyalists category, and 23,678 patients in 31 clusters (31.31%) in the Lost Customers category.
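The cluster percentages reported for each segment are consistent with that segment's share of all clusters in its dataset. The following minimal Python sketch reproduces that arithmetic from the cluster counts stated above; the function name `cluster_shares` is illustrative and not part of the study.

```python
def cluster_shares(cluster_counts):
    """Return each segment's share of clusters as a percentage."""
    total = sum(cluster_counts.values())
    return {segment: 100.0 * n / total for segment, n in cluster_counts.items()}


# Cluster counts per segment, as stated in the results.
inpatient = {
    "Champions": 14,
    "Loyal Customers": 31,
    "Potential Loyalists": 31,
    "Lost Customers": 14,
}
outpatient = {
    "Champions": 10,
    "Loyal Customers": 24,
    "Potential Loyalists": 34,
    "Lost Customers": 31,
}

if __name__ == "__main__":
    for name, counts in (("inpatient", inpatient), ("outpatient", outpatient)):
        for segment, share in cluster_shares(counts).items():
            print(f"{name:10s} {segment:20s} {share:6.2f}%")
```

For the inpatient data, 14 of 90 clusters is 15.56% and 31 of 90 is 34.44%; for the outpatient data, the 99 clusters give 10.10%, 24.24%, 34.34%, and 31.31%.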
Segmentation of inpatients and outpatients is important for hospital management and for mapping marketing strategies. Using the CLV analysis based on the RFMT model developed in this research, management can use the outpatient and inpatient segments to manage patient services better. Future research could develop a prediction model for patient visits using machine learning methods.

CREDIT

Mohammad Shodikin: Methodology, Software, Investigation, Validation, Resources, Data Curation, Writing - Original Draft, and Visualization. Chastine Fatichah: Conceptualization, Methodology, Validation, Writing - Review & Editing, and Supervision. Nungky Taniasari: Conceptualization, Methodology, Validation, Writing - Review & Editing, and Supervision.

References