ISSN 2087-3336 (Print) | 2721-4729 (Online)
TEKNOSAINS: Jurnal Sains, Teknologi dan Informatika
Vol. No. July 2024, page.
http://jurnal. id/index. php/tekno
DOI: 10.

A Design of cross-selling products based on frequent itemset mining for coffee shop business

M Zaky Hadi*, Ari Hasudungan Pratama Pasaribu, Fatin Saffanah Didin, Lina Aulia
Department of Industrial Engineering, Faculty of Industrial Technology, Institut Teknologi Sumatera, Indonesia, 35365, Jl. Terusan Ryacudu, Way Huwi, Jati Agung, Lampung Selatan, Lampung, Indonesia
* hadi@ti.

Submitted: 07/02/2024 Revised: 01/04/2024 Accepted: 07/04/2024

ABSTRACT
Using a case study of the XYZ coffee shop in Bandar Lampung, Indonesia, this work extracts knowledge from a sales dataset to build a cross-selling model for product bundling recommendations. Knowledge extraction used frequent itemset mining based on the Apriori algorithm to derive a set of association rules between products. The study followed a five-stage framework comprising business understanding, data preparation, data exploration, model creation with the Apriori algorithm, and rule evaluation. The resulting framework provides a set of bundling association rules across items for cross-selling strategies that involve many products and complex sales. Additionally, we recommend loyalty-card-based sales activities to enrich the dataset with consumer attributes and purchasing trends. The loyalty card recommendation rests on customized services and tailored offers derived from consumers' past purchases and spending patterns, so that marketing initiatives can target the appropriate customers and items.

Keywords: Apriori algorithm, frequent itemset mining, cross-selling, association rules,
data analytics

INTRODUCTION
Cross-selling is a marketing tactic in which vendors present customers with complementary products or services while they are completing a transaction. The technique entails offering customers an additional product or service alongside their purchase, or grouping items together at a discounted price. The key benefits of cross-selling include higher sales revenue, improved customer satisfaction, and, for B2B companies, higher Customer Lifetime Value (CLV) through deeper integration in a customer's business, customer-specific offers, loyalty and engagement, support for promotional programs, lower selling costs, and increased product referrals.

Cross-selling can be done successfully in a number of ways. The most widely used conventional method has the salesperson interpret and analyze the needs of the customer. Although traditional cross-selling strategies have long been in use, they have drawbacks. Their primary drawback is how heavily they depend on the salesperson's aptitude for interpreting what the customer wants. Competent salespeople can be highly successful at this, but in many circumstances they may not fully understand the customer's wants, which can cost sales and lead to lost opportunities. Traditional strategies are also time-consuming: salespeople often spend a great deal of time understanding the customer's needs before they can even begin to recommend additional products or services. This can drain resources, particularly if the salesperson is juggling several clients at once.

TEKNOSAINS: Jurnal Sains, Teknologi dan Informatika is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. DOI 10.37373/tekno.
Additionally, customer attitudes and habits are evolving rapidly. In the big data era, cross-selling strategies driven by artificial intelligence (AI) are becoming increasingly common as a means of overcoming these limitations. AI-driven cross-selling strategies that leverage analytics, data mining, and machine learning are transforming the sales environment. These methods provide new opportunities and benefits for all kinds of enterprises, and they are flexible enough to deal with the ever-changing needs of their clientele.

One popular way of using AI to enhance cross-selling is to mine item sets to derive rules about relationships between two or more products. Frequent itemset mining is the process of locating frequent item sets with the aim of establishing association rules between items in the dataset. To extract them, market basket data is analyzed for frequently recurring if/then patterns. The most significant links are then identified using the support and confidence criteria, a technique known as association rule mining. The result of association rule mining is a set of rules describing the frequency and strength of the relationships between the variables or items. This makes it easier and more intuitive for the marketing manager to recognize patterns and linkages, pinpoint the primary themes or product bundling categories, and make cross-selling decisions.

Using a case study of the marketing process at Coffee Shop XYZ in the city of Bandar Lampung, this research proposes a product cross-selling strategy that uses frequent itemset mining and association rule mining to extract knowledge. The management of Coffee Shop XYZ had introduced and promoted its products through persuasive advertising campaigns and posters affixed to the storefront.
As the market grew and business rivalry intensified, this technique was no longer effective: over the previous 6 months it improved sales by only 1-5%, while the likelihood of a significant loss of 5-10% increased. To create a new product offering plan, management had to draw on past transaction datasets from the sales information system database and use them to design cross-selling promotions or product bundling strategies. To offer product bundles and personalized recommendations, that is, to sell complementary products or services to customers while they are making a purchase, Coffee Shop XYZ must design a cross-selling strategy by mining its historical transaction datasets. By doing this, the business can increase sales while serving its customers well.

In this study, we used a knowledge extraction and data mining approach to create a cross-selling strategy that addresses these issues. Two research goals follow from using frequent itemset mining and association rule mining to extract a cross-selling strategy. First, we extracted knowledge in the form of association rules to find product pairs that often appear together in customer buying patterns in the historical transaction dataset. Second, as part of a cross-selling strategy, we created product suggestions for product offerings based on the rules discovered through knowledge extraction. Both goals must be met to obtain a rule model that is representative of past customer purchase trends; only then is the model suitable as decision support for cross-selling tactics. This study's primary contribution is new data and a design framework for marketing techniques based on frequent itemset mining in coffee shops.

The remainder of this study is organized as follows. To accomplish our research goals, we first present our research framework.
The method builds a customized framework for the data mining project to match the stated problem and improve the viability of the proposed solution. The results of our data analytics work on the transactional dataset from Coffee Shop XYZ are presented in the next section. This includes rule extraction from the dataset and both univariate and bivariate data analysis. We also report rule performance to help choose appropriate rules for Coffee Shop XYZ marketing initiatives. The viability of the extracted cross-selling approach based on association rules and frequent itemset mining was then tested through implementation, verification, and validation. The final part, "Conclusion," summarizes the results in light of the study's goals and possible future work.

METHOD
We created a conceptual framework for this study based on data mining and knowledge extraction techniques, which we then adjusted in the modeling portion to meet our goals. The conceptual framework is shown in Figure 1.

[Figure 1. Research framework. Flowchart from Start to Finish: business understanding (business objectives, expected knowledge); transaction data and data preparation (data ready to be processed); determining support and confidence; association rule extraction with the Apriori algorithm; evaluation of the performance of association rules (product combination, support, confidence, lift); first objective: combination of products and rules performance; second objective: provide a product cross-selling strategy; evaluation (verification and validation) of the product cross-selling strategy recommendations.]

To accomplish our study aims, the framework consists of five primary stages.
These stages are understanding business processes, understanding the datasets, preparing data, modeling with the Apriori algorithm, and evaluating rules.

Business and data understanding. Business understanding includes setting clear goals, identifying the problem and pain points in the business, and assessing the available resources, including time, stakeholders, and dataset requirements. The main objective is to ensure that the project can feasibly achieve the desired outcome and that it aligns with the overall and detailed business strategy. Collaboration and communication with stakeholders from many departments are also necessary at this phase to keep the project in line with the larger business plan. Consequently, Coffee Shop XYZ can undertake the data analytics project more easily. This phase strengthens the foundation of business knowledge and focuses attention on data research that could help accomplish the project's goals.

Data exploration, the initial phase of data mining, describes the dataset's attributes (such as its size, quantity, and kind) using statistical techniques and data visualization to improve our understanding of the nature of the data. This phase consists of four tasks: collecting initial data, describing the data, exploring the data, and verifying data quality.

Data preparation. As a rule of thumb, about 80% of the project effort is dedicated to data preparation. This procedure, also referred to as "data munging," gets the final dataset or datasets ready for modeling. It comprises five tasks: data selection, data cleaning, data construction, data integration, and data formatting. Data selection specifies which datasets will be used and explains the reasons for inclusion or exclusion. Data cleaning, a common procedure in data mining, includes fixing, imputing, or eliminating inaccurate values such as missing values and errors.
Association rules can then be mined using the additional attributes produced by data construction or dimensionality reduction. Data integration is the practice of combining data from several sources to create a new dataset. Formatting means reformatting data as necessary to facilitate mathematical operations.

Modeling and rule evaluation. Data mining is based on models, virtual structures that represent data grouped for predictive analysis, here in support of the cross-selling strategy. The model in this phase is built by applying an algorithm to the preprocessed data. The transaction dataset was divided into weekend and weekday categories and then further categorized into midday, evening, and nighttime. Next, the categorized dataset was visualized using descriptive analytics. During this stage, the Apriori algorithm was applied, with the minimum support and confidence determined by trial and error until product combinations were produced. We then assessed each product combination by the rule's performance in terms of support, confidence, and lift, using the following formulation, where T denotes the set of transactions.

\[ \mathrm{Supp}(X \Rightarrow Y) = \mathrm{Supp}(X \cup Y) = \frac{|\{t \in T : X \cup Y \subseteq t\}|}{|T|} \tag{1} \]

\[ \mathrm{Conf}(X \Rightarrow Y) = P(Y \mid X) = \frac{\mathrm{Supp}(X \cup Y)}{\mathrm{Supp}(X)} \tag{2} \]

\[ \mathrm{Lift}(X \Rightarrow Y) = \frac{\mathrm{Supp}(X \cup Y)}{\mathrm{Supp}(X)\,\mathrm{Supp}(Y)} \tag{3} \]

An association rule is written X ⇒ Y, where X and Y are item sets: X is the rule's antecedent and Y its consequent. Association rule mining, combined with frequent itemset mining, reveals interesting connections and correlations between the items in X and Y across large sets of data. Support in formula (1) is the proportion of transactions that contain both X and Y. Confidence in formula (2) is the conditional probability that a transaction contains the consequent Y given that it contains the antecedent X.
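The three rule metrics can be checked by hand on a few baskets. The following is a minimal Python sketch, not the authors' code (the paper reports using RStudio); the function names and the toy baskets are assumptions made for illustration, and lift is computed with its usual definition as confidence divided by the support of the consequent.

```python
from typing import FrozenSet, Iterable, List

Basket = FrozenSet[str]


def support(baskets: List[Basket], itemset: Iterable[str]) -> float:
    """Fraction of baskets that contain every item in `itemset`."""
    if not baskets:
        raise ValueError("need at least one transaction")
    wanted = frozenset(itemset)
    return sum(1 for b in baskets if wanted <= b) / len(baskets)


def confidence(baskets: List[Basket], x: Iterable[str], y: Iterable[str]) -> float:
    """Estimate of P(Y | X): support of X and Y together over support of X.

    Assumes X occurs at least once, otherwise the ratio is undefined.
    """
    x, y = frozenset(x), frozenset(y)
    return support(baskets, x | y) / support(baskets, x)


def lift(baskets: List[Basket], x: Iterable[str], y: Iterable[str]) -> float:
    """Confidence of X => Y relative to how common Y is overall."""
    return confidence(baskets, x, y) / support(baskets, y)


# Toy baskets, invented for illustration only (not the shop's data).
toy_baskets: List[Basket] = [
    frozenset({"milk coffee", "toast"}),
    frozenset({"milk coffee", "toast", "french fries"}),
    frozenset({"milk coffee"}),
    frozenset({"french fries"}),
]

rule_support = support(toy_baskets, {"milk coffee", "toast"})
rule_confidence = confidence(toy_baskets, {"milk coffee"}, {"toast"})
rule_lift = lift(toy_baskets, {"milk coffee"}, {"toast"})
```

In this toy set, two of four baskets contain both items, three contain milk coffee, and two contain toast, so the rule {milk coffee} ⇒ {toast} has support 2/4, confidence 2/3, and lift 4/3, which is a positive association because the lift exceeds 1.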
Lift is defined as the ratio of the observed joint probability of X and Y to the joint probability that would be expected if they were statistically independent. For an association rule X ⇒ Y, a lift of 1 indicates that X and Y are independent. A lift greater than 1 indicates a positive association between X and Y, and a lift less than 1 indicates a negative association. The rules were computed and extracted using RStudio.

Recommendations for a cross-selling plan, validation, and verification. Following the modeling and assessment stages, the X ⇒ Y rules were used to design cross-selling offers and product bundles, which were then offered at a discount. Verification relied on the lift, confidence, and support of each rule. For validation, the cross-selling strategy was implemented in the marketing offering program at Coffee Shop XYZ. The cross-selling program was then evaluated based on incremental sales for every product combination. This evaluation serves as confirmation of the strategy.

RESULTS AND DISCUSSION
Business understanding and data exploration were carried out to determine the business process at Coffee Shop XYZ and to explore the transaction dataset. Business understanding yields the marketing business process and product information, whereas data understanding comprises data structure analysis, exploratory data analysis, univariate data analysis, and bivariate data analysis.

Details on the item set and business processes.
The marketing business process consists of four steps: planning a marketing strategy, implementing it at the operational level in the coffee shop, holding a meeting with top-level management to evaluate marketing, and recording marketing documentation in the database management system using cashier software. The main goal of this process is to interest customers in purchasing as many goods as they can. The owner of the coffee shop, the marketing manager, the cashier, and the customer all take part in this process. Information is retrieved from the cashier software's database management system through the cashier application. The dataset includes the transaction number, the transaction date, the itemset, and the total price. The transaction dataset matrix is presented in Table 1 below.

Table 1. The result of exporting the transaction dataset from the cashier software in Coffee Shop XYZ. Columns: Transaction Number, Date and Time, Itemset, Total Price (IDR). The Transaction Number and Total Price values are missing from this copy.

Date and Time       Itemset
26/08/2021 11:24    French fries
26/08/2021 11:40    Pandan milk coffee
26/08/2021 11:54    Milk coffee
26/08/2021 11:54    Milk coffee
26/08/2021 11:54    Toast
26/08/2021 12:30    Chocolate cold
26/08/2021 13:07    Beef black paper
...                 ...

Analysis of data structures. As Table 1 shows, the data structure needs to be separated into variables so that the Apriori algorithm can process them. Because they are not used in rule extraction, the transaction number in the first column and the total price in the fourth column need to be removed. The Transaction Time column is used to distinguish midday, evening, and nighttime. Transactions per customer are created from the itemset column: multiple rows recognized as belonging to a single customer are aggregated, such as rows 3, 4, and 5, which all belong to the customer with ID 3.

Table 2. The data exploration result using RStudio.
tibble [1,169 x 5] (S3: tbl_df/tbl/data.frame)

Variables         Data type          Value
$ Transaction     num [1:1169]       1 2 3 3 3 4 5 6 6 6 ...
$ Time            POSIXct [1:1169]   format: "2021-08-26 11:24:00" "2021-08-26 11:40:00" "2021-08-26 11:54:00" "2021-08-26 11:54:00" ...
$ Item            chr [1:1169]       "French fries" "Pandan Milk Coffee" "Milk Coffee" "Milk Coffee" ...
$ Classification  chr [1:1169]       "Midday" "Midday" "Midday" "Midday" ...
$ Day             chr [1:1169]       "Weekday" "Weekday" "Weekday" "Weekday" ...

Analysis of exploratory data. Before data cleaning and computation, exploratory analysis gathers a sizable amount of unstructured data in order to determine its current values. Data exploration was performed to identify relationships between the objects, which requires software support. RStudio was the software we used to process the data. This step reports the dataframe's structure: the number of observations and variables, and the types and names of the variables. The structure of the dataset is displayed in Table 2: it has 1,169 rows of data and 5 variables, with POSIXct, character, and numeric data types.

Analysis of univariate data. Univariate data analysis and visualization use one variable at a time. Using time series and categorization, this descriptive visualization has two features: it shows the transaction frequency and the peak hours when customers visit the coffee shop. Figure 2 shows the data by date and classified by midday, evening, and nighttime. It demonstrates that, although daily counts vary, the distribution across dates is roughly uniform, and that customers generally purchase more in the evening and less at night. When the weekend and weekday classification is used, customers tend to visit more frequently on weekdays than on weekends (Figure 3). This makes sense because office workers and students are the coffee shop's target market.
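The two classification variables can be derived directly from the transaction timestamp. The sketch below is a hedged Python illustration rather than the authors' RStudio workflow: the cut-off hours are hypothetical (the paper does not state where midday ends or nighttime begins), and the last timestamp is invented to show a weekend case.

```python
from collections import Counter
from datetime import datetime

# Hypothetical cut-off hours; the paper does not give the actual boundaries,
# so adjust these to the shop's own definition.
MIDDAY_END_HOUR = 15
EVENING_END_HOUR = 19


def classify_time(ts: datetime) -> str:
    """Map a timestamp to one of the three time-of-day classes."""
    if ts.hour < MIDDAY_END_HOUR:
        return "Midday"
    if ts.hour < EVENING_END_HOUR:
        return "Evening"
    return "Nighttime"


def classify_day(ts: datetime) -> str:
    """Monday to Friday is a weekday; Saturday and Sunday are the weekend."""
    return "Weekend" if ts.weekday() >= 5 else "Weekday"


def parse_timestamp(text: str) -> datetime:
    """Parse the day/month/year format shown in the cashier export."""
    return datetime.strptime(text, "%d/%m/%Y %H:%M")


# Three timestamps in the export format plus one made-up weekend entry.
raw = ["26/08/2021 11:24", "26/08/2021 11:40", "26/08/2021 13:07", "28/08/2021 20:15"]
labels = [(classify_day(ts), classify_time(ts)) for ts in map(parse_timestamp, raw)]

# Cross-tabulate the two classifications, as in a bivariate count.
crosstab = Counter(labels)
```

Counting the (day, time) pairs in this way gives the kind of two-way tally that a weekday/weekend by time-of-day comparison needs; with real data the counts would come from all 1,169 rows.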
[Figure 2. Descriptive data visualization based on univariate data analysis. Panels: daily transaction count by date; transaction count by time classification (midday, evening, nighttime).]

[Figure 3. Distribution of transactions by day (weekday, weekend).]

Analysis of bivariate data. The outcome of this data exploration stage is a descriptive visualization of two variables. The purpose of this step was to determine how sales were distributed across weekdays and weekends at midday, in the evening, and at night (Figure 4). Figure 4 indicates that Coffee Shop XYZ has its biggest sales on weekday evenings.

[Figure 4. Bivariate analysis results for total transactions based on 2 classifications: midday, evening, and nighttime for weekdays and for weekends.]

Data preparation. Data preparation comes next after data exploration. This phase prepares the data for processing with the Apriori algorithm. It has two parts: data transformation and data cleaning. Data cleaning is done to find noisy and incomplete data. The outcome of the data cleaning inspection is shown in Table 3.

Table 3. Results of checking noisy data and incomplete data. Output of table(diksi_df$Item); the per-item counts and some hot/cold suffixes are missing from this copy, so item labels are listed as extracted.

Americano cold, Beef black paper, Caramel macchiato, Chicken pop sambal matah, French fries, Hazelnut latte hot, Palm milk coffee, Mango tea, Mochaccino cold, Naga lychee, Toast, Strawberry lychee, Milk jelly cold, Vanilla milk, Waffle, Americano hot, Aqua, Blue curacao, Blue curacao, Caramel macchiato, Chicken karaage, Chocolate cold, Churros, Garlic bread, Hazelnut chocolate, Hazelnut chocolate, Ice cream, Pandan milk coffee, Lemon tea, Matcha green tea cold, Matcha green tea cold, Mochaccino cold, Mochaccino hot, Naga banana, Onion ring, Canai bread, Sosis barbeque, Strawberry tea, Susu caramel, Lychee milk, Susu lychee, Taro, Thai tea, Grilled chicken, Caramel latte cold, Chicken katsu, Crispy chicken skin, Hazelnut latte, Banana pleasure, Caramel macchiato, Chicken pop sambal, Espresso, Hazelnut latte cold, Ice cream toast, Milk coffee, Lychee mojito, Matcha green tea hot, Machos, Oranio boba, Spaghetti aglio olio, Hazelnut milk, Rum milk, Vanilla latte cold, Lychee tea, Fried noodle, Naga lychee, Passion fruit, Spaghetti bolognaise, Milk coffee with jelly, Vanilla milk, Vanilla latte cold

Table 3 shows that no noisy data or missing values were found, so no cleaning was required. Not every variable is used in the association rule mining process. The extraction requires a transaction dataset containing customer IDs and the products bought in each transaction, so the raw data exported from the sales database system (Table 1) was converted into the format of Table 4.

Table 4. Dataset transformation from Table 1 into standard transaction data format. Columns: Transaction Number, Itemset. The Transaction Number values are missing from this copy; one row per transaction, in order.

Itemset
French fries
Pandan milk coffee
Milk coffee, milk coffee, toast
Chocolate cold
Beef black paper
Beef black paper

Modeling. Modeling is the next step. This stage describes the data processing with the Apriori algorithm and its outputs. The input is transaction data that has undergone feature selection, data cleaning, and data input. During modeling, the data was additionally split into weekend and weekday categories, each further subdivided into midday, evening, and nighttime. Every computation has a visual representation.
All transactions. Rules were first mined from the "all transactions" data frame, for which the minimum support and confidence were set. Modeling produced the support, confidence, top item frequency, and statistical visualization results. After a period of trial and error, the minimum support and confidence were set to 0.01 and 0.2, respectively. The results of mining frequent item sets and association rules for all transactions are shown in Figure 5. The top item frequency plot, shown in Figure 5 and Table 5, displays the ten most commonly ordered items across the whole transaction data frame. This information can inform current product sales and business decisions. Historical data indicates that palm milk coffee is the best-selling (most frequent) item. The product pair rules are also displayed in Figure 5 and Table 5. For example, the first rule states that a customer who purchases crispy chicken skin is also likely to purchase hazelnut chocolate cold, with a lift of 2.8, a support of 1%, and a confidence of 30%. A lift well above 1 indicates a positive association between the two items in the dataset.

[Figure 5. Results of extraction of association rules for all items. Item frequency plot; items shown include palm milk coffee, hazelnut chocolate, milk coffee, chocolate cold, French fries, pandan milk coffee, and oranio boba.]

Table 5. Matrix of rules extraction and its performance for all items. Output of an inspect() call on the rule set; the support, confidence, and coverage values are not recoverable from this copy. Rule fragments: {crispy chicken skin} => {... chocolate cold}; {chicken pop ...} => {passion fruit}; {chicken pop sambal matah}; {chicken pop ...} => {oranio boba}; beef black paper; {oranio boba} => {... chocolate cold}.

Weekdays and weekends. The pair extraction results for weekdays are shown in Table 6, which contains seven product combinations in total, whereas the weekend results contain four product combination rules. A product bundle offered on weekday evenings is the Chicken Pop Sambal Matah bundle with passion fruit beverages, as shown in Table 7.
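To make the mining step concrete, here is a compact pure-Python sketch of level-wise Apriori followed by rule generation. It is an illustration under stated assumptions, not the authors' RStudio code: the function names and the five toy baskets are hypothetical, and only the two thresholds (minimum support 0.01, minimum confidence 0.2) come from the text.

```python
from itertools import combinations

MIN_SUPPORT = 0.01     # threshold reported in the text
MIN_CONFIDENCE = 0.2   # threshold reported in the text


def apriori(baskets, min_support):
    """Return {itemset: support} for every frequent itemset (level-wise search)."""
    n = len(baskets)

    def keep_frequent(candidates):
        scored = {c: sum(1 for b in baskets if c <= b) / n for c in candidates}
        return {c: s for c, s in scored.items() if s >= min_support}

    level = keep_frequent({frozenset([item]) for b in baskets for item in b})
    frequent = dict(level)
    k = 2
    while level:
        prev = list(level)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, k - 1))}
        level = keep_frequent(candidates)
        frequent.update(level)
        k += 1
    return frequent


def generate_rules(frequent, min_confidence):
    """Return (lhs, rhs, support, confidence, lift) for rules that pass the threshold."""
    found = []
    for itemset, supp in frequent.items():
        for size in range(1, len(itemset)):
            for lhs_items in combinations(sorted(itemset), size):
                lhs = frozenset(lhs_items)
                rhs = itemset - lhs
                conf = supp / frequent[lhs]  # subsets of a frequent set are frequent
                if conf >= min_confidence:
                    found.append((lhs, rhs, supp, conf, conf / frequent[rhs]))
    return found


# Toy baskets, invented for illustration only (not the shop's data).
toy = [frozenset(b) for b in (
    {"crispy chicken skin", "hazelnut chocolate cold"},
    {"crispy chicken skin", "hazelnut chocolate cold", "french fries"},
    {"crispy chicken skin"},
    {"french fries", "milk coffee"},
    {"milk coffee"},
)]

frequent_sets = apriori(toy, MIN_SUPPORT)
rules = generate_rules(frequent_sets, MIN_CONFIDENCE)
```

Splitting the baskets by weekday/weekend and by time of day before calling `apriori` would give one rule set per segment, which is the shape of the per-segment tables that follow. On real data a maintained library is usually preferable to a hand-written search.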
Meanwhile, the spaghetti aglio olio bundle with passion fruit serves as an example for the weekend.

Table 6. Matrix of rules extraction and its performance for weekdays. Output of inspect(rules_weekday); the support, confidence, and coverage values are not recoverable from this copy. Rule fragments: {chicken pop sambal ...} => {passion fruit}; beef black paper; chicken pop sambal matah; spaghetti aglio olio; grilled chicken; crispy chicken skin.

Table 7. Extraction results for weekday evening. Output of inspect(rules_weekday_evening); numeric values are not recoverable from this copy. Rule fragments: ... => {passion fruit}; pandan milk ...; hazelnut ....

Tables 8 through 13 show the results for weekend and weekday midday, evening, and nighttime. For instance, a customer who purchases Chicken Pop Sambal Matah is offered passion fruit along with the chicken pop. This is an example of an offer for a Saturday afternoon transaction.

Table 8. Extraction results for the weekend at midday. Output of inspect(rules_weekend_midday); numeric values are not recoverable from this copy. Rule fragments: aqua, oranio boba; beef black paper, ice cream toast; French fries, ice cream ...; beef black paper, French fries.

Table 9. Extraction results for the weekend in the evening. Output of inspect(rules_weekend_evening), with columns support, confidence, coverage, lift, and count; numeric values are not recoverable from this copy. Rule fragments: beef black paper; palm milk coffee; spaghetti aglio olio; chicken katsu.

Table 10. Extraction results for the weekend at nighttime. Output of inspect(rules_weekend_nighttime); numeric values are not recoverable from this copy. Rule fragments: milk coffee; oranio boba.

Table 11. Bundle offering during transactions for weekdays at midday. Output of inspect(rules_weekday_midday); numeric values are not recoverable from this copy. Rule fragments: beef black paper, lychee ...; chicken karaage, lychee ....

Table 12. Bundle offering during transactions for weekdays at nighttime. Output of inspect(rules_weekday_nighttime); numeric values are not recoverable from this copy. Rule fragments: {chicken pop ...} => {oranio boba}; vanilla latte cold; hazelnut latte; caramel latte cold.

Table 13. Bundle offering during transactions for weekdays in the evening. Output of inspect(rules_weekday_evening); numeric values are not recoverable from this copy. Rule fragments: {spaghetti aglio olio} => {passion fruit}; pandan milk ...; crispy chicken ....

Validation, verification, and implementation. These rules are applied when creating bundles that offer customers product combinations at a lower price. After a week of testing, the implementation showed a 30% gain in sales. This exceeded the previous weeks, in which sales had been trending downward, with drops in the range of 5-10% and increases of only 1-5% in total product sales. For the product offering process, Coffee Shop XYZ introduced a loyalty card to make offers more customized for each customer.

CONCLUSION
The following conclusions address the research goals. First, applying knowledge extraction and association rule mining to the Coffee Shop XYZ transaction data shows that the frequent item sets and positively associated products follow rules that vary with time. Accordingly, this study offers bundle suggestions for midday, evening, and nighttime on weekdays and weekends. Second, the product bundling designs increased sales. Personalization through loyalty cards is needed to support the further extraction of association rules.
One limitation of this research is that the tailoring process was not based on customer classification, which would distinguish profitable from unprofitable customers based on loyalty card operations. For this reason, advanced analytics procedures such as cohort analysis, customer segmentation, and customer personalization are needed.

REFERENCES