Riska Indarwati / Int. Journal of Information Technology and Business. Vol. No.
IJITEB Vol. No.
International Journal of Information Technology and Business
http://ejournal. edu/ijiteb

Sales Data Analysis and Visualization for Distribution Optimization (Case Study: US Candy Distribution)

Riska Indarwati1*, Herti Yani2, Beny Beny3
Faculty of Computer Science, Information Systems, Universitas Dinamika Bangsa, Indonesia

Abstract: Optimization of US Candy's product distribution was conducted through sales data analysis and visualization. The main challenges identified include poorly located factories and high logistics costs. This research used a quantitative approach. The purpose of the analysis is to examine the distribution pattern. Data visualization was carried out in Power BI to present the results of the analysis, which show that New York is the highest-selling city with total sales of $10,945.89, with the leading product being Wonka Bar - Scrumdiddlyumptious ($2,…). The consistent increase in sales in the fourth quarter of each year, triggered by celebrations such as Halloween, special promotions, and holidays, can be used to develop future marketing and product inventory strategies. A distribution optimization strategy is recommended to minimize logistics costs and delivery time by moving distribution to the Wicked Choccy's factory, at a distance of 8237.2 km. Marketing strategies such as discounting best-selling products and bundling low-selling products are proposed to improve market competitiveness. The resulting interactive dashboard is expected to help the company make more accurate and efficient data-based decisions.

Keywords: Data Analysis, Data Visualization, Power BI, Distribution, Logistic Efficiency.

Introduction

1 Background

Efficient distribution of goods is a crucial aspect of the retail industry, especially in consumer product sectors such as candy.
US Candy is a candy distribution network in the United States that covers various regions with diverse needs and demands. Effective distribution management plays an important role in ensuring products are available at strategic locations, such as retail stores, supermarkets, and seasonal events, thereby increasing customer satisfaction as well as company profitability. In practice, distribution companies often face complex operational challenges. One of the main challenges is wide geographical coverage, which increases delivery times and logistics costs. In addition, imbalances in the placement of distribution facilities can hamper product availability in high-demand regions. To overcome these problems, delivery route optimization is an indispensable strategy for improving distribution efficiency. Route optimization involves determining the most effective delivery path based on factors such as distance, cost, traffic conditions, and vehicle capacity, with the aim of reducing travel time as well as operational costs.

On the other hand, advances in information technology have opened up new opportunities in supply chain management. The use of digital systems enables the integration of business flows, increases operational transparency, and supports more accurate data-based decision-making. However, many companies still rely on manual methods to process operational data, which hinders effective analysis of distribution patterns. Difficulties in identifying sales trends, evaluating product performance, and understanding consumer preferences often lead to suboptimal decision-making. Therefore, this research aims to design an interactive dashboard-based analysis system that is able to process information visually. The implementation of this dashboard is expected to help companies recognize optimal distribution patterns, determine priority areas based on demand levels, and develop more effective strategies for managing distribution and reducing logistics costs.
2 Literature Review

1 Previous Research

Previous research by Qodri and Alijoyo discussed Fast Fashion Supply Chain Analysis Using Exploratory Data Analysis with Power Business Intelligence. The challenge addressed was the need for in-depth understanding and comprehensive analysis of the fast fashion supply chain. The method used was Exploratory Data Analysis (EDA) in Power BI with a Knowledge Discovery in Database (KDD) approach, which yields analytical insight into the fast fashion supply chain.

Najib and Stefany produced data visualizations using Power BI, including sales trends, customer consumption patterns, best-selling product categories, and purchase patterns. Based on the data visualization, analysis results were obtained to increase operational efficiency, improve customer service quality, and optimize marketing. This research adds to the literature on the application of Business Intelligence (BI) technology in the retail sector and provides practical guidelines for improving competitiveness.

Tjahyono and Susilowati followed several steps in their research: data collection and interviews, data transformation with the ETL method, and report visualization. The output of the study takes the form of visualization charts in Power BI, such as total sales and distribution of goods over 3 years, the top 5 stores with the most purchases each year, and other outputs according to the needs of UD Anugerah Sejahtera Jaya.

Ratna and Dewi processed data in Excel with the Exploratory Data Analysis (EDA) method to identify sales trends and the products with the highest sales in each region. After analyzing and visualizing the data, they found that one effective strategy for increasing profits by 15% is to apply the right discount policy.

Sadewo et al.
conducted research on Visual Analysis of Job Type Distribution in Bandung City Using Power BI. The method used in this research is visual analysis, which combines visualization and data analysis techniques. Data visualization shows the occupational distribution of Bandung City residents through bar charts and pie charts, with the self-employed as the largest group.

2 Analysis

Analysis is the process of breaking data into smaller components based on certain elements and structures. The purpose of analysis is to examine data obtained from a given population in order to draw conclusions that are used to set policies or make decisions as a step toward solving a problem. Data-based decision making allows organizations to make more informed and effective decisions.

3 Optimization

Optimization is the determination of the level of resource or input use that reduces costs as far as possible and produces maximum profit or income. The purpose of optimization is to minimize the effort required or to maximize the desired results; to define the process, it is important to express it as a function of the decision variables. The benefits of optimization include cost savings, increased productivity, competitive advantage, and increased customer satisfaction.

4 Distribution

Distribution is the spread or availability of products in an area or sales territory, and it can be used as an indicator of sales potential in that area. There are several types of distribution, namely direct distribution, semi-direct distribution, and indirect distribution. The purpose of distribution is to ensure that the product reaches its destination on time, in good condition, and with efficient shipping costs.

5 Data Visualization

Data visualization is the process of transforming data that is generally in tabular form into visual images.
Visualization aims to communicate information clearly and effectively through graphical elements. Commonly used types of data visualization include bar charts, line charts, pie charts, histograms, box plots, maps, scatter plots, and heatmaps.

6 Power BI

Power BI is a Business Intelligence (BI) platform that offers tools to collect, analyze, visualize, and share data. Power BI is made to visualize data with three main characteristics, namely that the output (visual data) is attractive to look at, easy to understand, and easy to interpret. Power BI has several advantages: data sharing, real-time dashboards, and the ability to process source data that exceeds the capacity of other applications.

Research Method

1 Research Stages

The stages of the research are described as follows:

Fig 1. Research Framework

Problem Identification
The first stage of this research was problem identification, which aimed to understand the main challenges in US Candy's distribution. Analysis was conducted to identify inefficiencies, particularly in the selection of delivery routes, in order to design a more efficient and effective solution.

Data Preparation
Data preparation is a crucial step in analysis to ensure data quality and consistency. This stage consists of data understanding and data cleaning. Data understanding starts with the content of the data: identifying the data type of each variable, the number of rows and columns, and the purpose of further analysis. The data cleaning stage then starts by combining the US Zips data with the Candy Sales data, marking unnecessary data, and checking for duplicate or empty data. In the US Candy Distribution dataset, no duplicate data was found, but there was some empty data, only in the US Zips table, which was deleted because it was not needed in the analysis.
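The merge and cleaning steps described above can be sketched in plain Python. This is a minimal illustration, not the procedure actually run in the study: the field names (`postal_code`, `zip`, `lat`, `lng`) are hypothetical, and skipping sales rows that have no usable ZIP record is an assumption made here for simplicity.

```python
def merge_sales_with_zips(sales_rows, zip_rows):
    """Attach ZIP coordinates to each unique sales row (hypothetical field names)."""
    # Keep only ZIP rows with no empty fields; rows with empty data are dropped.
    zips = {
        row["zip"]: row
        for row in zip_rows
        if all(value not in (None, "") for value in row.values())
    }
    merged, seen = [], set()
    for row in sales_rows:
        key = tuple(sorted(row.items()))
        if key in seen:  # exact duplicate sales row, keep only the first copy
            continue
        seen.add(key)
        location = zips.get(row["postal_code"])
        if location is None:  # assumption: sales without a usable ZIP row are skipped
            continue
        merged.append({**row, "lat": location["lat"], "lng": location["lng"]})
    return merged


# Illustrative rows only; these are not values from the US Candy dataset.
sample_sales = [
    {"order_id": "A1", "postal_code": "10001", "sales": 12.5},
    {"order_id": "A1", "postal_code": "10001", "sales": 12.5},  # duplicate
    {"order_id": "A2", "postal_code": "90001", "sales": 7.0},
]
sample_zips = [
    {"zip": "10001", "lat": 40.75, "lng": -73.99},
    {"zip": "90001", "lat": None, "lng": -118.25},  # empty field, dropped
]
cleaned = merge_sales_with_zips(sample_sales, sample_zips)
print(cleaned)
```

Keying the ZIP table by postal code keeps the join a single dictionary lookup per sales row, which is enough for a dataset of this size.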
Next, Canada was removed from the country column and outliers were removed. With these steps, the data can be analyzed more accurately and reliably.

Data Analytics
Data analysis is the process of collecting, organizing, and analyzing data to gain insight, make predictions, and support decision making. This research focuses on improving the efficiency of distribution to the cities with the highest sales, with areas selected based on a contribution of 10% of annual sales. The distance between each factory and destination city was then calculated with the Vincenty method, using Python and the geopy library in Google Colab. The results of this calculation are used as the basis for the distribution analysis to identify the optimal strategy for improving delivery efficiency.

Data Visualization
Data visualization is the process of presenting data in graphical form, such as charts, diagrams, or maps, here using Power BI, to make patterns, trends, and relationships in the data easier to understand. With visualization, complex information can be simplified so that it is easier to analyze and use in decision making.

Reports
The final stage of the research was the production of a report that systematically organized the findings, analysis, and recommendations. Data visualization was used to support understanding, together with clear and easy-to-understand language.

Results and Discussion

1 Analysis Results

1 Sales and Products

Sales analysis was conducted to identify total sales in the 5 cities with the highest sales. The results of this analysis are presented in the following table:

Table 1. Highest Selling City
Columns: City, SUM Cost, SUM of Sales, SUM Gross Profit
Rows: New York City, Los Angeles, Philadelphia, San Francisco, Seattle

Based on the available data, New York City makes the largest contribution to the company's total sales, with sales of $10,945.89 and gross profit of $7,302.53, which represents 8.90% of total sales.
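The factory-to-city distance step described under Data Analytics can be illustrated with a short sketch. The study reports using the geopy library; the code below instead implements Vincenty's inverse formula directly on the WGS-84 ellipsoid so that it runs without third-party packages. The function name `vincenty_km` and the sample coordinates (approximate city centres) are assumptions for illustration, not values taken from the dataset.

```python
import math

# WGS-84 ellipsoid constants
A_AXIS = 6378137.0                  # semi-major axis in metres
FLATTENING = 1 / 298.257223563
B_AXIS = (1 - FLATTENING) * A_AXIS  # semi-minor axis in metres


def vincenty_km(lat1, lon1, lat2, lon2, max_iter=200, tol=1e-12):
    """Ellipsoidal distance in km between two points given in decimal degrees."""
    if (lat1, lon1) == (lat2, lon2):
        return 0.0
    f = FLATTENING
    # Reduced latitudes
    u1 = math.atan((1 - f) * math.tan(math.radians(lat1)))
    u2 = math.atan((1 - f) * math.tan(math.radians(lat2)))
    big_l = math.radians(lon2 - lon1)
    sin_u1, cos_u1 = math.sin(u1), math.cos(u1)
    sin_u2, cos_u2 = math.sin(u2), math.cos(u2)

    lam = big_l
    for _ in range(max_iter):
        sin_lam, cos_lam = math.sin(lam), math.cos(lam)
        sin_sigma = math.hypot(cos_u2 * sin_lam,
                               cos_u1 * sin_u2 - sin_u1 * cos_u2 * cos_lam)
        if sin_sigma == 0:
            return 0.0  # coincident points
        cos_sigma = sin_u1 * sin_u2 + cos_u1 * cos_u2 * cos_lam
        sigma = math.atan2(sin_sigma, cos_sigma)
        sin_alpha = cos_u1 * cos_u2 * sin_lam / sin_sigma
        cos_sq_alpha = 1 - sin_alpha ** 2
        # cos(2*sigma_m) is undefined for equatorial lines; use 0 in that case
        cos_2sm = (cos_sigma - 2 * sin_u1 * sin_u2 / cos_sq_alpha
                   if cos_sq_alpha else 0.0)
        c = f / 16 * cos_sq_alpha * (4 + f * (4 - 3 * cos_sq_alpha))
        lam_new = big_l + (1 - c) * f * sin_alpha * (
            sigma + c * sin_sigma * (
                cos_2sm + c * cos_sigma * (-1 + 2 * cos_2sm ** 2)))
        converged = abs(lam_new - lam) < tol
        lam = lam_new
        if converged:
            break
    else:
        # Can happen for nearly antipodal points
        raise ValueError("Vincenty iteration did not converge")

    u_sq = cos_sq_alpha * (A_AXIS ** 2 - B_AXIS ** 2) / B_AXIS ** 2
    big_a = 1 + u_sq / 16384 * (4096 + u_sq * (-768 + u_sq * (320 - 175 * u_sq)))
    big_b = u_sq / 1024 * (256 + u_sq * (-128 + u_sq * (74 - 47 * u_sq)))
    delta_sigma = big_b * sin_sigma * (
        cos_2sm + big_b / 4 * (
            cos_sigma * (-1 + 2 * cos_2sm ** 2)
            - big_b / 6 * cos_2sm * (-3 + 4 * sin_sigma ** 2)
            * (-3 + 4 * cos_2sm ** 2)))
    return B_AXIS * big_a * (sigma - delta_sigma) / 1000.0


# Approximate city-centre coordinates, used only as an illustration.
new_york = (40.7128, -74.0060)
los_angeles = (34.0522, -118.2437)
print(round(vincenty_km(*new_york, *los_angeles), 1), "km")
```

In a setting like the one described, the same function would be applied to each factory and destination-city pair and the results averaged per factory before loading them into the dashboard.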
The second position is occupied by Los Angeles, with total sales of $9,371.55 and gross profit of $6,269.…, equivalent to 7.62% of total sales. Philadelphia, San Francisco, and Seattle also made significant contributions, each accounting for more than 4% of total sales. Overall, these five cities contributed 91% to the company's total sales. This finding indicates that optimizing distribution in these cities has the potential to significantly improve logistics efficiency.

Next, total sales were analyzed by the number of products sold in the top 3 cities, as presented in the following table:

Table 2. Number of Products Sold in Top 3 Cities
Columns: City, Product Name, SUM of Units
New York City: Wonka Bar - Scrumdiddlyumptious, Wonka Bar - Milk Chocolate, Wonka Bar - Fudge Mallows, Wonka Bar - Nutty Crunch Surprise, Wonka Bar - Triple Dazzle Caramel, Kazookles, Nerds, SweeTARTS, Hair Toffee, Fun Dip, Fizzy Lifting Drinks, Laffy Taffy, Lickable Wallpaper, New York City Total
Los Angeles: Wonka Bar - Milk Chocolate, Wonka Bar - Triple Dazzle Caramel, Wonka Bar - Fudge Mallows, Wonka Bar - Scrumdiddlyumptious, Wonka Bar - Nutty Crunch Surprise, Wonka Gum, Kazookles, Hair Toffee, Los Angeles Total
Philadelphia: Wonka Bar - Fudge Mallows, Wonka Bar - Scrumdiddlyumptious, Wonka Bar - Milk Chocolate, Wonka Bar - Triple Dazzle Caramel, Wonka Bar - Nutty Crunch Surprise, Wonka Gum, Kazookles, Fizzy Lifting Drinks, Laffy Taffy, Lickable Wallpaper, Philadelphia Total
Grand Total

The analysis shows that product preferences differ from city to city. Wonka Bar - Scrumdiddlyumptious has the highest demand in New York City, while Wonka Bar - Milk Chocolate is more dominant in Los Angeles. Philadelphia showed higher interest in Wonka Bar - Fudge Mallows. In addition, some products, such as Nerds, Fun Dip, and Lickable Wallpaper, recorded very low sales figures, selling fewer than 10 units each. This suggests the need for special marketing strategies, such as bundling with popular products or offering discounts, to increase the appeal of these products.

2 Factory Distance to Destination City

The distance between factories and destination cities is an important factor in analyzing distribution efficiency and logistics costs. The results of the analysis are shown in Table 3, which gives the distance between each factory location and the five cities with the highest sales contribution.

Table 3. Factory Distance to Destination City
Columns: Factory, City, AVERAGE of Distance Vincenty, SUM Units, SUM Cost
Rows: grouped by factory (Wicked Choccy's, Sugar Shack, Lot's O' Nuts) and destination city (Los Angeles, New York City, Philadelphia, San Francisco, Seattle), with a total per factory
Grand Total: 18,989.… 3,78…

The distance analysis shows that Lot's O' Nuts is the factory with the largest distribution volume (…,140 units), despite having the farthest average distance of 10,932 km. In contrast, Sugar Shack is the factory with the closest distance (… km on average), but it distributes only 35 units. This suggests that factories with large production capacity, such as Lot's O' Nuts, remain a mainstay despite their high distribution costs, while closer factories need to be optimized to support cost efficiency.

3 Shipping Mode

To determine which shipping mode is the most efficient, an analysis was conducted based on the data presented in the following table:

Table 4. Shipping Mode Analysis
Columns: City, Ship Mode, COUNTA of Ship Mode, SUM Cost, AVERAGE of Cost
Rows: New York City, Los Angeles, Philadelphia, San Francisco, and Seattle, each broken down by Standard Class, Second Class, First Class, and Same Day, with a total per city

The analysis of shipping mode data indicates that Standard Class is the most efficient method for the majority of shipments, with a low average shipping cost ($4.14 to $4.…). Meanwhile, the Same Day mode has the highest average cost ($4.…) but is used selectively for shipments with tight time requirements. Optimizing the use of Standard Class can help the company reduce total operational costs.

2 Data Visualization

The processed and analyzed data is presented in an interactive dashboard that facilitates monitoring, evaluation, and decision-making regarding product distribution. The visualization results can be seen in the following figure:

Fig 2. US Candy Sales Dashboard

The US Candy Sales Dashboard was created using Power BI. It provides visual and comprehensive information that enables users to identify sales trends, total sales by city, total sales of the top five best-selling products, total sales by division, and the average distance and distribution factories. This allows for easier understanding and analysis.

Conclusion

Based on the results of the analysis, New York City has the highest sales, at $10,945.89, which indicates high customer interest in the product. Therefore, efforts to increase sales need to be directed to the other five cities with lower sales. Wonka Bar - Scrumdiddlyumptious is the highest-selling product, so a more intensive promotion strategy can further increase overall sales. In addition, the consistent trend of increasing sales in the fourth quarter of each year, driven by celebrations such as Halloween, special promotions, and holidays, can be leveraged when planning future marketing strategy and product inventory.
As a recommendation, optimization of distribution and marketing strategies plays an important role for Wonka Bar - Scrumdiddlyumptious products. More efficient distribution, such as allocating production to the Wicked Choccy's plant closer to New York City, could maximize the market potential in the region. In addition, moving distribution to a plant closer to Seattle could also boost sales in the city with the lowest current sales. Marketing strategies that offer discounts on the best-selling products in each city and bundle products with low sales are proposed as ways to increase consumer awareness and expand market reach.

References