COGITO Smart Journal, Vol. , No. , December 2025. P-ISSN: 2541-2221. E-ISSN: 2477-8079

Predictive Maintenance of Heavy Equipment Machines using Neural Network Based on Operational Data

Ahya Radiatul Kamila*1, Gerry Hudera Derhass2, Johanes Fernandes Andry3, Francka Sakti Lee4, Very Budiyanto5, Velly Anatasia6

1,3,4,5,6 Universitas Bunda Mulia, Jl. Lodan Raya No. 2, Ancol, Jakarta Utara 14430
Program Studi Sains Data, Universitas Bunda Mulia, Jakarta
2 Kurnia Prima Nastari Corp
Program Studi Sistem Informasi, Universitas Bunda Mulia, Jakarta
e-mail: *1ahyaradiatul@gmail.com, 2gerryhudera@yahoo.com, 3janry@bundamulia.id, 4flee@bundamulia.id, 5vbudiyanto@bundamulia.id, 6vanatasia@bundamulia.id

Abstract

Preventive maintenance is a routine maintenance strategy that aims to maximize the equipment life cycle and prevent unplanned downtime, which increases repair costs. When carrying out this maintenance, errors in selecting machines need to be anticipated to avoid company losses. This research aims to reduce human error in machine selection for preventive maintenance using deep learning. The dataset used in this research is operational data of heavy equipment machines from a palm oil company in Indonesia, with 9 independent features and 1 dependent feature. The dependent feature is a target feature containing two target classes representing effective and ineffective machines. The dataset in this study contains outliers, features with very different scales, and imbalanced data classes. To handle outliers and standardize the data scale, the Z-score method is used, while an oversampling method is used to handle the imbalanced data classes. To obtain the best model performance, the number of epochs and two types of neural network optimizers (Adam and Adamax) are compared. In selecting the number of epochs, experiments were carried out using 100 epochs. This research found a linear relationship between the number of epochs and accuracy, with accuracy values using the Adam and Adamax optimizers of 94.82% and 93.11%, respectively, at the 100th epoch.

Keywords: Artificial neural network, Oversampling method, Preventive maintenance, Z-score

INTRODUCTION

The plantation industry engages in planting specific crops in growing media, such as soil, in a suitable ecosystem. In addition, the plantation sector also carries out downstream activities, including processing raw agricultural products into finished or semi-finished goods and marketing them to both domestic and international markets. In doing so, the industry uses science, technology, capital, and management to reduce costs and increase company revenue. The plantation industry, especially palm oil, is one of the industries that contributes to Indonesia's economic growth. This is reflected in palm oil production capacity, which increased by more than 300% from 1990 to 2017; Indonesia has been the world's largest palm oil producer since 2008. This sector is a source of economic welfare for the population, because the palm oil industry is labor-intensive with large-scale production. This creates a need for heavy machinery to ensure that operations can be carried out effectively and efficiently. High productivity and human safety factors are also foundations for the use of heavy machinery in the palm oil industry. Therefore, the performance of heavy equipment plays a crucial role in the palm oil industry.
One way to maintain the performance of heavy equipment is to implement a predictive maintenance strategy. Predictive maintenance involves utilizing historical and real-time data to predict potential failures and identify machines that require maintenance before damage or failure occurs. Unlike traditional preventive maintenance, which is performed even on machines that are still in good condition, predictive maintenance focuses on accurately forecasting machine conditions to minimize unplanned downtime and reduce maintenance costs. The goal of predictive maintenance is to ensure that maintenance activities are targeted and efficient, preventing unexpected failures while optimizing resources. This approach leverages data-driven models and advanced technologies, such as machine learning and deep learning, to analyze patterns, detect anomalies, and predict failures. By distinguishing machines that are likely to fail from those that are functioning normally, predictive maintenance ensures that interventions are carried out only when necessary, improving overall maintenance effectiveness.

Moreover, predictive maintenance addresses several challenges commonly faced in traditional maintenance methods, such as time constraints, limited human resources, inadequate testing equipment, and insufficient maintenance records. Relying on human inspections alone often leads to errors due to superficial assessments and issues that are undetectable without specialized tools. Predictive maintenance overcomes these limitations by providing an automated and accurate system for classifying machine conditions, enabling timely decision-making and reducing operational risks.

Various studies have been conducted to design systems for effective machine failure prediction. For instance, predictive maintenance using the Internet of Things (IoT) and machine learning has been explored. Machine learning algorithms, such as K-Nearest Neighbor and Support Vector Machine, have been compared for predictive maintenance classification. Other research has utilized an autoencoder in which both the encoder and the decoder consist of two LSTM layers. Machine learning and deep learning algorithms have also been used to predict losses in predictive maintenance, with data obtained from a gearbox fault diagnosis simulator and a machinery fault simulator. One study utilized an unsupervised learning method with a deep learning algorithm to identify faults accurately. Apart from that, a hybrid deep learning approach has been proposed that combines a convolutional neural network, long short-term memory, and an attention technique.

The main objective of this research is to classify machines into two groups: effective machines and ineffective machines. This research contributes to the optimization of predictive maintenance methods. It employs an approach that maps data into distinct classes, utilizing a technique that enables devices to simulate human-like thinking through artificial intelligence. The dataset used in this research was obtained from an oil palm plantation company and consists of both independent and dependent variables based on operational data. The dependent variable represents the condition of heavy equipment at a given time, categorized into two classes: damaged and non-damaged machines. However, the dataset is imbalanced, with fewer instances in the minority class. This imbalance can lead to detection errors, as the machine learning model receives less training on the minority class.
To address this issue, the Synthetic Minority Oversampling Technique (SMOTE) was applied to balance the data. For data processing, the Artificial Neural Network (ANN) algorithm was utilized with hyperparameter tuning. To determine optimal hyperparameter values, this research compared the performance of the model across different numbers of epochs and two optimizers: Adam and Adamax.

RESEARCH METHODS

In this research, the methodology is divided into two main parts. The first part discusses the stages of the research design and explains how the key points of the research were formulated. The second part discusses the stages of processing the data to produce good model performance.

Research Design

The sources used in this stage come from experts who work in the oil palm plantation sector and from related literature. The research design process includes several key steps. First, the interview stage involves obtaining information by collecting data related to the research problem. In addition, an interview with an employee working in the plantation industry, specifically in the area of mechanization cost control and financial planning, was conducted to help the researchers gain a deeper understanding of the research objectives, targets, and contributions. The second stage is the literature study, which involves reading, citing, and analyzing relevant theories and concepts that support the research field. Finally, the research design stage focuses on selecting the appropriate research methods, data collection techniques, data analysis approaches, and the tools that will be utilized throughout the research process.

Data Processing

The next section of the research methodology discusses the stages of data processing, shown in Figure 1. In processing the data, the authors used several stages to produce good model performance. This was done by selecting data processing stages that are relevant to the dataset used.

Figure 1. Block diagram of data preprocessing and modeling stages

The block diagram represents a typical machine learning workflow. It begins with the dataset input, where data is collected and loaded for analysis. Next, exploratory data analysis (EDA) is conducted to understand the characteristics of the data, identify patterns, and detect anomalies. Following EDA, data preprocessing steps are performed to prepare the data for modeling. These include data cleaning to handle missing or erroneous values, feature scaling to standardize the range of numeric features, data partitioning to split the dataset into training and testing sets, and class balancing to address class imbalance in the classification problem. After preprocessing, the modeling stage begins, where an artificial neural network (ANN) is chosen as the model. During this phase, hyperparameter tuning is performed to optimize the performance of the neural network by adjusting parameters such as the learning rate, the number of layers, and the number of nodes. Once the model is trained, it is evaluated using appropriate evaluation metrics, and the results are presented, summarizing the model's performance and its ability to make accurate predictions.

Dataset

The dataset was provided by one of Indonesia's palm oil corporations and consists of 124,493 rows, each representing a heavy machinery unit, with 9 independent features and 1 dependent feature.
The data used in this study were obtained from operational activity records in the workshop of a palm oil company in Indonesia, which has plantations in Kalimantan. These operational data encompass various aspects of machinery performance and usage in the field. Every activity involving the machinery, whether it pertains to maintenance schedules, repair actions, or daily operational logs, is manually recorded by technicians or equipment operators in the workshop. This workshop is fully responsible for ensuring the functionality, repair, and preventive maintenance of all heavy equipment used in plantation operations. As a result, the dataset reflects real-world operational conditions, offering valuable insights into the machinery's health and effectiveness over time.

Table 1. Dataset Features

Feature              Explanation of the Features
Breakdown day        The number of days the unit breaks down
Standby day          The number of days the unit is on standby (not broken down)
Cost maintenance     Unit repair costs
Kilometer/days       Unit's kilometers per day
Times service        Number of the unit's routine services
Weekly check         Weekly checks of the unit's condition
Daily check          Daily checks of the unit's condition
Oil leak             Total oil leaks
Machine Efficiency   Target (dependent feature)

An overview of the key features recorded and used in this study is presented in Table 1. In the workshop, each time a machine is used or encounters an issue, the operators and technicians make detailed records of the machine's condition, including Breakdown day (the number of days the machine experiences failure or downtime), Standby day (the number of days the machine is ready but not in operation), and Cost maintenance (the maintenance cost incurred). Additionally, data regarding Kilometer/days (the distance or operational duration of the machine), Times service (the frequency of machine servicing), Weekly check and Daily check (routine weekly and daily inspections), as well as Oil leak (detected oil leaks), are recorded every time maintenance is performed. The data collected in the workshop are then entered into the company's management system or internal database, which can be accessed by the analytics team. This manual recording process ensures that the information gathered is based on direct observation of the machine's condition, providing an accurate picture of how operational and maintenance conditions can affect its efficiency. The data are used for predictive analysis, helping the company plan maintenance and improve operational efficiency in the plantation.

Exploratory Data Analysis

In this stage, the data is analyzed to obtain insights and a deeper understanding using visualization. This stage is also carried out to identify defects in the data by examining relevance, distribution, outliers, and anomalies. The use of data visualization in the exploratory data analysis stage also helps to recognize data patterns and makes analysis easier. The analysis carried out at this stage is used as a reference for the next stage, the data preprocessing stage. The data preprocessing stage plays a major role in determining the performance of the final model, which is the main reason for carrying out the data analysis stage. The initial step in exploratory data analysis is analyzing the completeness of the data. In this research, no features were found with NaN values; in other words, all features had complete data.
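As an illustration, a minimal sketch of this loading and completeness check is shown below, assuming the dataset is available as a CSV file (the file name is hypothetical; the column names follow Table 1):

```python
import pandas as pd

# Hypothetical file name; columns follow Table 1
df = pd.read_csv("heavy_equipment_operational_data.csv")

print(df.shape)          # expected: (124493, 10), per the dataset description
print(df.isna().sum())   # completeness check: the study reports no NaN values
print(df["Machine Efficiency"].value_counts())  # target class distribution
```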
After analyzing the completeness of the data, we carried out an outlier analysis, which determines whether any features contain values that deviate greatly from the others. This is done to prevent a deterioration in model performance caused by outlier data. In this research, outlier analysis was carried out using boxplot visualization, as shown in Figure 2.

Figure 2. Boxplot

Figure 2 presents boxplots of the features, focusing on identifying outliers in the dataset. Outliers are typically managed as follows after identification. First, outliers are detected using boxplots, where any data point lying outside the "whiskers" (1.5 times the interquartile range above the third quartile or below the first quartile) is considered an outlier. Once identified, it is crucial to examine whether these outliers are legitimate data points or the result of data entry errors. For instance, if Breakdown day shows an outlier of 50 days, it could either be a legitimate long-term breakdown event or a data recording mistake. Understanding the context behind the data is essential before deciding how to proceed. If an outlier is found to be an error or irrelevant, it may be removed entirely from the dataset. Alternatively, outliers can be capped to a maximum or minimum threshold based on domain knowledge. For example, if Kilometer/days shows unusually high values due to operational constraints, it could be capped at a more reasonable value. In some cases, transformations such as logarithmic adjustments are applied to reduce the impact of outliers and normalize the data, particularly for features like Total km or Times service, where values can vary significantly. By managing outliers in a thoughtful and data-driven manner, the analysis remains robust, ensuring that predictions and insights derived from the data are more accurate and reliable.

The next step in this stage is data scale analysis, carried out to ensure that the data scale for each feature falls in the same range of values. This is done because some machine learning algorithms are sensitive to differences in data scale. If the features vary greatly in scale, the algorithm will assign higher weights to features with larger values, which affects the learning process and causes a deterioration in model performance.

The final step in the exploratory data analysis stage is the data class imbalance assessment, which greatly influences machine learning performance. The model will learn more from the class with more samples (majority class) and learn less from the class with fewer samples (minority class). If this happens, there is a tendency to ignore the minority class. As a result, machine learning algorithms will tend to choose the class with the larger amount of data, which causes bias. In this research, the data class imbalance assessment was carried out using data visualization. The amount of data for the efficient heavy equipment class is much greater than that for the inefficient class, with 124,388 records in the efficient machine class and only 106 records in the inefficient machine class.
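A minimal sketch of these EDA checks, assuming the pandas DataFrame df from the loading step above (the column names are taken from Table 1):

```python
import matplotlib.pyplot as plt

# Boxplots to visually inspect outliers across the independent features
df.drop(columns=["Machine Efficiency"]).boxplot(rot=45)
plt.tight_layout()
plt.show()

# 1.5 * IQR (boxplot whisker) rule to count potential outliers per feature
for col in df.columns.drop("Machine Efficiency"):
    q1, q3 = df[col].quantile([0.25, 0.75])
    iqr = q3 - q1
    mask = (df[col] < q1 - 1.5 * iqr) | (df[col] > q3 + 1.5 * iqr)
    print(f"{col}: {mask.sum()} potential outliers")

# Class imbalance check (the study reports 124,388 vs 106 records)
print(df["Machine Efficiency"].value_counts())
```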
Preprocessing

The next stage in data processing is data preprocessing, the data preparation stage before the data is trained by machine learning algorithms, which involves cleaning and transforming the data. In this stage, steps are taken to improve the quality of the raw data. Poor data quality can be caused by several factors, including incomplete data, irrelevant data, imbalanced data classes, and so on. Such data conditions trigger poor model performance, caused by bias, misinterpretation, the sensitivity of machine learning algorithms, and so on. Therefore, a data preprocessing stage was carried out with steps to resolve each problem present in this research.

Data Cleaning: Inappropriate data can lead to poor model performance. Therefore, a data cleaning stage was carried out in this research to improve model performance. Data cleaning steps usually include handling empty data, irrelevant data, inappropriate data formats, and outlier data, applied according to the problems found in the dataset. The dataset used in this research has complete and relevant features, but it contains outliers.

Feature Scaling: The process of standardizing the scale of the data for each feature, achieved by adjusting the value range of each feature in the dataset. The main purpose of feature scaling is to prevent the dominant influence of features with a larger value range on the learning process. Several methods can be used to standardize data ranges; this research uses the z-score method, which standardizes the data scale by subtracting the mean from each value and then dividing by the standard deviation. It is suitable for datasets containing outliers. Equation (1) is used to calculate the z-score, which measures how many standard deviations a data point $x$ lies from the mean $\mu$ of the dataset:

$$Z = \frac{x - \mu}{\sigma} \tag{1}$$

Where:
$Z$ = z-score
$x$ = data point
$\mu$ = mean
$\sigma$ = standard deviation

Data Partitioning: The step of separating the data into training and test sets. This is done so that the measured performance of the machine learning model is reliable, since the model is tested on data it has never seen before. In this research, the data was divided into 70% training data and 30% test data. This 70-30 split was chosen because it provides a balanced approach, allowing sufficient data to train the model while still leaving a substantial portion for testing its performance. This ratio is often preferred when a moderate amount of data is available, as it provides a good balance between training accuracy and testing reliability. While common alternatives, such as the 80-20 split, also work well, the 70-30 split is generally considered more conservative and helps avoid overfitting, ensuring that the model has enough data for both training and validation. Furthermore, it helps ensure that the model is tested on a diverse set of examples, leading to a more robust evaluation of its generalization ability.
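A minimal sketch of the scaling and partitioning steps using scikit-learn, assuming the DataFrame df from above (the stratified split and random_state value are illustrative assumptions, not stated in the study):

```python
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X = df.drop(columns=["Machine Efficiency"])
y = df["Machine Efficiency"]

# 70/30 split; stratify keeps both classes present in train and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)

# Z-score standardization; fitting on the training split only avoids leakage
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
```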
Data Class Balancing: In classification, imbalanced data classes can cause the machine learning algorithm to learn more from the class with the larger amount of data. This causes errors in the learning process and degrades model performance. In this research, class balancing was performed with an oversampling method, SMOTE (Synthetic Minority Oversampling Technique). This method balances the data classes by adding synthetic samples to the minority class until it matches the size of the majority class. In adding data to the minority class, this method groups data based on nearest neighbors, calculated using the Euclidean distance, as in equation (2):

$$x_{new} = x_i + (x_{knn} - x_i) \times \delta, \quad i = 1, 2, \ldots, n \tag{2}$$

Where:
$x_{new}$ : synthesized data
$x_i$ : data to be synthesized
$x_{knn}$ : data with the closest distance to the data to be synthesized
$\delta$ : random number (between 0 and 1)
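A minimal sketch of this oversampling step using the imbalanced-learn library, applied here to the training split only (a common practice; the study does not state whether SMOTE was applied before or after partitioning):

```python
import pandas as pd
from imblearn.over_sampling import SMOTE

# Oversample only the training data so the test set keeps its real distribution
smote = SMOTE(random_state=42)
X_train_bal, y_train_bal = smote.fit_resample(X_train, y_train)

print(pd.Series(y_train_bal).value_counts())  # both classes are now balanced
```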
Modeling

In machine learning, modeling refers to the process of building a mathematical equation to represent a phenomenon based on the data provided. The model is then used to carry out prediction, classification, optimization, and so on. The selection of the machine learning algorithm for model building depends on the objectives and the dataset used; choosing the wrong algorithm can lead to poor model performance. In this research, an artificial neural network algorithm was used to classify the condition of heavy equipment machines. Artificial neural networks are part of deep learning, and their way of working is inspired by the workings of the human brain. The brain is an organ consisting of a collection of interconnected nerves that can learn from examples. Likewise, an artificial neural network can learn from the example data provided and has interconnected 'neurons' represented by mathematical calculations.

Figure 3. Neural network architecture

The human brain nerves that inspired this algorithm are interconnected sending elements called 'neurons'. Neurons in artificial neural networks are interconnected elements whose function is to transmit information through synapses (electrochemical connections), which are then used to solve problems. The connections between neurons form layers. Each connection has its own weight, which represents the influence or strength of the connection between neurons. In artificial neural network algorithms, there are elements called 'parameters' and 'hyperparameters'. Parameters are variable configurations used to influence and control the network's behavior, while hyperparameters are configurations that are usually determined before the training process takes place.

The learning process in artificial neural networks involves two main stages: feedforward and backpropagation. In the feedforward stage, data flows through the network, starting from the input layer, passing through the hidden layers, and finally reaching the output layer. During this process, each neuron performs mathematical operations by multiplying the inputs by the corresponding weights, adding biases, and applying an activation function to generate an output. This forward pass predicts an output based on the current parameter settings. However, the predicted output is rarely perfect in the initial stages. To address this, the backpropagation process is utilized to minimize errors. Backpropagation adjusts the weights and biases of the network based on the difference between the predicted output and the actual target, as measured by a loss function. This adjustment is achieved by propagating the error backward through the network, layer by layer, using the gradient of the loss function with respect to each weight. Optimizers are then used to update the parameters to minimize the loss.

In this research, a comparison was made of two hyperparameters of the artificial neural network algorithm: the number of epochs and the type of optimizer. In deep learning, an epoch is one full iteration over the training dataset. In each epoch, the parameters (weights and biases) are updated by the algorithm. The optimal number of epochs required to produce good model performance depends on the model's complexity, the dataset size, and the type of task the model is performing. Too few epochs can lead to poor model performance, because the model does not have enough learning iterations to adjust the weight and bias parameters. On the other hand, too many epochs can lead the model to adapt to the 'noise' contained in the training data, which leads to overfitting. This comparison was carried out to determine the number of epochs most suitable for the dataset used, so that neither an underfitting nor an overfitting model is produced.

Meanwhile, the optimizer hyperparameter is an algorithm used to adjust the parameter values (weights and biases). The optimizer helps the model learn from the training data by adjusting the weight and bias values with the aim of reducing the loss function. The choice of optimizer type depends on the research case and plays an important role in the model's performance. There are several types of optimizers, two of which are Adam and Adamax; the performance of these two optimizers is compared in this research.

Adaptive Moment Estimation (Adam) Optimizer

The Adam optimizer applies adaptive learning rate calculations to each parameter, computing individual adaptive learning rates for different parameters from estimates of the first and second moments of the gradient. This algorithm is a combination of two algorithms: Adaptive Gradient (Adagrad) and the Root Mean Square Propagation algorithm (RMSProp). The parameter update is given by equation (3):

$$\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{\hat{v}_t} + \epsilon}\,\hat{m}_t, \qquad \hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^t} \tag{3}$$

Where:
$\hat{m}_t$ : bias-corrected exponentially decaying average of previous gradients
$\hat{v}_t$ : bias-corrected exponentially decaying average of the squares of previous gradients

Adamax Optimizer

The basic principle of this algorithm is to adapt the learning rate during training to improve model convergence. This algorithm has the advantage of dealing with varied gradient magnitudes; another advantage of Adamax lies in how it adjusts the learning rate during training. The difference between the Adam and Adamax algorithms lies in Adamax's use of the infinity norm of the moving average of the gradients, as in equation (4):

$$\theta_{t+1} = \theta_t - \frac{\eta}{u_t}\,\hat{m}_t \tag{4}$$

Where:
$u_t$ : infinity norm of the exponentially weighted gradients
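A minimal sketch of the training setup in Keras, assuming the balanced training data from the SMOTE step above; the network architecture (layer sizes and activations) is an illustrative assumption, as the study does not specify it:

```python
import tensorflow as tf

def build_model(optimizer: str) -> tf.keras.Model:
    # Hypothetical architecture; the study does not report layer sizes
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(9,)),               # 9 independent features
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # binary output
    ])
    model.compile(optimizer=optimizer,
                  loss="binary_crossentropy",  # loss function used in the study
                  metrics=["accuracy"])
    return model

# Train for 100 epochs with each optimizer, as in the study's experiment
for opt in ["adam", "adamax"]:
    model = build_model(opt)
    history = model.fit(X_train_bal, y_train_bal, epochs=100,
                        validation_data=(X_test, y_test), verbose=0)
    print(opt, "final test accuracy:", history.history["val_accuracy"][-1])
```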
RESULTS AND DISCUSSIONS

The final stage of this research is the evaluation stage, which aims to test and measure the results of the classification carried out by the machine. In evaluating the results, this research used an accuracy metric and a loss function. This aims to determine the model's ability to predict data classes correctly. Equation (5) is the formula for calculating accuracy:

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN} \times 100\% \tag{5}$$

Where:
TP : true positive
TN : true negative
FP : false positive
FN : false negative

In the artificial neural network algorithm, the loss function is a method for evaluating the model's ability to predict data correctly. This function calculates the error in each training result; the loss value represents how far the algorithm's prediction results are from the actual values. In artificial neural networks, there are various types of loss functions, and the selection depends on the research case; for example, regression and classification cases use different loss functions. In this case, the loss function used was binary cross entropy, commonly known as the logarithmic loss function, since binary cross entropy is a loss function used in classification cases with two class categories.

Experimental Results using Adam and Adamax Optimizers with 100 Epochs

Table 2 shows the results of the loss function calculation using binary cross entropy and the accuracy at every multiple of 10 epochs. Figure 4 is a graph that represents the overall results of the loss function calculation and the accuracy values over 100 epochs.

Table 2. Loss and Accuracy Values using the Optimizers at Epochs 1-100

        Adam Optimizer        Adamax Optimizer
Epoch   Loss     Accuracy     Loss     Accuracy

Figure 4. Comparison graph of loss and accuracy using the Adam and Adamax optimizers

Lower loss values indicate better model performance, demonstrating convergence. This is because in the loss function calculation, which in this research uses binary cross entropy loss, a higher loss value indicates that the prediction results produced by the model are farther from the actual values. Equation (6) represents the calculation of binary cross entropy as a loss function:

$$Loss = -\frac{1}{N}\sum_{i=1}^{N}\left[y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)\right] \tag{6}$$

Where:
$y_i$ = actual value
$\hat{y}_i$ = prediction result value

Equation (6) has two parts. The first part, equation (7), is a calculation that indicates errors in classifying values of the positive class. In equation (7), the loss contribution is obtained by multiplying the actual value by the logarithm of the machine's predicted value. A smaller loss value is obtained when the prediction is closer to the positive class (in this case, closer to the data class with binary value 1). On the other hand, if the prediction result is close to 0 (negative class), the resulting loss value will be greater. This indicates the accuracy of predictions in the positive class.

$$-y_i \log(\hat{y}_i) \tag{7}$$

In contrast to equation (7), equation (8) is a loss calculation that focuses on the negative class of the dataset. In equation (8), a smaller loss value is obtained when the machine's prediction gets closer to the negative class (in this case, closer to the data class with binary value 0). Conversely, the greater the loss value, the closer the prediction is to 1 (positive class). In this way, the calculation indicates the model's ability to predict the negative class.

$$-(1 - y_i)\log(1 - \hat{y}_i) \tag{8}$$
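As a worked illustration of equations (6)-(8), a short sketch computing binary cross entropy on a few hypothetical labels and predictions:

```python
import numpy as np

def binary_cross_entropy(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    # Clip predictions away from 0 and 1 to avoid log(0)
    y_pred = np.clip(y_pred, 1e-7, 1 - 1e-7)
    pos = -y_true * np.log(y_pred)            # equation (7): positive-class term
    neg = -(1 - y_true) * np.log(1 - y_pred)  # equation (8): negative-class term
    return float(np.mean(pos + neg))          # equation (6): averaged total loss

# Hypothetical values: confident, correct predictions yield a low loss
y_true = np.array([1, 1, 0, 0])
y_pred = np.array([0.95, 0.80, 0.10, 0.30])
print(binary_cross_entropy(y_true, y_pred))   # ~0.184
```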
Apart from the loss calculations, accuracy calculations are also used in the model evaluation in this research, because accuracy indicates the model's ability to classify data into the correct class groups. It can be seen in Figure 4 that the accuracy value is still rising at 100 epochs, which indicates that the machine is still learning well at that epoch. From the training results using the deep learning algorithm, it is evident that the number of epochs is inversely proportional to the loss values. However, this is not the case for the accuracy value, which shows a directly proportional relationship to the number of epochs. Table 2 illustrates that in the initial epochs, the loss value is high and the accuracy value is low. This is because in the first iterations, the model has not yet effectively learned the data, indicating that the number of epochs is insufficient for the model to make accurate predictions.

Figure 4 also shows the performance of the model with the Adam and Adamax optimizers. The comparative analysis between these optimizers yielded interesting results: in this case, the Adam optimizer outperformed Adamax with a higher accuracy rate. These results suggest that the Adam optimizer is more suitable for the machine efficiency classification problem in this dataset. The reasons can vary, but one of the main factors may be related to Adam's ability to adaptively adjust the learning rate based on the first and second moments of the gradient. Thus, Adam can be more efficient in finding optimal parameters to minimize the loss function.

CONCLUSION

An approach using deep learning algorithms aimed at improving model performance by comparing optimizer and epoch hyperparameters is proposed in this study. The evaluation of model performance comparing the Adam and Adamax optimizers showed that the model using the Adam optimizer achieved higher accuracy and lower loss. This suggests that the Adam optimizer performed better for this dataset, which was used for classifying machine efficiency. Additionally, adjusting hyperparameters such as the number of epochs had a significant impact on the final model outcome, as indicated by the increase in accuracy with more epochs. Therefore, selecting the right optimizer and determining the optimal number of epochs are crucial for improving model performance.

For future work, it is recommended to explore additional optimizers, such as RMSprop and Adagrad, to assess their effectiveness in improving model performance. Furthermore, investigating more complex combinations of hyperparameters, including learning rate schedules and batch sizes, could provide further insights into optimizing deep learning models for machine efficiency classification. This research serves as a guide for researchers and practitioners to make more informed decisions in selecting and fine-tuning hyperparameters when developing deep learning models in similar contexts.

REFERENCES