International Journal of Electrical and Computer Engineering (IJECE)
Vol. No. October 2025, pp. 4899-4906
ISSN: 2088-8708, DOI: 10.11591/ijece.
Journal homepage: http://ijece.

Optimizing vehicle selection in supply chain management with data-driven strategies

Imane Zeroual, Jaber El Bouhdidi
SIGL Laboratory, ENSATE, Abdelmalek Essadi University, Tetouan, Morocco

Article Info

Article history:
Received Feb 2, 2025
Revised Jun 20, 2025
Accepted Jun 30, 2025

Keywords:
Artificial intelligence
City logistics
Decision trees
Logistic regression
Neural networks
Urban logistics

ABSTRACT

Logistics has undergone significant transformation to address the complex economic, social, and environmental challenges of the modern era. To maintain competitiveness, logistics providers have been compelled to optimize operations and meet increasing customer expectations. Critical issues impacting logistics performance include traffic congestion, infrastructure limitations, rising demand, and the complexities of vehicle scheduling, coordination, and management. These challenges frequently disrupt delivery operations, undermining efficiency and overall system performance. This paper proposes the application of three machine learning models aimed at optimizing delivery processes, with a focus on improving vehicle assignment for order deliveries. By leveraging these models, logistics providers can enhance their decision-making and operations. The study defines the core problem and evaluates several machine learning approaches to bolster logistics delivery systems.

This is an open access article under the CC BY-SA license.

Corresponding Author:
Imane Zeroual
SIGL Laboratory, ENSATE, Abdelmalek Essadi University
Tetouan, Morocco
Email: imane.zeroual1@etu.

INTRODUCTION

The supply chain industry encounters numerous challenges, with logistics being a central concern, especially for distributors and transporters.
Although the logistics sector has experienced rapid growth, it has also exacerbated social and environmental issues, including severe traffic congestion and increased pollution in urban areas. Addressing these challenges necessitates collaboration among all stakeholders and the implementation of effective strategies to mitigate their impacts.

This paper proposes an innovative solution that leverages scientific research and artificial intelligence techniques to optimize urban freight transportation planning and infrastructure. Specifically, it introduces several machine learning models, including multinomial logistic regression, to predict the most suitable vehicle for order deliveries. Additionally, the study compares two other models to identify the best approach for the scenario.

The paper emphasizes the social and environmental criticism surrounding delivery transport, which often stems from inefficient routes and a lack of adaptability in planning. By focusing on selecting the appropriate vehicle for deliveries, this research aims to reduce financial burdens and environmental impacts, thereby improving the efficiency and precision of urban freight operations.

METHOD

Architectural overview of the proposed solution

The proposed architecture, illustrated in Figure 1, outlines several key stages for evaluating machine learning models in logistics delivery systems. These stages are designed to ensure effective data preparation, model training, evaluation, and deployment. The architecture represents a structured pipeline that facilitates the extraction, processing, and analysis of historical data to optimize decision-making in delivery operations. Let's break down each step in detail:

Figure 1. Machine learning-based architecture for optimizing logistics delivery systems

Data extraction from historical data: The first step involves extracting relevant data from the historical logistics records database.
This data serves as the foundation for model training and includes delivery times, vehicle performance, order characteristics, and other factors that influence logistics operations.

Dataset preprocessing: Once the data is collected, it is processed to ensure its quality and consistency. This stage involves:
- Handling missing data: Missing or incomplete entries are addressed through imputation or removal.
- Removing duplicates: Duplicate records are eliminated to prevent bias and ensure data integrity.
- Data transformation: Transformations such as normalization and encoding are applied to make the data suitable for machine learning models.

Dataset splitting: To evaluate model performance effectively, the dataset is split into two parts: a training set and a test set. The training set is used to train the machine learning models, while the test set is kept separate for evaluating the final model performance.

Feature selection: Feature selection is performed to identify the most relevant features for model training. This step is applied exclusively to the multinomial logistic regression model, using correlation-based feature selection to reduce dimensionality by selecting the features that are most strongly correlated with the target variable (e.g., delivery duration, material shipped).

Modeling: This stage involves training three different machine learning models to evaluate their performance in predicting optimal vehicle assignments for order deliveries. The models used are:
- Multinomial logistic regression: A statistical model that predicts the probability of a vehicle being suitable for an order based on multiple features.
- Decision tree: A tree-based algorithm that splits data into branches, predicting the best vehicle for delivery based on decision rules.
- Neural network perceptron: A deep learning model that uses interconnected layers of neurons to learn complex patterns in the data.

Testing and evaluating: After training the models, their predictions are tested against the actual outcomes (true labels) from the test set. The evaluation process involves:
- Comparison of predictions with true labels: Assessing how well the models' predictions align with the actual data.
- Evaluating metrics: Common performance metrics, such as accuracy, precision, recall, and F1-score, are used to assess model effectiveness.
- Cross-validation: To ensure model reliability, cross-validation is performed to detect overfitting and assess the models' performance on unseen data.

Deployment: The top-performing model is then deployed to a production environment, following these steps:
- Creating a REST API: A RESTful API is developed to enable seamless communication between external systems and the model, allowing for real-time data input and instant output of predictions.
- Production deployment: The model is integrated into the logistics system, facilitating real-time predictions and optimizing vehicle assignments for deliveries.

Data exploration and preprocessing

When an order is confirmed as feasible for delivery, it is important to define its profile. This profile serves as a detailed description of the order, helping to identify the most suitable delivery vehicle. Defining the order profile involves considering various factors, including the characteristics of the goods (such as volume, weight, and nature), the target customer, the expected delivery timeline, and the delivery typology (i.e., the type of delivery service required, such as urgent or scheduled). By understanding these factors, logistics professionals can better match the specific requirements of the order with the available vehicles during the delivery planning process.
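As an illustration only, the order profile described above could be represented as a small record type. The following is a minimal sketch: the class name `OrderProfile`, its field names, the units, and the example values are hypothetical and do not come from the study's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderProfile:
    """Hypothetical container for the order attributes named in the text."""
    material: str               # nature of the goods
    volume_m3: float            # volume of the goods (unit assumed)
    weight_kg: float            # weight of the goods (unit assumed)
    customer: str               # target customer identifier
    delivery_type: str          # delivery typology, e.g. "urgent" or "scheduled"
    distance_km: float          # delivery distance (unit assumed)
    expected_duration_h: float  # expected delivery duration (unit assumed)

    def __post_init__(self):
        # Reject physically impossible values before they reach a model.
        for name in ("volume_m3", "weight_kg", "distance_km", "expected_duration_h"):
            if getattr(self, name) < 0:
                raise ValueError(f"{name} must be non-negative")

    def to_features(self) -> dict:
        """Return the three inputs the text lists for vehicle selection."""
        return {
            "material": self.material,
            "distance_km": self.distance_km,
            "duration_h": self.expected_duration_h,
        }


# Example order with invented values.
profile = OrderProfile(
    material="TOOL KIT SET",
    volume_m3=2.5,
    weight_kg=800.0,
    customer="C-001",
    delivery_type="urgent",
    distance_km=120.0,
    expected_duration_h=6.0,
)
features = profile.to_features()
```

Keeping the profile immutable and validating it on creation is one possible design choice; it keeps invalid orders out of the later preprocessing steps.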
To identify the appropriate vehicle for the delivery, it is necessary to define both input and output variables that represent the relevant parameters and characteristics. These inputs form the basis for deciding which vehicle is most suited for transporting the goods. Some key parameters to consider include:

Nature of the goods: The type of goods to be transported significantly impacts the vehicle selection. Certain goods, such as hazardous materials or fragile products, require specific storage, handling, and transportation conditions. A vehicle that does not meet these requirements could damage the goods or even render them unfit for delivery.

Delivery distance: The distance between the origin and destination directly affects the required vehicle capacity, fuel efficiency, and delivery time. Longer distances may require vehicles with higher fuel capacity or those more suited for long-haul trips.

Delivery duration: The time required to transport the goods from the start point to the endpoint can influence the choice of vehicle. For instance, a time-sensitive delivery may require a vehicle that can travel faster or navigate through areas with less congestion.

The input variables are the parameters and characteristics of the delivery mentioned above, while the output variable is the suitable vehicle for the delivery.

The proposed model is applied to the "Delivery Truck Trip" dataset, which contains delivery trip information for each order, including the type of vehicle that is suitable for each delivery. The dataset is divided into two parts: the training set, used to train the model and minimize errors, and the test set, used to evaluate the performance of the trained model.

Before feeding the dataset into the model, we noticed that some features were not normally distributed. As a result, standardization of the dataset is necessary. We apply the StandardScaler preprocessing technique to standardize the data, using the following formula:
$$x' = \frac{x - \mu}{\sigma}$$

Where:
$x$: is the training sample
$\sigma$: is the standard deviation of the training samples
$\mu$: is the mean of the training samples

To focus on the most relevant data for our model, we selected several key columns, as shown in Table 1. These include "DeliveryID", "BookingID", "Date", "Destination location", "On-time delay", "Trip start date", "Trip end date", "Transportation distance", "Material shipped", and "Vehicle type". Additionally, we created a new column, "Transportation duration", which gives the time taken for each delivery, calculated by subtracting the "Trip start date" from the "Trip end date".

Before applying the model, we performed data cleaning by first identifying and removing any rows with missing or incomplete values. This step was crucial to ensure that the dataset used for training the model is both complete and reliable, preventing any issues that might arise from missing data. Additionally, we filtered out negative values in the "Transportation duration" column, ensuring that it only contains valid, positive values.

The "Material shipped" column consists of various categories, which we encoded into binary columns using the OneHotEncoder. Similarly, the "Vehicle type" column includes multiple categories, which we converted to numerical labels using the LabelEncoder. These numerical labels are used as our target variable.

After cleaning and preprocessing the data, we split it into two sets: (X), which contains the feature variables (input variables), and (y), which represents the target variable "Vehicle type". To avoid any overlap, we removed the "Vehicle type" column from (X), ensuring that it remains exclusively in the (y) set, which is used for predictions. Next, we used Scikit-learn's train_test_split function to divide the dataset into two parts: a training set and a test set.
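The cleaning, encoding, standardization, and splitting steps described above can be sketched with pandas and Scikit-learn as follows. This is a minimal sketch under stated assumptions, not the authors' code: the toy records, the column names, and every value except "TOOL KIT SET" are invented, and fitting the scaler and the one-hot encoder on the training rows only is a choice made here so that $\mu$ and $\sigma$ come from the training samples.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Toy records standing in for the dataset (all values invented).
raw = pd.DataFrame({
    "trip_start": pd.to_datetime([
        "2020-08-17 08:00", "2020-08-18 09:00", "2020-08-19 07:30", "2020-08-20 10:00",
        "2020-08-21 06:00", "2020-08-22 12:00", "2020-08-23 08:00", "2020-08-24 09:00",
    ]),
    "trip_end": pd.to_datetime([
        "2020-08-17 14:00", "2020-08-19 09:00", "2020-08-19 19:30", "2020-08-20 13:00",
        "2020-08-22 00:00", "2020-08-22 21:00", "2020-08-22 08:00", "2020-08-24 19:00",
    ]),
    "distance": [120.0, 610.0, 340.0, 45.0, 480.0, 230.0, 150.0, np.nan],
    "material": ["TOOL KIT SET", "AUTO PARTS", "TOOL KIT SET", "EMPTY TRAYS",
                 "AUTO PARTS", "EMPTY TRAYS", "AUTO PARTS", "TOOL KIT SET"],
    "vehicle_type": ["small truck", "trailer", "medium truck", "small truck",
                     "trailer", "medium truck", "small truck", "medium truck"],
})

# 1. Drop rows with missing values, derive the duration, drop invalid durations.
df = raw.dropna().copy()
df["duration_h"] = (df["trip_end"] - df["trip_start"]).dt.total_seconds() / 3600
df = df[df["duration_h"] > 0]

# 2. Separate features (X) from the label-encoded target (y).
y = LabelEncoder().fit_transform(df["vehicle_type"])
X = df[["distance", "duration_h", "material"]]

# 3. Split first, so the test rows stay unseen by the fitted transformers.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=2, random_state=0)

# 4. Fit the scaler and the one-hot encoder on the training rows only.
numeric_cols = ["distance", "duration_h"]
scaler = StandardScaler().fit(X_train[numeric_cols])
onehot = OneHotEncoder(handle_unknown="ignore").fit(X_train[["material"]])


def prepare(frame: pd.DataFrame) -> np.ndarray:
    """Standardize numeric columns and one-hot encode the material column."""
    numeric = scaler.transform(frame[numeric_cols])
    categorical = onehot.transform(frame[["material"]]).toarray()
    return np.hstack([numeric, categorical])


X_train_prepared = prepare(X_train)
X_test_prepared = prepare(X_test)
```

With `handle_unknown="ignore"`, a material category that appears only in the test rows is encoded as all zeros instead of raising an error, which is one reasonable way to handle rare categories in a small dataset.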
The training set is used to fit the model, enabling it to learn the relationships between the features and the target variable. The test set is then used to evaluate the model's performance by comparing its predictions to the actual values, providing a measure of how well the model generalizes to new, unseen data.

Table 1. Column value examples from the logistics dataset

| Column name | Type | Values | Description |
|---|---|---|---|
| Transportation distance | Numerical | | The distance traveled to move goods from the origin to the destination |
| Transportation duration | Numerical | | The time taken to move goods from the origin to the destination |
| Material shipped | Categorical | TOOL KIT SET | The type of material that has been delivered |
| Vehicle type | Categorical | 21MT, 40 FT 3XL Trailer 35MT | The type of vehicle used for transporting goods |

RESULTS AND DISCUSSION

Multinomial logistic regression for delivery vehicle prediction

To make predictions, we decided to use multinomial logistic regression, a supervised machine learning model. This model is a classification technique that extends logistic regression to handle categorical dependent variables with more than two possible outcomes. We selected this algorithm because our target variable, "Vehicle type", consists of multiple classes.

After training the model, we assessed its performance by calculating the accuracy, which represents the proportion of correct predictions relative to the total number of predictions. In our case, the model achieved 76% accuracy on the training set and 60% accuracy on the test set. The 16-point gap (over 10%) between the training and test accuracies suggests potential overfitting. Overfitting occurs when the model becomes too specialized in memorizing the training data, leading to high performance on the training set but poor generalization to new, unseen data (i.e., the test set). As a result, the model performs worse on the test set, which lowers the test accuracy.
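The train/test comparison described above can be sketched as follows. This is an illustrative example on synthetic three-class data generated with `make_classification`, not the study's dataset, so the accuracies it prints will differ from the 76% and 60% reported here. For multi-class targets, Scikit-learn's `LogisticRegression` with its default solver fits a multinomial model; the 5-fold `cross_val_score` call is added as a complementary generalization check.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic three-class problem standing in for the vehicle-type target.
X, y = make_classification(
    n_samples=600, n_features=6, n_informative=4, n_redundant=0,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Standardization and the classifier are chained so that the scaler is
# fitted only on the data passed to fit().
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

train_acc = accuracy_score(y_train, model.predict(X_train))
test_acc = accuracy_score(y_test, model.predict(X_test))
gap = train_acc - test_acc  # a large positive gap hints at overfitting

# 5-fold cross-validation on the training set: each fold is held out once.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)

print(f"train accuracy: {train_acc:.2f}")
print(f"test accuracy:  {test_acc:.2f}")
print(f"gap:            {gap:.2f}")
print(f"cross-validation accuracy: {cv_scores.mean():.2f} +/- {cv_scores.std():.2f}")
```

A cross-validation mean close to the test accuracy, together with a small spread across folds, suggests that the held-out estimate is stable rather than an artifact of one particular split.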
Multi-layer perceptron for delivery vehicle prediction

Another machine learning algorithm we used to predict the suitable vehicle for delivery is the multilayer perceptron (MLP), a type of neural network model that is particularly well-suited for classification problems. The preprocessing steps for our dataset, as described earlier, are the same as those used for the multinomial logistic regression model.

For training, the MLP model utilizes two hidden layers, as illustrated in Figure 2. The first hidden layer contains three neurons, while the second hidden layer has two neurons. The algorithm processes the input data, applies transformations, and then passes the results to the output layer, which generates the final prediction.

Figure 2. MLP classifier model for vehicle type prediction using transportation data

To assess the performance of the multilayer perceptron model, we evaluated it using several classification metrics, as illustrated in Figure 3, for each of the three classes (30, 34, and 36). These metrics include precision, recall, F1-score, and support (the number of instances of each class in the dataset). These metrics provide a comprehensive understanding of how well the model predicts the correct vehicle type.

Figure 3. Classification report for multi-layer perceptron

Precision is the ratio of correctly predicted positive instances to the total number of predicted positive instances:

$$\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}$$

Recall is the ratio of correctly predicted positive instances to the total number of actual positive instances:
$$\text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}}$$

F1-score is the harmonic mean of precision and recall, and it provides a balance between the two:

$$\text{F1-score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

The micro average of precision, recall, and F1-score aggregates the contributions of all classes, so it accounts for class imbalance by considering the total number of instances across all classes. In contrast, the macro average is an unweighted average of precision, recall, and F1-score, treating each class equally regardless of its size.

In our case, the model's performance is as follows: weighted average precision: 0.58, recall: 0.80, and F1-score: 0. These values indicate moderate performance overall. The relatively high recall of 80% suggests that the model is effective at correctly identifying positive instances. However, the lower precision of 58% indicates that only about 58% of the instances predicted as positive are correct, meaning the model produces a relatively large number of false positives. For individual classes:
- Class 34 shows the highest precision and recall, indicating the model performs best for this class.
- Class 30 has the highest recall but the lowest precision, suggesting the model is more prone to false positives for this class.
- Class 36 exhibits both the lowest precision and recall, signaling poor performance for this class.

Decision tree algorithm for delivery vehicle prediction

The final model explored in this section is the decision tree, a supervised machine learning algorithm commonly used for classification tasks. This model is particularly useful for problems where the goal is to assign inputs to one of several predefined classes based on certain features.
We trained the decision tree on the same dataset as the previous models and generated the classification report, as shown in Figure 4. The results from the classification report, based on the test data, reveal the following performance metrics for the decision tree classifier: precision: 66%, recall: 82%, and F1-score: 68%. Precision measures the accuracy of positive predictions: 66% of the model's positive predictions are correct. Recall reflects the model's ability to identify actual positives: the model correctly identifies 82% of them. The F1-score, the harmonic mean of precision and recall, is 68%, offering a balanced view of both. While the model has a high recall (82%), indicating it detects most positive instances, its lower precision (66%) suggests it misclassifies some negatives as positives. In summary, the classifier has a weighted average precision of 66%, recall of 82%, and F1-score of 68%, indicating good detection of positives but with some false positives.

The three machine learning models tested in this study (multinomial logistic regression, decision tree, and multilayer perceptron) provide valuable insights into predicting the most suitable vehicle for delivery assignments. However, all models encountered challenges related to both underfitting and overfitting, largely due to the limited size and complexity of the dataset. Underfitting was observed when the models failed to capture the underlying patterns in the data, while overfitting occurred when the models learned the training data too well, including noise and irrelevant details. This led to performance inconsistency, with the models performing significantly better on the training data than on the test data, which is a common symptom of overfitting.

To gain a more comprehensive understanding of model performance, we used precision, recall, and F1-score as additional evaluation metrics, alongside accuracy.
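To make the per-class metrics and their averages concrete, the following pure-Python sketch computes precision, recall, and F1-score for each class, together with the macro and support-weighted averages. The class labels 30, 34, and 36 are borrowed from the text, but the true and predicted labels are invented for illustration and are not the study's results; the function names are likewise hypothetical.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return precision, recall, F1 and support for every class label."""
    support = Counter(y_true)
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[label] = {"precision": precision, "recall": recall,
                         "f1": f1, "support": support[label]}
    return report


def macro_average(report, metric):
    """Unweighted mean: every class counts equally."""
    return sum(row[metric] for row in report.values()) / len(report)


def weighted_average(report, metric):
    """Mean weighted by support: frequent classes count more."""
    total = sum(row["support"] for row in report.values())
    return sum(row[metric] * row["support"] for row in report.values()) / total


# Invented labels for eight deliveries.
y_true = [30, 30, 30, 30, 34, 34, 36, 36]
y_pred = [30, 30, 30, 34, 34, 34, 30, 36]

report = per_class_metrics(y_true, y_pred)
for label, row in report.items():
    print(label, {k: round(v, 3) for k, v in row.items()})
print("macro precision:   ", round(macro_average(report, "precision"), 3))
print("weighted precision:", round(weighted_average(report, "precision"), 3))
print("weighted recall:   ", round(weighted_average(report, "recall"), 3))
```

In this toy example the support-weighted recall equals the overall accuracy (6 of 8 predictions are correct), which is a general property of that average and explains why weighted recall and accuracy coincide in a classification report.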
Accuracy alone can sometimes be misleading, especially in multi-class classification tasks, where class imbalance may skew the model's true performance.

To reduce overfitting, we adjusted the multinomial logistic regression model by applying cross-validation, which helps better assess its generalization to unseen data, as shown in Figure 5. Despite these adjustments, the performance of the models could be further improved by applying techniques such as hyperparameter tuning, ensemble learning, and training on larger, more diverse datasets. Future work could also explore methods like early stopping and dropout during training to enhance generalization and reduce overfitting.

Figure 4. Classification report for decision tree model

Figure 5. Cross-validation accuracy for multinomial logistic regression

CONCLUSION

The integration of artificial intelligence in logistics has the potential to significantly optimize processes by analyzing data and making real-time predictions and decisions. This capability can help carriers enhance the profitability of their operations in urban areas, improve overall performance, and ultimately increase customer satisfaction and loyalty. In this paper, we experimented with three machine learning models to tackle a classification problem, aiming to identify the most suitable model based on various evaluation metrics. In addition to model selection, the quality of the data plays a crucial role. Effective data preprocessing, such as cleaning, transformation, and preparation, ensures that the data is properly structured for the algorithm, which helps improve the learning process. As a final note, future research should focus on incorporating environmental factors into logistics optimization models. This could include enriching the dataset with criteria like CO2 emissions and fuel consumption, which are essential for promoting sustainable logistics practices.
FUNDING INFORMATION

Authors state no funding involved.

AUTHOR CONTRIBUTIONS STATEMENT

This journal uses the Contributor Roles Taxonomy (CRediT) to recognize individual author contributions, reduce authorship disputes, and facilitate collaboration.

Name of Author: Imane Zeroual, Jaber El Bouhdidi

C: Conceptualization
M: Methodology
So: Software
Va: Validation
Fo: Formal analysis
I: Investigation
R: Resources
D: Data Curation
O: Writing - Original Draft
E: Writing - Review & Editing
Vi: Visualization
Su: Supervision
P: Project administration
Fu: Funding acquisition

CONFLICT OF INTEREST STATEMENT

Authors state no conflict of interest.

DATA AVAILABILITY

Derived data supporting the findings of this study are available from the corresponding author, IZ, on request.

REFERENCES