International Journal of Electrical and Computer Engineering (IJECE)
August 2025, pp. 3759-3768
ISSN: 2088-8708, DOI: 10.11591/ijece

Deep feature representation for automated plant species classification from leaf images

Nikhil Inamdar1, Manjunath Managuli1, Uttam Patil2
1Department of Electronics and Communication Engineering, KLS Gogte Institute of Technology, Belagavi, affiliated to Visvesvaraya Technological University, Belagavi, India
2Department of Computer Science and Engineering, Jain College of Engineering, Belagavi, affiliated to Visvesvaraya Technological University, Belagavi, India

Article Info

Article history:

ABSTRACT

Automated plant species classification using leaf images holds immense potential for advancing agricultural research, biodiversity conservation, and ecological monitoring. This study introduces a novel approach leveraging deep feature representation to achieve accurate and efficient classification based on leaf morphology. Convolutional neural networks (CNNs), including VGG16, ResNet50, DenseNet121, Inception, and Xception, are employed to extract high-level features from leaf images, capturing intricate patterns essential for species differentiation. To manage the extensive feature set extracted by these models, optimization techniques such as principal component analysis (PCA), variance thresholding, and recursive feature elimination (RFE) are applied. These methods streamline the feature set, making the classification process more efficient. The optimized features are then used to train classifiers such as support vector machine (SVM), k-nearest neighbors (K-NN), decision tree (DT), and naive Bayes (NB), achieving average accuracies of 98.6%, 96.6%, 99.6%, and 99.7%, respectively, across various cross-validation methods. Experimental results on benchmark datasets demonstrate the effectiveness of this approach, achieving state-of-the-art performance in plant species classification.
This work underscores the potential of deep feature representation in automated plant species classification, offering valuable insights for applications in agriculture, ecology, and environmental science.

Received Aug 23, 2024; Revised Apr 15, 2025; Accepted May 24, 2025

Keywords: Convolutional neural networks; Plant species classification; Deep learning; Feature representation; Leaf images; Transfer learning

This is an open access article under the CC BY-SA license.

Corresponding Author:
Nikhil Inamdar
Department of Electronics and Communication Engineering, KLS Gogte Institute of Technology, Belagavi, affiliated to Visvesvaraya Technological University, Jnashodha Campus, Udyambhag, Belagavi, India
Email: nikhil0870@gmail.

INTRODUCTION

Detecting plant diseases through image analysis is essential in precision agriculture. Traditionally, disease severity has been assessed visually by experts. Although the use of digital cameras and advancements in information technology have increased the adoption of expert systems in agriculture, enhancing plant production, these systems still face significant challenges. Various artificial intelligence (AI) techniques, including k-nearest neighbors (K-NN), logistic regression, decision trees, support vector machines (SVM), and convolutional neural networks (CNNs), are extensively used for plant disease detection and classification. These methods often involve image preprocessing to enhance feature extraction. K-NN, a supervised algorithm, classifies by comparing similarities between unlabeled and labeled data. Decision trees, structured like a flowchart, make decisions based on attributes at each node but can face challenges such as overlapping nodes and overfitting. Sharif et al. introduced a technique for detecting diseases in citrus plants using texture features, combining principal component analysis with feature statistics for hybrid feature selection. Patil and Kumar
developed a system to detect diseases in soybean leaves by analyzing color, shape, and texture features using a content-based image retrieval approach. Sandika et al. proposed a method for identifying grape leaf diseases in complex backgrounds, utilizing features like local binary patterns (LBP) and RGB color statistics, and applying machine learning algorithms such as support vector machines (SVM) and random forest.

The primary research problem addressed in this study is the accurate and efficient classification of plant species using leaf images, overcoming the limitations of traditional methods that rely on manual feature extraction and conventional machine learning techniques. Our approach leverages deep feature representation by integrating multiple advanced CNN architectures (VGG16, ResNet50, DenseNet121, Inception, and Xception) to capture intricate patterns essential for species differentiation. Additionally, we employ feature optimization techniques such as principal component analysis (PCA), variance thresholding, and recursive feature elimination (RFE) to streamline the feature set, reducing computational complexity and improving efficiency. This combination of deep feature extraction and optimization, along with the use of diverse classifiers (SVM, K-NN, decision trees, and naive Bayes), results in state-of-the-art performance with accuracies up to 99.7%.

RELATED STUDY

Deep learning architectures have recently shown great effectiveness in tasks like object identification, classification, and segmentation, with CNNs being particularly prominent. For example, Selvaraj et al. developed a dataset of banana plant samples from Africa and Southern India, covering 17 diseased classes and one healthy class. Using CNN architectures like ResNet50, InceptionV2, and MobileNetV1, they achieved a 90% accuracy rate. Lu et al. explore various classifiers, including SVM, K-NN, and random forest, for classification. The study
underscores the adaptability of CNN-based feature extraction and the potential of different classifiers in accurately classifying agricultural data. Barbedo explores the detection of 79 diseases across 14 different plant species. Another study introduces an optimal feature set for achieving higher classification accuracy in agricultural crop species classification; by combining various features and datasets, it aims to further optimize classification accuracy and explore different feature combinations to enhance performance. A further work proposes an intelligent system for real-time identification of Indian medicinal herbs based on leaf images, utilizing a Raspberry Pi; the developed machine learning models achieve impressive accuracy rates, demonstrating the feasibility of using low-cost hardware for real-world applications in plant identification. Introducing a CNN-based method called D-Leaf for leaf classification, another study compares different CNN models based on their feature extraction capabilities; the D-Leaf model achieves competitive testing accuracy compared to pre-trained CNN models. Chuanlei et al. achieved 97% accuracy in their experiments on wheat plants, using a dataset with six diseased classes and one healthy class. Similarly, Singh et al. developed a system for detecting tea leaf diseases with a modified deep convolutional neural network, achieving an average accuracy of 92%. Chakraborty et al. conducted experiments on 79 different diseases across 14 plant types using the GoogLeNet architecture, with accuracy scores consistently above 75%. Krizhevsky et al. explored various CNN architectures, achieving up to 99% accuracy, with VGG reaching 81% across multiple plant types. Geetharamani and Pandian worked with a dataset containing 38 classes from 14 plant types, attaining a 96% accuracy rate. Traditional machine learning approaches, though effective in plant disease identification, are limited by the sequential nature of image segmentation, feature extraction,
and pattern recognition. While basic CNN models like AlexNet, VGGNet, GoogLeNet, DenseNet, and ResNet have been extensively utilized for plant disease classification, they come with drawbacks such as high parameter demands and slow computation speeds. Despite their strength in capturing both high- and low-level features, these models often struggle to consistently describe local spatial characteristics. He et al. implemented a residual learning framework to ease the training of deep networks, reporting a 28% relative improvement on the COCO object detection dataset.

METHODOLOGY

A block schematic of the proposed method is shown in Figure 1. It consists of four stages, namely pre-processing, deep feature extraction, feature optimization, and classification. This research focuses on the classification of plant leaf images; many researchers have worked on the identification of leaf diseases rather than on the identification of leaf types specifically growing in this part of the country, which have also proved to have rich business value. There exist many varieties of plants; this article proposes a 14-class plant leaf classification methodology using deep features. Datasets of 14 plants are collected from the different research works cited in the literature survey section. Commonly and widely used CNNs are employed for deep feature extraction, namely VGG16, ResNet50, InceptionV3, Xception, and DenseNet121. Models such as VGG16, pre-trained on large datasets like ImageNet, offer a transferable set of features that capture rich visual information from images. ResNet enables effective gradient flow and facilitates the learning of highly abstract features throughout the network layers. InceptionV3 and its variants leverage inception modules, which use multiple convolutions of different kernel sizes within the same layer. Xception, inspired by the Inception architecture, replaces traditional convolutions with depthwise separable convolutions.
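The parameter savings of the depthwise separable convolutions just introduced can be checked with a quick back-of-the-envelope calculation (a sketch with illustrative channel counts, not values taken from this paper):

```python
# Parameter count of a standard 3x3 convolution vs. a depthwise separable one.
# Kernel size and channel counts below are illustrative, not the paper's.
k, c_in, c_out = 3, 256, 256

standard = k * k * c_in * c_out   # one k x k kernel per (input, output) channel pair
depthwise = k * k * c_in          # one k x k kernel per input channel
pointwise = c_in * c_out          # 1x1 convolution mixing channels
separable = depthwise + pointwise

print(standard)   # 589824
print(separable)  # 67840
```

For these sizes the separable form uses roughly 9x fewer parameters, which is the source of Xception's efficiency.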
This modification decouples spatial and channel-wise correlations in feature maps, leading to improved feature representation with fewer parameters. Figure 2 shows the use of the VGG16 model for deep feature extraction, with an approximate breakdown of the number of features extracted at each stage:
- Input layer: this layer does not produce features directly, but it accepts input images of size (224, 224, 3).
- Convolutional layers: VGG16 has a total of 13 convolutional layers (excluding pooling layers). Each convolutional layer typically outputs feature maps of varying sizes, gradually reducing spatial dimensions while increasing depth (number of filters).
- Fully connected layers: after flattening, VGG16 includes three fully connected layers with decreasing numbers of neurons: 4096, 4096, and 1000 (for ImageNet's 1000 classes). The 25,088 values represent the feature vector extracted from the last convolutional layer before it is fed into the fully connected layers.

With various CNN models available in the literature, an experiment was conducted to analyze the behavior of different CNN models and the number of features they extract. Table 1 provides detailed performance metrics for these models both with and without feature optimization.

Figure 1. Block schematic of the proposed study

Principal component analysis (PCA)
PCA is a method used for dimensionality reduction, widely applied in fields like image processing, data visualization, and machine learning. Mathematically, PCA begins by standardizing the data to ensure uniformity across variables. Then, it computes the covariance matrix of the standardized data:

C = (1 / (n - 1)) X^T X

Here, X is the n x p matrix of the standardized data. PCA then derives the eigenvectors v_1, v_2, ..., v_p and their corresponding eigenvalues λ_1 ≥ λ_2 ≥ ... ≥ λ_p, arranged in descending order.
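The covariance and eigendecomposition steps above can be sketched in a few lines of NumPy (the random matrix is a stand-in for a real standardized feature matrix):

```python
import numpy as np

# Sketch of the PCA derivation: covariance of standardized data, then
# eigenvectors/eigenvalues sorted in descending order of explained variance.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))               # stand-in for an n x p data matrix
X = (X - X.mean(axis=0)) / X.std(axis=0)    # standardize each variable

C = X.T @ X / (X.shape[0] - 1)              # covariance matrix, C = X^T X / (n - 1)
eigvals, eigvecs = np.linalg.eigh(C)        # eigh: C is symmetric
order = np.argsort(eigvals)[::-1]           # descending eigenvalues
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
print(eigvals)
```

The columns of `eigvecs` are the principal directions v_1, ..., v_p; projecting X onto the first few of them gives the reduced representation.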
These represent the directions of maximum variance and their magnitudes.

Figure 2. Deep feature extraction methodology

Table 1. Number of features extracted from the different CNN models considered (VGG16, ResNet50, InceptionV3, DenseNet121, and Xception), reporting for each model: the original number of features; the reduced number of features and accuracy after PCA; the reduced number of features and accuracy after variance thresholding; the number of features after preliminary PCA; the reduced number of features and accuracy after RFE; and the accuracy without optimization.

Variance thresholding
Variance thresholding is employed as a feature selection technique to enhance the performance of the random forest classifier by eliminating features with low variance that are deemed less informative for classification tasks. The method hinges on the premise that features exhibiting minimal variance across samples are less likely to contribute meaningfully to distinguishing between different classes. For each feature f in the dataset, the variance σ_f^2 is computed as the average squared deviation from the mean:

σ_f^2 = (1 / n) Σ_{i=1}^{n} (x_{i,f} - x̄_f)^2

Here, x_{i,f} represents the value of feature f in sample i, x̄_f denotes the mean of feature f across all samples, and n signifies the total number of samples.

Recursive feature elimination (RFE)
RFE is a feature selection technique that systematically removes attributes to improve model performance and interpretability. In RFE, a machine learning model is trained on the dataset, and features are recursively pruned based on their importance until the optimal subset is determined. The process begins by fitting a machine learning model (in our case, a random forest classifier) to the dataset with an initial set of features, which may be reduced using principal component analysis (PCA) to manage high-dimensional data:

X_pca = PCA(X_train, n_components = k)
Here, X_train represents the training data, and X_pca denotes the PCA-transformed feature set with reduced dimensionality.

Optimized feature selection
Building upon the mathematical foundation of the optimization techniques discussed earlier, this section outlines the specific approach used for feature optimization. The chosen methodology is tailored to enhance the selection of relevant features; each step is designed to ensure improved model performance and computational efficiency.

Principal component analysis (PCA)
PCA is employed to reduce the dimensionality of the extracted deep features while preserving 95% of their original variance by setting n_components to 0.95. This dimensionality reduction plays a key role in addressing the curse of dimensionality, which can negatively impact model accuracy and efficiency. Additionally, it significantly decreases the computational load, enabling faster processing without a notable loss in model performance. After the PCA transformation (X_train_pca and X_test_pca), a random forest classifier (classifier_pca) is trained and evaluated on the reduced feature set to assess its classification accuracy (accuracy_pca). Setting n_components to retain 55% of the variance is a strategic choice that balances dimensionality reduction with the preservation of essential information. By retaining 55% of the variance, the number of features is significantly reduced, which helps alleviate the curse of dimensionality and mitigate overfitting.
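The variance-retaining PCA step can be sketched with scikit-learn; random features stand in for the extracted deep features, and the dimensions are illustrative:

```python
import numpy as np
from sklearn.decomposition import PCA

# Stand-in for a deep-feature matrix (n_samples x n_features); the real
# pipeline would use e.g. 25,088-dimensional VGG16 features.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 500))

# A fractional n_components keeps the smallest number of components whose
# cumulative explained variance reaches that fraction (here 95%).
pca = PCA(n_components=0.95)
X_pca = pca.fit_transform(X)

print(X_pca.shape[1], round(pca.explained_variance_ratio_.sum(), 3))
```

Changing `n_components` to 0.55 reproduces the more aggressive 55%-variance setting discussed above.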
This reduction simplifies the model and decreases computational complexity, while still maintaining enough variance to ensure that crucial information is preserved for accurate predictions. This approach effectively addresses the trade-off between reducing the feature space and retaining significant data characteristics, thus optimizing the performance of the model. Table 2 gives the performance of the model for different percentages of retained variance.

Variance thresholding
Variance thresholding (VarianceThreshold) is used to remove features with low variance. The threshold is set based on the variance of each feature (threshold = 0.8 x (1 - 0.8) = 0.16). This technique is beneficial for eliminating features that do not vary much within the dataset, potentially reducing noise and improving model robustness. The random forest classifier (classifier_var) is trained and evaluated on the selected features (X_train_var and X_test_var), and its accuracy (accuracy_var) is computed for comparison with PCA. The use of VarianceThreshold with threshold = 0.8 x (1 - 0.8) = 0.16 is designed to filter out features with low variance, which are less likely to contribute meaningful information to the model. This threshold is chosen to remove features with very low variance, specifically those with variance less than 20% of the overall variance. This approach enhances model efficiency by focusing on more informative features while discarding those that contribute little to predictive performance. Table 3 provides the variation of the model performance for different variance threshold values.

Table 2. Performance of the model for different variance settings during optimization
- Original number of features: 25088
- Reduced number of features after PCA: 90
- Accuracy after PCA: 0.
- Reduced number of features after Variance Thresholding: 21016
- Accuracy after Variance Thresholding:
- Number of features after preliminary PCA: 100
- Reduced number of features after RFE: 50
- Accuracy after RFE: 0.

Table 3. Performance of the model for different threshold values during optimization, reporting for each threshold: the number of images; the original number of features; the reduced number of features and accuracy after PCA; the reduced number of features and accuracy after variance thresholding; the number of features after preliminary PCA; the reduced number of features and accuracy after RFE; and the accuracy without optimization.

Recursive feature elimination (RFE)
RFE is applied after an initial PCA reduction to further select the most informative features (n_features_to_select = 100). RFE iteratively removes less important features based on their contribution to model accuracy, using a random forest estimator. A random forest classifier is employed to rank feature importance due to its capacity to manage large feature sets and generate reliable importance scores. The RFE process is configured to select the top 100 features, systematically removing 5 features at each iteration. This approach involves training the model on the full feature set, ranking features based on their importance, and then eliminating the least significant ones.
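The RFE configuration described above (random forest estimator, 100 features selected, 5 removed per iteration) can be sketched with scikit-learn on synthetic data; the dataset sizes here are illustrative, not the paper's:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE

# Synthetic stand-in for the post-PCA feature matrix (150 features here).
X, y = make_classification(n_samples=300, n_features=150,
                           n_informative=20, random_state=0)

rfe = RFE(estimator=RandomForestClassifier(n_estimators=50, random_state=0),
          n_features_to_select=100,  # keep the top 100 features
          step=5)                    # drop the 5 least important per iteration
X_sel = rfe.fit_transform(X, y)

print(X_sel.shape)  # (300, 100)
```

Each iteration refits the forest, ranks features by importance, and discards the lowest-ranked ones until exactly 100 remain.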
This process is repeated until the desired number of features is reached, ensuring that only the most impactful features are retained for enhanced model efficiency and accuracy. Table 4 provides the variation of the model performance for different ratios of features to iteration.

Table 4. Performance of the model for different ratios of features to iteration (reduced features: 150/100, 200/100, 100/100)
- Original number of features: 25088
- Reduced number of features after PCA: 43
- Accuracy after PCA: 0.
- Reduced number of features after Variance Thresholding: 21016
- Accuracy after Variance Thresholding: 0.
- Number of features after preliminary PCA: 150
- Reduced number of features after RFE: 100
- Accuracy after RFE: 0.

Classification
This study evaluates four classifiers: random forest, K-NN, SVM, and naive Bayes. Random forest, an ensemble method, constructs multiple decision trees, offering robustness and high accuracy with complex data. K-NN classifies data based on the nearest neighbors; it is simple but can be computationally intensive with large datasets. SVM is a powerful supervised algorithm that finds the optimal hyperplane for class separation in high-dimensional spaces but requires careful tuning. Naive Bayes, a probabilistic classifier based on Bayes' theorem, assumes feature independence and is effective for specific tasks like text classification. These classifiers were selected for their unique approaches and strengths, offering a comprehensive comparison of their performance. The performance of these models is depicted in Figure 3, with corresponding classification metrics for different CNN models provided in Table 5.

Figure 3. Performance of classifiers for different CNN models

Table 5.
Classification metrics for different CNN models (accuracy, precision, recall, and F1-score for the SVM and K-NN classifiers on features from VGG16, ResNet50, InceptionV3, Xception, and DenseNet121)

RESULTS AND DISCUSSION

The performance evaluation of various CNN models, including VGG16, ResNet50, InceptionV3, DenseNet121, and Xception, before and after applying feature optimization techniques, provides valuable insights into the effectiveness of these models and the impact of optimization on classification accuracy. Initially, models like ResNet50 and Xception performed well even without optimization, indicating their inherent capability to capture and represent intricate patterns in leaf images. However, feature optimization techniques such as PCA, variance thresholding, and RFE significantly enhanced the models' performance by reducing dimensionality, filtering out less informative features, and systematically removing less important ones. This led to improved computational efficiency and accuracy, with optimized feature sets providing a more efficient foundation for future model development. Table 6 provides the details of the hyperparameters used to configure the classifiers. The comparative performance of the models highlights the importance of feature optimization in achieving state-of-the-art results. Although some models achieved high accuracy without optimization, the application of PCA, variance thresholding, and RFE provided additional benefits in terms of efficiency and robustness. Figure 4 shows the confusion matrix and ROC curve for various CNN models with selected classifiers, demonstrating strong classification accuracy as detailed in Table 5. The confusion matrix and ROC curve analyses further illustrate the models' effectiveness in distinguishing between different plant species, with higher AUC values reflecting better class discrimination.
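A minimal sketch of the classifier comparison, using scikit-learn defaults on the iris dataset as a stand-in (the paper's leaf-image features and tuned hyperparameters are not reproduced here):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)

# The classifier families discussed in the study, scored by
# 5-fold cross-validation accuracy.
classifiers = {
    "SVM": SVC(),
    "K-NN": KNeighborsClassifier(),
    "DT": DecisionTreeClassifier(random_state=0),
    "NB": GaussianNB(),
}
for name, clf in classifiers.items():
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: {acc:.3f}")
```

Swapping `load_iris` for the optimized deep-feature matrix reproduces the comparison structure used for Table 5.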
This in-depth analysis underscores the significance of integrating advanced CNN architectures with feature optimization techniques, making the proposed model a valuable tool for precision agriculture, biodiversity conservation, and ecological monitoring.

Comparison with existing methods
The dataset under consideration consists of 20,357 images representing fourteen classes of plant species. A comparative analysis highlights the performance differences between handcrafted and deep features. While deep features exhibit superior performance, they also present challenges, such as the need for substantial datasets and significant computational resources. Furthermore, Table 6 illustrates that this work surpasses conventional research methodologies, particularly in handling a larger number of classes.

Table 6. Comparison with related work (reference, method, number of classes, and accuracy), with compared methods including VGG16 with M-SVM, DWT with color histogram, ShuffleNetV1, deep features with LBP, and the proposed deep-feature method.

Figure 4. Confusion matrix and ROC curve of CNN models (panels: VGG16 model with RF classifier; DenseNet model with NB classifier; InceptionV3 model with NB classifier)

CONCLUSION

The study demonstrates the efficacy of deep feature representation in the automated classification of plant species using leaf images. By employing CNN models such as VGG16, ResNet50, DenseNet121, Inception, and Xception, we successfully captured high-level, discriminative features essential for accurate species differentiation. The application of optimization techniques like PCA, variance thresholding, and RFE further enhanced the efficiency of the feature set, leading to high classification accuracies when combined with classifiers such as SVM, K-NN, DT, and NB. The achieved results, with accuracies reaching up to 99.
7%, underscore the potential of this approach in advancing agricultural research, biodiversity conservation, and ecological monitoring. Future work will focus on expanding the dataset and exploring additional optimization strategies to further refine the classification process.

FUNDING INFORMATION

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

AUTHOR CONTRIBUTIONS STATEMENT

Nikhil Inamdar conceived the study, designed the methodology, and performed the experiments. Manjunath Managuli contributed to data analysis, interpretation, and technical validation. Uttam Patil assisted with manuscript drafting, critical revisions, and final approval of the version to be published. All authors have read and agreed to the published version of the manuscript.

Name of Author: Nikhil Inamdar; Manjunath Managuli; Uttam Patil
Contribution codes: C: Conceptualization; M: Methodology; So: Software; Va: Validation; Fo: Formal analysis; I: Investigation; R: Resources; D: Data Curation; O: Writing - Original Draft; E: Writing - Review & Editing; Vi: Visualization; Su: Supervision; P: Project administration; Fu: Funding acquisition

CONFLICT OF INTEREST STATEMENT

The authors declare that there is no conflict of interest.

DATA AVAILABILITY

The data used in this study are publicly available and can be accessed from the open sources cited within the manuscript or obtained from the corresponding author.

REFERENCES