Indonesian Journal of Electrical Engineering and Informatics (IJEEI)
Vol. No. September 2025, pp.
ISSN: 2089-3272. DOI: 10. 52549/ijeei.

Efficient Corrosion Detection on Metal Surface Using Deep Learning Technique

Phan Nguyen Ky Phuc1, Tran Lam Trung Tin2, Trong Hieu Luu3*
1,2School of Industrial Engineering and Management, International University, Vietnam National University - Ho Chi Minh City, Vietnam
3College of Engineering, Can Tho University, Can Tho City, Viet Nam

Article Info
Article history: Received Mar 3, 2025. Revised Sep 5, 2025. Accepted Sep 28, 2025.
Keywords: Metal surface, Corrosion detection, YOLOv7, YOLOv8

ABSTRACT
This study examines how deep learning models can improve corrosion detection, comparing YOLOv7 with its more advanced successor, YOLOv8. Both models were trained on a diverse set of images showing different types and levels of corrosion on metal surfaces. Their performance was assessed using standard metrics, including accuracy, F1-score, recall, and mean average precision (mAP). The results show that YOLOv8 outperforms YOLOv7 on most metrics. It achieves higher recall, precision, and F1-score, demonstrating its improved ability to detect and classify corroded areas. Notably, YOLOv8 is better at identifying small or early-stage corrosion, which is crucial for timely maintenance. Additionally, it processes images faster than YOLOv7, making it more suitable for real-time applications. This study also suggests integrating YOLOv8 with robotic arms equipped for laser cleaning, allowing for automated and precise corrosion removal. Such a system could improve maintenance efficiency, reduce costs, and enhance the safety and reliability of infrastructure.

Copyright © 2025 Institute of Advanced Engineering and Science. All rights reserved.

Corresponding Author: Trong Hieu Luu, College of Engineering, Can Tho University, 3/2 Street, Xuan Khanh Ward, Ninh Kieu District, Can Tho City, 910900, Viet Nam. Email: luutronghieu@ctu.
Journal homepage: http://section. com/index. php/IJEEI/index

INTRODUCTION

Corrosion, a widespread problem in various sectors, presents a substantial risk to the structural soundness and durability of metal assets. Chemical reactions between metals and their environment degrade the metal, causing significant economic losses and safety hazards. Conventional inspection techniques, which frequently depend on human effort, are inefficient, expensive, and susceptible to human error. The devastating failure of the Genoa Bridge in Italy in 2018, which was partly caused by undetected corrosion, highlights the urgent need for sophisticated, automated, and dependable corrosion detection systems.

Recent progress in computer vision and deep learning now makes it possible to tackle this difficulty. Applying these technologies to automate corrosion detection has the potential to transform inspection operations, allowing for early detection, precise evaluation, and proactive maintenance of vital infrastructure. By harnessing the capabilities of artificial intelligence to interpret visual data, we can improve the safety, effectiveness, and cost-efficiency of corrosion management systems.

The evolution of deep learning has led to significant advancements in methods and architectures for practical training and abstraction of hierarchical representations from multi-dimensional data, overcoming the limitations of traditional shallow networks. Deep learning's remarkable efficacy in addressing diverse autonomous tasks such as perception, planning, localization, and control makes it a promising approach for various autonomous robotic applications.

Several research studies have explored the application of deep learning for corrosion detection in multiple domains. For instance, Shirsath et al. presented a deep learning-based approach for automated corrosion detection in industrial environments.
The study utilized a convolutional neural network (CNN) to analyze images of metal surfaces and identify regions affected by different types of corrosion, including pitting, scaling, and uniform corrosion. The results demonstrated the CNN's ability to detect corrosion with high precision and recall. Malashin et al. proposed a deep learning-based system for real-time corrosion detection in the context of pipeline inspection. This system integrated image processing and deep learning techniques to analyze video footage of pipeline interiors. A CNN was trained to identify corrosion patterns in the video frames, enabling real-time corrosion detection and proactive maintenance.

Beyond metals, deep learning has also shown promise in detecting corrosion in other materials. Munawar et al. investigated deep learning for detecting corrosion in concrete structures. The study proposed a novel deep learning architecture that effectively segmented and classified different types of corrosion in concrete images, outperforming traditional image processing techniques.

One of the challenges in training deep learning models for corrosion detection is the limited availability of large, high-quality datasets. Canonaco et al. explored transfer learning to address this issue. By fine-tuning pre-trained CNNs on smaller datasets of corrosion images, the study demonstrated that transfer learning could significantly improve the accuracy and efficiency of corrosion detection, especially in scenarios with limited labeled data. Furthermore, Zuben et al. investigated using Generative Adversarial Networks (GANs) for data augmentation in corrosion detection. GANs were employed to generate synthetic images of corroded surfaces, enriching the training data and improving the robustness and generalization of deep learning models. The results showed that GAN-based data augmentation can significantly enhance the accuracy of corrosion detection.
These studies demonstrate the potential of deep learning for revolutionizing corrosion detection. By leveraging deep learning, it is possible to develop automated and efficient inspection systems that can significantly improve safety, reduce maintenance costs, and extend the lifespan of critical infrastructure. Table 1 provides a summary of the related work, including the methods used, dataset sizes, and results.

Table 1. Summary of the literature review
Study Technique Dataset size Result
Resnet50 Accuracy 4% . Custom CNN Accuracy 98. CycleGAN Accuracy 92. CNN, transfer Accuracy 69% . RL-GAN. RT-GAN CNN model based for Alexnet Accuracy 98. Comparison of FCN. U-Net, and Mask RCNN. DenseNet Accuracy 92. Residual Unet Accuracy 94% . Deep CNN. CycleGAN. U-Net. Accuracy 98. Mask R CNN Precision 85. YoloV3 Precision 82. Generalization & Mean square error 0. MobileNetV1 SSD Accuracy 84. DSNet Precision 94.

Laser cleaning offers a promising approach to removing corrosion from various surfaces. When a laser beam is precisely focused onto the corroded area, the intense energy rapidly heats the corrosion layer and detaches it from the underlying material. This process, often called "laser ablation," can be highly effective in removing a wide range of corrosion types, including rust, oxides, and sulfides, without causing significant damage to the base material. Laser cleaning is particularly advantageous due to its non-contact nature, which eliminates the risk of surface damage from abrasive tools. Additionally, it can be automated for efficient and consistent cleaning, making it suitable for large-scale applications. This study aims to develop a real-time corrosion detection system directly integrated with robotic arms to perform laser cleaning tasks for corrosion removal.

The YOLO (You Only Look Once)
family of object detection models has gained popularity for real-time detection tasks due to its speed, accuracy, and flexibility. These models are well suited to applications such as drone inspections and video-based monitoring of infrastructure. Recent versions, such as YOLOv5 and YOLOv7, have shown impressive accuracy in identifying defects like cracks, rust, and pitting. YOLO's architecture can also be customized for specific corrosion types and environments, making it effective in various conditions. One such improved model identifies defects across multiple categories and scales, offering high accuracy and real-time performance. Another modification of the YOLO model integrates multi-scale exploration blocks and spatial attention mechanisms, enabling precise detection of steel surface defects while meeting real-time processing requirements. The YOLO model is known for its fast detection speed and high accuracy, delivering excellent performance in various applications. However, due to the uneven scale distribution and irregular patterns of surface defects on castings, relying solely on the YOLOv5 algorithm may not yield optimal results.

Research into YOLO's application for corrosion detection has demonstrated that newer versions, like YOLOv7 and YOLOv8, perform better than earlier ones due to improved designs and training methods. Techniques like transfer learning, where pre-trained models are fine-tuned with smaller datasets, have proven effective, reducing the need for large labeled datasets, a common challenge in deep learning. Moreover, combining YOLO models with computer vision techniques, such as noise reduction and contrast enhancement, can improve detection accuracy by enhancing the quality of input images. Because of these advantages, this study adopted the YOLOv7 and YOLOv8 models as the backbones of the proposed system for accurately and efficiently detecting corrosion in real-world scenarios.
This study aims to use the You Only Look Once (YOLO) object detection algorithm, specifically its recent versions YOLOv7 and YOLOv8, to detect corrosion on metal surfaces. By comparing these models, our objective is to determine the most appropriate strategy for precise and efficient identification of corrosion in real-world situations.

IJEEI, Vol. No. September 2025: 758-768

RESEARCH METHOD

YOLOv7 architecture
YOLOv7, launched in July 2022, is an important step forward for the YOLO (You Only Look Once) series of object detection models. It improves upon earlier versions, achieving better accuracy and speed for real-time object detection (Figure 1).

Figure 1. The overview architecture of YOLO-V7 based on .

As shown in Figure 1, YOLOv7's structure comprises three main components: Backbone, Neck, and Head. The Backbone is responsible for extracting features from the input image at different levels. First, the image is processed through ConvModules in the Stem layer, which gradually reduce its size. Then, it passes through four ELAN (Efficient Layer Aggregation Network) blocks, which decrease the image's spatial size but increase the number of feature channels. The Neck takes feature maps from the Backbone and combines them using operations like Concat, Upsample, and ConvModules. This creates feature pyramids that help detect objects of different sizes. YOLOv7 uses SPPFCSP blocks to efficiently expand the receptive field and merge features. Finally, the Head has three branches, each working with one level of the feature pyramid from the Neck. Each branch includes ImplicitA and ImplicitM modules, a ConvModule, and a Loss function. The ImplicitA and ImplicitM modules improve feature learning, while the ConvModule makes the final predictions, including bounding boxes, objectness scores, and class probabilities. These are compared to the ground truth using the Loss function.
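As a rough illustration of the three-branch Head described above, the sketch below computes the shape of the raw prediction tensor produced at each feature-pyramid level. The 640-pixel input size and the single "corrosion" class are assumptions made for the example (the image size and class count used in this study are not recoverable here); the strides of 8, 16, and 32 and the three anchors per level are the usual YOLOv7 defaults. The function names are hypothetical.

```python
from typing import List, Tuple


def head_output_shapes(
    image_size: int = 640,               # assumed input resolution, not taken from the paper
    num_classes: int = 1,                # assumed: a single "corrosion" class
    anchors_per_scale: int = 3,          # usual YOLOv7 default
    strides: Tuple[int, ...] = (8, 16, 32),
) -> List[Tuple[int, int, int, int]]:
    """Return (anchors, grid_h, grid_w, values_per_box) for each Head branch."""
    # Each candidate box carries 4 box coordinates, 1 objectness score,
    # and one probability per class.
    values_per_box = 4 + 1 + num_classes
    shapes = []
    for stride in strides:
        grid = image_size // stride      # the feature map shrinks by the stride
        shapes.append((anchors_per_scale, grid, grid, values_per_box))
    return shapes


def total_candidates(shapes: List[Tuple[int, int, int, int]]) -> int:
    """Number of candidate boxes emitted before any filtering."""
    return sum(anchors * h * w for anchors, h, w, _ in shapes)


if __name__ == "__main__":
    shapes = head_output_shapes()
    for stride, shape in zip((8, 16, 32), shapes):
        print(f"stride {stride:2d}: {shape}")
    print("candidate boxes before filtering:", total_candidates(shapes))
```

The fine grid (stride 8) contributes most of the candidates and is the level that matters for small corrosion patches, while the coarse grid (stride 32) covers large corroded regions.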
YOLOv8 architecture
YOLOv8, released by Ultralytics in January 2023 with the architecture shown in Figure 2, is a major advancement in real-time object detection. It features a CSPDarknet53 backbone with cross-stage partial connections, enhancing information flow and improving detection accuracy. The neck, which extracts features, uses a C2f module instead of the traditional Feature Pyramid Network (FPN). This module combines high-level semantics and low-level spatial details, making it especially effective for detecting small objects.

Figure 2. The YOLOv8 architecture based on .

The head generates predictions for bounding boxes, objectness scores, and class probabilities, combining them for accurate detection and classification. YOLOv8 employs an anchor-free design, simplifying tasks like classification and regression. The output layer uses sigmoid activation for objectness scores and softmax for class probabilities. YOLOv8-Seg, its segmentation variant, incorporates segmentation heads after the C2f-based neck. These heads predict segmentation masks for input images. The model excels in tasks requiring fast and accurate segmentation and object detection.

Key improvements in YOLOv8 include advanced training methods like Rectified Adam (RAdam), which speeds up training and boosts detection performance. The YOLOv8 ecosystem also includes tools for data labeling, model training, and deployment. These features make YOLOv8 versatile and suitable for various real-world applications.

Model evaluation and performance metrics
The Intersection over Union (IoU) is a key metric used in computer vision to assess the performance and accuracy of object detection algorithms. It measures how well a predicted object aligns with the actual object annotation. A higher IoU score implies a more accurate prediction.
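To make the overlap measure concrete, the following minimal sketch computes IoU for two axis-aligned boxes and derives precision, recall, and F1 from true-positive, false-positive, and false-negative counts using the standard definitions. The corner-coordinate box format (x1, y1, x2, y2), the function names, and the example numbers are assumptions chosen for illustration; they are not taken from the authors' code.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(pred: Box, truth: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    # Width and height of the overlapping rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(pred[2], truth[2]) - max(pred[0], truth[0]))
    inter_h = max(0.0, min(pred[3], truth[3]) - max(pred[1], truth[1]))
    intersection = inter_w * inter_h
    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_truth = (truth[2] - truth[0]) * (truth[3] - truth[1])
    union = area_pred + area_truth - intersection
    return intersection / union if union > 0 else 0.0


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Precision, recall, and F1 from detection counts (standard definitions)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    predicted = (0.0, 0.0, 10.0, 10.0)
    annotated = (5.0, 0.0, 15.0, 10.0)
    overlap = iou(predicted, annotated)
    # A prediction is usually counted as a true positive when its IoU with a
    # ground-truth box reaches a chosen threshold (0.5 is assumed here).
    print(f"IoU = {overlap:.3f}, true positive at 0.5: {overlap >= 0.5}")
    print(precision_recall_f1(tp=8, fp=2, fn=2))
```

Raising the IoU threshold makes the true-positive criterion stricter, which is why precision and recall are always reported relative to a stated threshold.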
Computing precision and recall requires deciding whether each prediction is a true positive or a false positive. This is done by comparing the IoU between the prediction and the ground truth label against a threshold, and precision-recall curves can then be generated at different IoU thresholds. The precision and recall (sensitivity) obtained therefore depend on the chosen IoU threshold. By employing the F1 score and mAP when evaluating models, researchers and practitioners can make informed decisions about the efficacy of their models. This approach provides a thorough assessment of the model's performance.

The formulas for computing precision, recall, F1 score, and mAP are given in equations (1)-(5), where the term p(r) represents the precision as a function of recall and the mean Average Precision (mAP) is found by averaging the Average Precision (AP) values across all N categories.

\[ \text{Precision} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Positive}} \tag{1} \]

\[ \text{Recall} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}} \tag{2} \]

\[ F1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \tag{3} \]

\[ AP = \int_{0}^{1} p(r)\, dr \tag{4} \]

\[ mAP = \frac{1}{N} \sum_{i=1}^{N} AP_i \tag{5} \]

Experimental setup
The dataset, which comprises 1,253 images, was assembled from multiple sources on Roboflow. For our experiments, the data was stratified and partitioned into training, validation, and testing subsets with a 70%, 15%, and 15% distribution, respectively. The specific experimental configurations are outlined in Tables 2 and 3.

Table 2. Graph representations
Parameters | YOLOv8 | YOLOv7
Size Image | |
Batch Size | |
Optimizer | AdamW | SGD
Learning Rate | |
Momentum | |
Weight Decay | |
Epochs | |
Patience | 100 epochs | 100 epochs
Unspecified Parameters | Defaults | Defaults

Table 3. Experimental setup
Component | Name or Value
Training Environment | Google Colab
Programming Language | Python
GPU | T4 GPU . GB RAM)
RAM | 7 GB
Disk Space | 200 GB
Framework | PyTorch
Libraries | OpenCV, CUDA

RESULTS AND DISCUSSION

Model performance
Figure 4 presents a comparative performance analysis of the YOLOv7 and YOLOv8 object detection models. Detailed numerical results are summarized in Table 4. The evaluation is based on four key performance metrics: recall, precision, F1-score, and mean Average Precision (mAP).

Figure 4. Performance metric diagrams of YOLOv7 and YOLOv8 models

Table 4. Comparison of YOLOv7 and YOLOv8 models parameters
Models | Recall | Precision | F1 score
YOLOv7 | | |
YOLOv8 | | |

A detailed analysis reveals that YOLOv8 outperforms YOLOv7 in most performance metrics. YOLOv8 demonstrates a higher recall (0.92), indicating its ability to detect more of the relevant objects within the dataset. Furthermore, YOLOv8 exhibits significantly better precision (0.99), resulting in fewer false positives. Consequently, YOLOv8 achieves a higher F1 score (0.96), which balances recall and precision. Although YOLOv8 exhibits a slightly lower mean Average Precision (mAP) of 0.93 compared to YOLOv7's 0.94, the difference is negligible. Based on the data in Table 4, YOLOv8 emerges as the superior model overall, excelling in critical metrics such as recall, precision, and F1 score.

Discussion
In the detection stage, a trained YOLO model analyzes new images to identify corrosion. The image is first preprocessed and then fed into the model, which extracts features and predicts potential corrosion regions. Non-maximum suppression and thresholding refine these predictions by eliminating redundant detections and filtering out those with low confidence scores.
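The refinement step described above can be sketched as confidence thresholding followed by greedy non-maximum suppression. This is a plain-Python illustration under stated assumptions, not the routine built into YOLOv7 or YOLOv8: the box format (x1, y1, x2, y2), the function names, and the threshold values of 0.25 and 0.45 are all hypothetical choices for the example.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def filter_detections(
    boxes: Sequence[Box],
    scores: Sequence[float],
    conf_threshold: float = 0.25,  # assumed value, not the paper's setting
    iou_threshold: float = 0.45,   # assumed value, not the paper's setting
) -> List[int]:
    """Return indices of the detections kept after thresholding and greedy NMS."""
    # Step 1: drop low-confidence detections, then visit the rest best-first.
    candidates = [i for i, s in enumerate(scores) if s >= conf_threshold]
    candidates.sort(key=lambda i: scores[i], reverse=True)

    # Step 2: keep a box only if it does not overlap too much with a
    # higher-scoring box that has already been kept.
    kept: List[int] = []
    for i in candidates:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (0, 0, 10, 10)]
    scores = [0.9, 0.8, 0.7, 0.1]
    print(filter_detections(boxes, scores))
```

In the example, the second box is suppressed because it overlaps the higher-scoring first box, and the fourth is discarded for low confidence, leaving one box per distinct region.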
Finally, bounding boxes are overlaid on the original image, highlighting the identified corrosion regions.

Figure 5 compares the corrosion detection capabilities of YOLOv7 and YOLOv8. Each row presents an original image alongside the detection results from both models, with the detected corrosion areas marked. For both models, the confidence threshold is set at 0.

YOLOv8 consistently demonstrates superior precision in identifying corrosion. Its bounding boxes align more accurately with the actual edges of the corroded areas, minimizing the inclusion of unnecessary regions. This is particularly evident when detecting intricate patterns, such as those found in chain links or peeling paint on metal. In contrast, YOLOv7 produces less accurate bounding boxes. Furthermore, YOLOv8 excels at detecting minor corrosion. For example, in an image with a predominantly red corroded surface, YOLOv8 successfully identifies small corrosion patches, whereas YOLOv7 tends to focus on larger, more conspicuous areas.

As shown in Table 5, YOLOv8 also exhibits faster processing speed, completing the task in 0.2 seconds, nearly twice as fast as YOLOv7, which requires 0. Based on both the processing times and the visual comparison of results, YOLOv8 outperforms YOLOv7 in corrosion detection. This improvement is likely attributable to changes in YOLOv8's design or underlying algorithms. Moreover, YOLOv8's faster processing time makes it a better choice for real-time applications where rapid and accurate corrosion detection is paramount.

Table 5. Comparison of processing time for both models
Model | Processing Time | Number of parameters
YOLOv7 | 7 seconds |
YOLOv8 | 2 seconds |

Figure 5. Corrosion detection using YOLOv7 and YOLOv8 (columns: Original, YOLOv7, YOLOv8)

CONCLUSION

This study demonstrated the superiority of YOLOv8 over YOLOv7 in accurately and efficiently detecting corrosion in real-world images. YOLOv8 outperformed YOLOv7 in key performance metrics such as recall, precision, and F1-score, detected small corrosion areas more reliably, and exhibited faster inference times. The shorter the inference time, the more suitable the model is for integration into robotic arm systems as a visual feedback component. Furthermore, the implementation of corrosion detection systems can transform inspection practices in sectors relying on metallic infrastructure: it enhances safety by enabling proactive maintenance and mitigating catastrophic failures, saves costs by minimizing downtime and reducing manual labor, and increases efficiency by allowing for more frequent and comprehensive assessments.

However, some limitations remain to be addressed in future research, such as incorporating mechanisms to improve the detection of small and subtle corrosion defects and using surrounding environmental and material information to enhance the model's understanding of corrosion patterns. In addition, special attention should be paid to rare and difficult-to-detect corrosion types, for which it is very difficult to obtain data to improve overall detection performance.

Conflicts of Interest
The authors declare no conflict of interest.

REFERENCES