Indonesian Journal of Science & Technology, Volume 10 Issue 2, September 2025, pp. 297-316
Journal homepage: http://ejournal. edu/index. php/ijost/

Enhanced Product Defect Detection in Smart Manufacturing Using ConvNeXt-Stacked Autoencoder Architecture

Ahmed M. Hasan1, Ahmed R. Nasser1, Ahmed Sabah Al-Araji1, Dalia Abdulkareem Shafiq2, Abdulkareem Sh. Mahdi Al-Obaidi2, Huthaifa Al-Khazraji1, Amjad J. Humaidi1,*

1 University of Technology-Iraq, Baghdad, Iraq
2 Taylor's University, Subang Jaya, Malaysia
*Correspondence: E-mail: amjad. humaidi@uotechnology.

ABSTRACT
In the era of Industry 4.0, ensuring product quality through accurate and efficient defect detection has become essential because traditional manual inspection methods are often time-consuming and prone to inconsistency. This study aims to enhance defect detection in smart manufacturing by proposing a hybrid deep learning architecture that combines ConvNeXt and stacked autoencoders. The method leverages ConvNeXt for robust feature extraction and stacked autoencoders for efficient data reconstruction. The model was trained and tested on real-world industrial image datasets involving damaged and intact packaging. Results demonstrate that the proposed method outperforms conventional convolutional neural networks in both detection accuracy and processing efficiency because of its ability to extract deep spatial and semantic features. This research contributes to the advancement of autonomous quality control systems in smart manufacturing. Its impact lies in reducing human dependency, improving inspection accuracy, and fostering the development of intelligent, self-regulating production systems.
© 2025 Tim Pengembang Jurnal UPI

ARTICLE INFO
Article History:
Submitted/Received 25 Jan 2025
First Revised 27 Feb 2025
Accepted 14 Apr 2025
First Available Online 15 Apr 2025
Publication Date 01 Sep 2025

Keywords: ConvNeXt, Convolutional neural network, Deep learning algorithms,
Product defect detection, Quality control, Stacked autoencoders.
DOI: https://doi. org/10. 17509/ijost. p-ISSN 2528-1410 e-ISSN 2527-8045

INTRODUCTION
Advanced technologies such as information technology (IT), Internet of Things (IoT), artificial intelligence (AI), augmented reality (AR), virtual reality (VR), sensor networks, computerized controls, and big data drive manufacturing companies to operate in a way that satisfies the requirements of automatic manufacturing systems. Many studies on these technologies have been reported (Table 1).

Table 1. Previous studies on advanced technology to support Industry 4.0.
Topics: Information Technology (IT), Internet of Things (IoT), Artificial Intelligence (AI), Sensor, Virtual Reality (VR)
Titles:
Theoretical foundations of the creation of a curriculum in higher project IT in education
Improving training of modern leaders utilizing IT in the administration of the higher education system
The paradigm of curriculum differentiation in higher IT education
Smart city and society 5.0: Involvement of information technology in the development of public service systems in Indonesia
Internet of things-based child stunting detection system for supporting sustainable development goals
Predictive deep learning models to identify traumatic brain injuries using MRI data
Enhancing digital literacy and teacher-preneurship through a critical pedagogy-based training platform
Electric vehicle consumption dataset tailored to Malaysian situation and implemented using rapid miner auto-model
Emerging applications of iot, machine learning, virtual reality, augmented reality and artificial intelligent in monitoring systems: a comprehensive review and analysis
Mesh network based on MQTT broker for smart home and iiot factory
Easy-mushroom mobile application using the Internet of Things
Greening the internet of things: A comprehensive review of sustainable IOT solutions from an educational perspective
Internet of things-based child stunting detection system for supporting sustainable development goals
Emerging applications of iot, machine learning, virtual reality, augmented reality and artificial intelligent in monitoring systems: a comprehensive review and analysis
Emerging trends in technology: insights across machine learning, digitalization, and industry applications
Internet of things for monitoring and optimisation of stand-alone systems in rural area: an experimental case
Mobiledfu: classification of diabetic foot ulcer infection on the edge
Chatbot artificial intelligence as educational tools in science and engineering education: A literature review and bibliometric mapping analysis with its advantages and disadvantages
How bibliometric analysis using vosviewer based on artificial intelligence data (using researchrabbit data): Explore research trends in hydrology content
Artificial intelligence (AI)-based learning media: Definition, bibliometric, classification, and issues for enhancing creative thinking in education
Trends in the use of artificial intelligence (AI) technology in increasing physical activity
Bibliometric analysis of research trends in conceptual understanding and sustainability awareness through artificial intelligence (AI) and digital learning media
The future of learning: ethical and philosophical implications of artificial intelligence (AI) integration in education
University students' awareness of, access to, and use of artificial intelligence for learning in Kwara State
Bibliometric analysis on artificial intelligence research in Indonesia vocational education
Primary education undergraduates' competency in the use of artificial intelligence for learning in Kwara State
The digital frontier: AI-enabled transformations in higher education
The role of chatgpt AI in student learning experience
Monitoring of air quality with satellite-based sensor: The case of four towns in Southeast, Nigeria
Vehicle-tracking mobile application without a GPS sensor
Securing wireless sensor networks, types of attacks, and detection/prevention techniques: An educational perspective
Graphene-based electrochemical sensors for heavy metal ions detection: A comprehensive review
Internet of things-based child stunting detection system for supporting sustainable development goals
Emerging trends in technology: insights across machine learning, digitalization, and industry applications
Interference mitigation for dynamic user connectivity using sdn and radio resource management in cell-less networks
Internet of things for monitoring and optimisation of stand-alone systems in rural area: an experimental case
Mobiledfu: classification of diabetic foot ulcer infection on the edge
Development of augmented reality application for exercise to promote health among elderly
Application of augmented reality technology with the fuzzy logic method as an online physical education lecture method in the new normal era
How to create augmented reality (AR) applications using unity and vuforia engine to teach basic algorithm concepts: Step-by-step procedure and bibliometric analysis
Emerging applications of iot, machine learning, virtual reality, augmented reality and artificial intelligent in monitoring systems: a comprehensive review and analysis
Emerging trends in technology: insights across machine learning, digitalization, and industry applications
Let's immerse in metaverse: overview, challenges, and stumeta framework for successful implementation
Development of science virtual laboratory to develop critical thinking skills in elementary schools on the topic of changes in the state of substances
Usability assessment of flipo-ar: navigating learning in a vuca world with augmented reality
The use of virtual reality as a substitute for the pre-school students' field trip activity during the learning from home period
Colleges of education lecturers' attitude towards the use of virtual classrooms for instruction
Students' learning experiences and preference in performing science experiments using hands-on and virtual laboratory
The effectiveness of using a virtual laboratory in distance learning on the measurement materials of the natural sciences of physics for junior high school students
Perception of early childhood education lecturers on the use of virtual
Lecturers perceived proficiency in the use of virtual classrooms for instruction in colleges of education
Development and acceptability of virtual laboratory in learning
Utilization of virtual reality chat as a means of learning communication in the field of education
The effectiveness of using a virtual laboratory in distance learning on the measurement materials of the natural sciences of physics for junior high school students
Developing an identification and testing system for cardiac surgical instrument capabilities
Emerging applications of iot, machine learning, virtual reality, augmented reality and artificial intelligent in monitoring systems: a comprehensive review and analysis
Emerging trends in technology: insights across machine learning, digitalization, and industry applications
Let's immerse in metaverse: overview, challenges, and stumeta framework for successful implementation
Development of science virtual laboratory to develop critical thinking skills in elementary schools on the topic of changes in the state of substances

With the emergence of the Fourth Industrial Revolution (Industry 4.0), the concept of product quality control has gained considerable attention in industrial manufacturing. One important research direction in quality control is improving the detection of product defects during the manufacturing process. Delivering products without defects is always a major concern for top decision-makers in production companies. Faults and imperfections such as internal holes, abrasions, and scratches might occur during production. In such cases, product quality affects production efficiency, since poor-quality products are considered a waste of raw materials, which is costly.

Automatic defect-detection systems that use advanced sensors and artificial intelligence have noticeable advantages over time-consuming traditional manual detection. Besides, human inspectors become exhausted by repetitive tasks, which are typically very labor-intensive. Automatic visual defect detection, which identifies a possibly defective area in a product image and then categorizes the images as defective or defect-free, appears as a solution to this problem.
Within Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL) can be described as algorithms that learn from data, with DL algorithms consisting of multiple processing layers. In other words, the algorithms take input data, train themselves to observe patterns found in the data, and afterward predict the output for a new set of data. In terms of decision support, ML and DL algorithms show great potential for analyzing the large amounts of data that are now readily available, with the aim of improving efficiency. With the significant improvement in the computing capabilities of Graphics Processing Units (GPUs), DL approaches are known to offer remarkable benefits in handling multivariate, high-dimensional data and can therefore extract hidden relationships within the data. As a result, these algorithms have become one of the most dominant tools in several applications and a wide field of research.

Various studies have been performed on product defect detection based on DL algorithms in different industrial sectors such as semiconductor, carpet, steel, and fabric manufacturing. For instance, one study presented an application of a Deep Neural Network (DNN) combined with a high-resolution optical quality camera to boost the precision of industrial visual inspection in the printing process. To show the role of DL algorithms in enabling production companies to move to smart manufacturing, another study proposed a visual quality control system using DL methods. A camera was placed over the production line, and the image of the product was sent to the algorithm to decide whether the product was "okay" or "not okay". To improve the productivity of the textile industry, an EfficientDet-D0 algorithm was proposed for a fabric defect detection system. Besides, a one-class classification (OCC) approach was proposed for carpet defect detection systems. That system is trained only with normal samples, whereas both normal and defective images are used in the test phase. Three models are used. A convolutional autoencoder is first used as a hidden feature extractor. The extracted feature vectors are then fed into a dimensionality reduction step based on principal component analysis (PCA). Finally, the one-class classifier, called Support Vector Data Description (SVDD), is trained on the resulting reduced-dimensional data.

All these studies show the capability of DL algorithms to decide independently on product quality without human participation. However, improving the efficiency and exploring the capability of other DL algorithms for better performance has been an ongoing effort. Owing to its higher accuracy and balanced precision and recall compared with other DL algorithms, this paper proposes an improved model that combines ConvNeXt with Stacked Autoencoders (ConvNeXt-SAEs) for product defect detection systems.

DEFECTS DETECTION SYSTEM
Defect detection has recently become a topic of interest in many fields, among them manufacturing. It is widely used in production companies to ensure the quality of the product. Defects such as internal holes, abrasions, and even scratches might occur in the production of products. A product defect detection system therefore refers to the detection tools that are used to identify the external and/or internal defects of products. Recently, automated defect-detection systems that integrate technologies such as image processing, pattern recognition, and DL algorithms have been broadly applied in place of manual detection to improve the performance of defect detection.
A product defect detection system based on DL algorithms is built upon the data that is collected from the production process. By extracting patterns from the data, DL algorithms can then be used to detect whether the product is "okay" or "not okay". This process serves as decision support for quality-enhancing measures. Using an automated defect detection system can therefore reduce costs and improve the efficiency of the production process. It is also considered one of the requirements for the transformation to the smart manufacturing industry.

One of the important research directions for improving the performance of automated defect detection systems is improving the software and algorithms used in image processing. In this direction, significant interest has grown in proposing, exploring, and evaluating modern DL architectures.

METHODS
To satisfy the requirements of the transition to smart manufacturing systems, we propose a DL algorithm named ConvNeXt Stacked Autoencoder for product defect detection systems. In the literature, the performance of the ConvNeXt and the Stacked Autoencoder networks has been examined separately. In this research, the advantages of the two methods are combined and applied to the defect-detection system.

ConvNeXt
Self-attention-based deep learning methods have many application areas, such as image and language processing and temporal prediction models. These approaches focus on calculating the relationships between different regions of the input. Self-attention-based methods aim to emphasize the features that are significant for the target task. These methods use self-attention matrices as their main components.
Relationships are computed with these matrices and associated with the weights in the feature maps. A convolutional model that adopts design principles from these self-attention-based methods to process images is the ConvNeXt model. ConvNeXt is a pure convolutional neural network (CNN) built entirely from standard convolutional modules combined with design principles of transformer networks (the Vision Transformer and the Swin Transformer). The motivation behind ConvNeXt was to reexamine the design space of CNN architectures and test the limits of what a pure convolutional network can achieve. The researchers progressively modernized a standard ResNet toward the design of a vision Transformer and identified several key components that contribute to the performance difference along the way. The result of this work is a family of pure convolutional models called ConvNeXt. Built entirely from standard convolutional modules, ConvNeXt competes favorably with transformer-based models in COCO detection and ADE20K segmentation while maintaining the simplicity and efficiency of standard convolutional networks.

The ConvNeXt model has approximately the same number of parameters and memory usage as the reference methods, but certain modules have been simplified in this model. The ConvNeXt model, whose layer principle is shown in Figure 1, first processes the input with a convolution layer. Then, layer normalization is applied to the features created by these convolution operations, which brings the convolution layer outputs into a certain range. The last function of the ConvNeXt block is the Gaussian Error Linear Unit (GELU) activation. This activation compresses the outputs and completes the transformation process. The layers and mathematical formulations of ConvNeXt are described as follows.

Figure 1. ConvNeXt block diagram.

Convolution: ConvNeXt relies on the basic convolution operation for image processing, where each output is computed as a weighted sum of neighboring points (see Equation (1)).

\[ Y_{i,j} = \sum_{m}\sum_{n} x_{i+m,\,j+n} \cdot w_{m,n} \tag{1} \]

Layer Normalization (LN): Layer normalization is used to stabilize the training process by normalizing the data across different feature channels (see Equation (2)).

\[ \mathrm{LN}(x) = \gamma \cdot \frac{x - \mu}{\sqrt{\sigma^{2} + \epsilon}} + \beta \tag{2} \]

where \(\mu\) and \(\sigma^{2}\) are the mean and variance computed over the feature channels, \(\epsilon\) is a small constant for numerical stability, and \(\gamma\) and \(\beta\) are values learned during training.

Depthwise Convolution: In this process, a separate convolution is applied to each channel, which reduces the computational cost (see Equation (3)).

\[ y^{c}_{i,j} = \sum_{m}\sum_{n} x^{c}_{i+m,\,j+n} \cdot w^{c}_{m,n} \tag{3} \]

where \(c\) denotes the channel number.

Pointwise Convolution: A 1x1 convolution that combines information from different channels (see Equation (4)).

\[ y_{i,j} = \sum_{c} x^{c}_{i,j} \cdot w^{c} \tag{4} \]

Stochastic Depth: During training, some residual connections are randomly skipped with probability \(p\) (see Equation (5)).

\[ y = \begin{cases} x & \text{with probability } p \\ x + \mathrm{Residual}(x) & \text{with probability } (1 - p) \end{cases} \tag{5} \]

GELU activation function: It is a smooth activation function used as an improvement over ReLU (see Equation (6)).

\[ \mathrm{GELU}(x) = x \cdot \Phi(x) \tag{6} \]

where \(\Phi(x)\) is the cumulative distribution function of the standard Gaussian distribution.

ConvNeXt combines these operations to enhance efficiency and performance by blending traditional convolutional network techniques with approaches inspired by the modern Transformer architecture.

Stacked Autoencoders
Stacked autoencoders (SAEs) are one of the artificial neural network models that have an important place in the deep learning literature. This model is generally used to provide hierarchical feature extraction for the representation of input data.
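The ConvNeXt operations listed in the preceding subsection (layer normalization, pointwise convolution, GELU, and stochastic depth) can be checked numerically on a single pixel's channel vector. The following is a minimal sketch in plain Python, not the authors' implementation: it omits the depthwise convolution and the second pointwise convolution of a full ConvNeXt block, and the function names, the `drop_prob` parameter, and the weight values are illustrative assumptions.

```python
import math
import random


def gelu(x):
    """GELU(x) = x * Phi(x), with Phi the standard normal CDF."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def layer_norm(channels, gamma=1.0, beta=0.0, eps=1e-6):
    """Normalize one pixel's channel vector to zero mean and unit variance."""
    mean = sum(channels) / len(channels)
    var = sum((v - mean) ** 2 for v in channels) / len(channels)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in channels]


def pointwise_conv(channels, weights):
    """1x1 convolution: each output channel is a weighted sum over input channels."""
    return [sum(w * v for w, v in zip(row, channels)) for row in weights]


def simplified_block(channels, weights, drop_prob=0.0, rng=random):
    """Simplified residual block (assumption: LayerNorm -> 1x1 conv -> GELU).

    With probability drop_prob the residual branch is skipped (stochastic
    depth at training time) and only the identity path is returned.
    """
    if rng.random() < drop_prob:
        return list(channels)
    branch = [gelu(v) for v in pointwise_conv(layer_norm(channels), weights)]
    return [x + r for x, r in zip(channels, branch)]


if __name__ == "__main__":
    pixel = [1.0, 2.0, 3.0]            # hypothetical 3-channel pixel
    identity = [[1.0, 0.0, 0.0],        # hypothetical 1x1 conv weights
                [0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0]]
    print(simplified_block(pixel, identity, drop_prob=0.1))
```

The weight matrix must have as many rows as the input has channels so that the residual addition is well defined; a production model would use a deep learning framework rather than Python lists.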
A stacked autoencoder consists of a set of autoencoders, each of which is trained separately and then combined. Autoencoders perform a learned encoding operation to represent the data in a lower-dimensional space and then aim to reconstruct the original data from this representation. A simple autoencoder architecture is shown in Figure 2. An autoencoder neural network essentially consists of two elements: an encoder and a decoder. It transforms the input data into features using the encoder and then reconstructs the input by transforming these features back into raw data through the decoder. Multiple autoencoders are connected to each other to form a stacked autoencoder architecture. A stacked autoencoder allows learning higher-level features, each derived from the previous one. This model generally provides effective results on large and complex data sets and is widely used in feature extraction, dimensionality reduction, and classification operations. A stacked autoencoder structure contains many hyperparameters. These hyperparameters directly affect the performance of the established network. High success in a given architecture can be achieved by optimizing these parameters specifically for the problem.

Figure 2. Auto-encoder architecture.

Stacked autoencoders are a type of neural network used to learn hidden representations of data. They rely on stacking several layers of autoencoders to arrive at more complex representations, with each layer trained sequentially. The basic structure and mathematical equations for the stacked autoencoder network are described as follows:

Encoding stage. The input \(x\) in the first layer is transformed by an activation function to generate the first hidden representation (see Equation (7)).

\[ h^{(1)} = f\left(W^{(1)} x + b^{(1)}\right) \tag{7} \]

where \(W^{(1)}\) is the weight matrix of the first layer, \(b^{(1)}\) is the shift vector (bias), and \(f\) is the activation function, such as a ReLU or Sigmoid function.

Decoding. The hidden representation \(h^{(1)}\) is passed through the next layer to reconstruct the original input (see Equation (8)).

\[ \hat{x} = g\left(W^{(2)} h^{(1)} + b^{(2)}\right) \tag{8} \]

where \(W^{(2)}\) is the weight matrix of the second layer, \(b^{(2)}\) is the bias vector of the second layer, and \(g\) is the activation function used in decoding.

Cost Function. The difference between the original input \(x\) and the reconstructed input \(\hat{x}\) is measured using a cost function, such as the Mean Squared Error (see Equation (9)).

\[ L(x, \hat{x}) = \frac{1}{n} \sum_{i=1}^{n} \left(x_i - \hat{x}_i\right)^{2} \tag{9} \]

where \(n\) is the number of data samples.

Stacked Layer Training. When training a stacked autoencoder, several layers are stacked so that the hidden layer is used as the input to the next layer (see Equation (10)).

\[ h^{(2)} = f\left(W^{(3)} h^{(1)} + b^{(3)}\right) \tag{10} \]

The encoded representation is then decoded again to obtain the reconstructed input, and this process is repeated with each added layer.

Proposed ConvNeXt Stacked Autoencoder
A ConvNeXt Stacked Autoencoder (ConvNeXt-SAE) is an advanced neural network that integrates ConvNeXt, a modernized convolutional architecture, with a stacked autoencoder. ConvNeXt acts as the encoder, leveraging its convolutional layers to extract hierarchical, high-quality features from images and benefiting from modern techniques such as large kernel sizes and layer normalization. The stacked autoencoder architecture allows for unsupervised learning, reducing dimensionality by compressing the input data into a latent space and then reconstructing it. By attaching a classifier after encoding, this architecture excels in image classification, especially where feature extraction and representation learning are critical.
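The encode, decode, and reconstruction-cost steps of a stacked autoencoder can be traced with a few lines of plain Python. This is a minimal sketch under stated assumptions, not the authors' network: the helper names (`dense`, `stacked_encode`, `reconstruct`), the sigmoid encoder activation, the linear decoder, and all weight values are hypothetical, and the error is averaged over the elements of one input vector rather than over a dataset.

```python
import math


def sigmoid(v):
    """Logistic activation used here for the encoder layers (assumption)."""
    return 1.0 / (1.0 + math.exp(-v))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W . inputs + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def stacked_encode(x, layers):
    """Pass x through successive encoder layers; each hidden vector feeds the next."""
    h = x
    for weights, biases in layers:
        h = dense(h, weights, biases, sigmoid)
    return h


def reconstruct(x, enc_layers, dec_weights, dec_biases):
    """Encode x to the latent vector, then decode with a linear output layer."""
    h = stacked_encode(x, enc_layers)
    return dense(h, dec_weights, dec_biases, lambda v: v)


def mse(x, x_hat):
    """Mean squared reconstruction error over the elements of one vector."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


if __name__ == "__main__":
    x = [0.2, 0.8, 0.5, 0.1]                      # hypothetical flattened input
    encoder = [
        ([[0.5, -0.2, 0.1, 0.3],                  # layer 1: 4 -> 2
          [0.1, 0.4, -0.3, 0.2]], [0.0, 0.0]),
        ([[0.7, -0.5]], [0.1]),                   # layer 2: 2 -> 1 (latent)
    ]
    dec_w = [[0.3], [0.9], [0.6], [0.2]]          # decoder: 1 -> 4
    dec_b = [0.0, 0.0, 0.0, 0.0]
    x_hat = reconstruct(x, encoder, dec_w, dec_b)
    print("reconstruction error:", mse(x, x_hat))
```

In the proposed architecture the fully connected encoder would be replaced by ConvNeXt feature extraction, and the latent vector would also feed a classifier; the sketch only shows how each hidden representation becomes the input of the next layer and how the reconstruction cost is measured.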
Combining ConvNeXt with the stacked autoencoder enables improved accuracy and efficiency in image classification tasks compared to traditional CNN autoencoders, owing to ConvNeXt's improved feature extraction. The block diagram of the proposed method is shown in Figure 3. The steps of the ConvNeXt-SAEs algorithm are described in Algorithm 1 in Table 2.

Figure 3. ConvNeXt-SAEs block diagram.

Table 2. Algorithm 1 relating to ConvNeXt-SAEs algorithm steps.

Algorithm 1. ConvNeXt-SAEs algorithm steps

Data preparation:
• Take a set of images \(X = \{x_1, x_2, \ldots, x_n\}\), where each image \(x_i\) is an array of pixels with dimensions \((h, w, c)\), where \(h\) is the height of the image, \(w\) is the width of the image, and \(c\) is the number of channels (for example, 3 for RGB color channels).

Create the model:
• Inputs: The input image \(x\) is represented with dimensions \((h, w, c)\).

Encoder:
• Convolution layer: The convolutional operation is applied to the image using a filter \(W\):
\[ z_i = W_i * x + b_i \]
where \(z_i\) is the output of the convolution, \(x\) is the input image, and \(b_i\) is the bias.
• Activation: A non-linear activation function, such as the ReLU function, is applied to the output of the convolutional layer:
\[ a_i = \max(0, z_i) \]
• Pooling: Next, Max Pooling is used to reduce the dimensionality:
\[ p_i = \max(a_i) \]
The output size is reduced by taking the maximum value in each window.

Bottleneck:
• The data is transferred to a fully connected layer. If the previous output has dimension \(n\) while the fully connected layer has \(m\) nodes, this is mathematically represented as follows:
\[ h = \sigma\left(W_h\, p + b_h\right) \]
where \(h\) is the compact representation of the features, \(W_h\) is the fully connected layer weight, and \(\sigma\) is the activation function, such as ReLU.

Decoder:
• Transposed Convolution Layer: To expand the dimensions again, UpSampling or Transposed Convolution is used. The equation is similar to the convolution layer but in reverse:
\[ z_i = W_i^{T} * h + b_i \]
• Activation: The activation function (such as ReLU) is applied in the same way:
\[ a_i = \max(0, z_i) \]
• Upsampling: To expand the image again, the dimensions are increased using expansion:
\[ u_i = \mathrm{UpSample}(a_i) \]

Output Layer:
• A final convolutional layer generates the reconstructed image \(\hat{x}\):
\[ \hat{x} = \sigma\left(W_o * u + b_o\right) \]

Classification Head:
• Global Average Pooling: The encoder output is passed to the Global Average Pooling layer, which collects information from each channel:
\[ g = \frac{1}{n} \sum_{i=1}^{n} h_i \]
• Classification layer: The result is passed to a fully connected layer with \(k\) nodes (where \(k\) is the number of classes):
\[ \hat{y} = \mathrm{softmax}\left(W_y\, g + b_y\right) \]

Training:
• Use a cross-entropy loss function, which reduces to binary cross-entropy for two classes:
\[ L(\hat{y}, y) = -\sum_{i=1}^{k} y_i \log\left(\hat{y}_i\right) \]

Evaluation:
• Calculate the accuracy of the model by comparing the predicted results \(\hat{y}\) with the actual results \(y\).

RESULTS AND DISCUSSION
The simulations evaluate the proposed ConvNeXt Stacked Autoencoders (ConvNeXt-SAEs) architecture on published data from real manufacturing lines, available on the Kaggle website. The industrial quality control of packages dataset contains information about package dimensions, weights, and defect indicators. It is used to train machine learning models to predict defects, detect anomalies, or improve quality control processes. This dataset is useful in industrial and academic settings to analyze and improve packaging standards through data analysis.
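The training loss and evaluation steps of Algorithm 1, together with the F-score used to compare models, can be illustrated with a short plain-Python sketch. It is an illustration only, not the authors' evaluation code: the function names are hypothetical, and the label encoding (1 for a damaged box, 0 for an intact box) is an assumption made for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax over class scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy -sum_i y_i * log(y_hat_i) for a one-hot target."""
    return -sum(t * math.log(p + eps) for t, p in zip(y_true, y_pred))


def accuracy(labels, preds):
    """Fraction of predictions that match the true labels."""
    return sum(l == p for l, p in zip(labels, preds)) / len(labels)


def f_score(labels, preds, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(l == positive and p == positive for l, p in zip(labels, preds))
    fp = sum(l != positive and p == positive for l, p in zip(labels, preds))
    fn = sum(l == positive and p != positive for l, p in zip(labels, preds))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical logits for one image and its one-hot label (damaged).
    probs = softmax([0.3, 1.9])
    print("loss:", cross_entropy([0.0, 1.0], probs))

    # Hypothetical test labels and predictions (1 = damaged, 0 = intact).
    labels = [1, 0, 1, 1, 0, 0]
    preds = [1, 0, 0, 1, 0, 1]
    print("accuracy:", accuracy(labels, preds))
    print("F-score:", f_score(labels, preds))
```

Reporting the F-score alongside accuracy is useful because it exposes a model that reaches high accuracy while missing one class, which matters when defective items are the class of interest.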
The dataset includes images of packages from a real manufacturing line. The data is categorized into two types: damaged boxes, with 200 image files of only damaged boxes, and intact boxes, with 200 image files of only intact boxes. The data is split into 70% for training and 30% for testing and validation. Before training the proposed method, image augmentation techniques are applied as preprocessing to generate new images from the existing ones, thereby increasing the diversity of the data. Figure 4 illustrates the types and the number of layers used in the proposed ConvNeXt-SAEs architecture for product defect detection. The model hyperparameters are summarized in Table 3. The evaluation results of the proposed ConvNeXt-SAEs model in terms of accuracy and model loss are shown in Figure 5.

To validate the effectiveness of the proposed ConvNeXt-SAEs model for product defect detection, a comparison with the conventional ConvNeXt and the traditional CNN models is performed. Both the ConvNeXt and CNN methods are trained and evaluated on the same dataset used for evaluating the ConvNeXt-SAEs model, with the same parameter settings. The evaluation results of the ConvNeXt and CNN models in terms of accuracy and loss are shown in Figures 6 and 7, respectively. The obtained results for the proposed ConvNeXt-SAEs, ConvNeXt, and CNN models are summarized in Table 4.

Figure 4. The proposed ConvNeXt Stacked Autoencoder architecture.

Table 3. Model hyperparameters.
Parameter            Value
Activation function  GeLU / softmax
Learning rate
Loss function        categorical_crossentropy
Epoch
Batch size
Optimizer            ADAM

Figure 5. ConvNeXt-SAEs model results.

Figure 6. ConvNeXt model results.

Figure 7. CNN model results.

Based on the obtained results, ConvNeXt achieves 21% higher accuracy than the traditional CNN, thanks to architectural improvements inspired by Vision Transformers. These include using larger convolutional kernels to capture more spatial information and replacing the Batch Normalization layer with Layer Normalization to improve model stability. ConvNeXt also simplifies the architecture and reduces complexity, which reduces overfitting, and it increases computational efficiency by using depthwise convolutions. These improvements allow ConvNeXt to outperform CNNs on tasks such as image classification. Moreover, based on the results in Table 4, ConvNeXt-SAEs is more effective than ConvNeXt by 7% because of its ability to better extract complex features from data. By using multiple encoder and decoder layers, the model can build deeper data representations, which helps improve accuracy in image classification and other tasks. The hierarchical structure it provides also allows it to learn more complex patterns. These features make ConvNeXt-SAEs a preferred choice for achieving higher accuracy in tasks that require deep data analysis.

Table 4. Models' results summary.
Methods: ConvNeXt-SAEs, ConvNeXt, CNN
Metric: F-Score

CONCLUSION
Recent advances in DL methods offer ways for industrial companies to meet the requirements of smart manufacturing systems, owing to their ability to classify and analyze the features of the input data. The use of DL algorithms in manufacturing lines has greatly decreased human involvement in production for product defect detection.
A detection approach based on DL algorithms is applicable to diverse objects and many defect types as long as it is trained well on the corresponding data. In this context, this paper investigates the feasibility of using DL algorithms in automated defect detection systems. The ConvNeXt Stacked Autoencoders (ConvNeXt-SAEs) DL algorithm is proposed. ConvNeXt is known as a robust model that balances precision and recall. On the other hand, SAE can perform effective classification by analyzing the features of the input data. The proposed ConvNeXt-SAEs DL algorithm combines the core concepts of these two types of DL architecture into a single hybrid model applied to the defect-detection system. Published data from real manufacturing lines on the Kaggle website is used for evaluation. Simulation results show that the ConvNeXt-SAEs algorithm improves the accuracy by 7% and 40% as compared with the conventional ConvNeXt and CNN algorithms, respectively. Based on the obtained results, the proposed ConvNeXt-SAEs DL algorithm shows significant potential and advantages for use in real-time industrial applications for defect detection.

AUTHORS' NOTE
The authors declare that there is no conflict of interest regarding the publication of this article. The authors confirmed that the paper was free of plagiarism.

REFERENCES