Classification Analysis of Back-Propagation-Optimized CNN Performance in Image Processing

Putrama Alkhairi1, Agus Perdana Windarto2*
1,2 Information Systems, STIKOM Tunas Bangsa, Pematang Siantar, Indonesia
Email: putrama@amiktunasbangsa.id1, agus.perdana@amiktunasbangsa.id2*

Abstract
This study aims to optimize the performance of a Convolutional Neural Network (CNN) on an image classification task by applying data augmentation and fine-tuning techniques to a case study of mammal classification. We took a fairly complex image classification dataset and used a CNN model as the basis for training, evaluating its performance against backpropagation. The CNN VGG16 architecture optimized with the Adam optimizer is compared against backpropagation optimized with SGD. We also review several related studies and basic CNN concepts, such as convolution, pooling, and fully connected layers. The research methodology involves building the dataset with data augmentation techniques, training the model with fine-tuning, and testing model performance with a number of evaluation metrics, including accuracy, precision, and recall. The results indicate that the techniques used succeeded in improving the performance of the CNN model on a complex image classification task, identifying and monitoring animal species more accurately, with an accuracy of 91.18% for the best model. Model accuracy increased by about 2% after applying data augmentation and fine-tuning to the CNN. These results indicate that the techniques applied in this study can be a good alternative for improving CNN performance on image classification tasks.

Keywords: CNN, VGG16 Architecture, Backpropagation, Optimization, Classification, Animals

Introduction
Image processing has become an increasingly important research topic in recent years.
However, the use of convolutional neural network (CNN) models for image classification often faces challenges, such as image blur or intensity variations. A Convolutional Neural Network is a neural network model that is highly effective in image processing and pattern recognition, and CNNs have been widely used for classification and image recognition. A CNN consists of several layers, each of which has a specific function in image processing, such as convolution, pooling, and fully connected layers. For better accuracy, a CNN requires both intensive and extensive computation. For real-time processing, CNNs are usually accelerated by a parallel processor such as a graphics processing unit (GPU). Although GPUs speed up computation, their substantial power consumption limits their applicability to embedded systems. For low-power, high-performance digital systems, field-programmable gate arrays (FPGAs) and application-specific integrated circuits (ASICs) have been used as CNN accelerators in recent years. The increasing use of CNN models in computer vision makes this architecture a natural focus for performance optimization efforts, including optimization techniques involving the CNN's hidden layers. The inference performance of the standard VGG16 CNN architecture leaves significant room for improvement. An artificial neural network is a machine learning approach inspired by the nervous system of the human brain; it involves an input layer, hidden layers, and an output layer, and training adjusts the weights of each neuron according to the chosen optimization method. The performance of a CNN is strongly influenced by the amount and variety of the training data and the optimization used. To test this optimization model, a comparison is needed.
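To make the convolution, pooling, and fully connected building blocks concrete, here is a minimal NumPy sketch of a valid-padding 2-D convolution followed by 2x2 max pooling. This is an illustration only, not the paper's implementation; the toy 6x6 image and averaging kernel are made up for demonstration.

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Valid-padding 2-D convolution (cross-correlation, as in most CNN libraries)."""
    kh, kw = kernel.shape
    oh = (image.shape[0] - kh) // stride + 1
    ow = (image.shape[1] - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * kernel)  # weighted sum over the receptive field
    return out

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling that halves each spatial dimension for size=2."""
    h, w = feature_map.shape
    cropped = feature_map[:h - h % size, :w - w % size]
    return cropped.reshape(h // size, size, w // size, size).max(axis=(1, 3))

img = np.arange(36, dtype=float).reshape(6, 6)  # toy 6x6 "image"
kernel = np.ones((3, 3)) / 9.0                  # 3x3 averaging filter
fmap = conv2d(img, kernel)                      # valid padding: 6x6 -> 4x4
pooled = max_pool(fmap)                         # 2x2 pooling: 4x4 -> 2x2
```

A fully connected layer would then operate on the flattened `pooled` vector; the point of the sketch is how each stage shrinks the spatial dimensions while summarizing local structure.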
In comparison with backpropagation, optimization of CNN performance for image classification is usually done through techniques such as fine-tuning and data augmentation. Fine-tuning utilizes weights previously trained on a different dataset and applies them to a new image classification dataset, while data augmentation enlarges the dataset used to train the model by adding small variations to the training data. The main difference between an optimization technique such as backpropagation and the techniques used to improve CNN performance for image classification lies in their purpose. The purpose of backpropagation is to determine the most optimal weights by minimizing the loss function defined for the model. Meanwhile, the techniques used for CNN performance aim to improve model performance on a specific image classification dataset, using fine-tuning and data augmentation to improve model accuracy and generalization.

Accepted by editor: 30-03-2023 | Final revision: 31-03-2023 | Online publication: 31-02-2023
Putrama Alkhairi, Agus Perdana Windarto
Journal of Systems Engineering and Information Technology (JOSEIT) Vol. 2 No.

Therefore, building the VGG16 CNN architecture for efficient performance and selecting the right optimizer is very important and deserves to be researched and tested. There are various optimizers that can be used, such as Adam and SGD; choosing the right optimizer can improve model performance and speed up training. Previous research by Tian Yuan, Weiqiang Liu et al., entitled "High Performance CNN Accelerators Based on Hardware and Algorithm Co-Optimization", resulted in a hardware-oriented CNN compression strategy in which the deep neural network (DNN) model is divided into no-pruning layers (NP-layers) and pruning layers (P-layers). The NP-layers have a regular weight distribution for parallel computation and high performance.
The P-layers are irregular due to pruning but yield a high compression ratio. Uniform and incremental quantization schemes are used to achieve a trade-off between compression ratio and processing efficiency at a small loss of accuracy. For a hardware accelerator on a single FPGA chip without off-chip memory, a 27.5x compression ratio was achieved with a top-5 accuracy of 0. Meanwhile, research by Desi Irfan et al., "Comparison of SGD, RMSprop, and Adam Optimization in Animal Classification Using CNNs", shows that the CNN method with the Adam optimization function produces the highest accuracy compared to SGD and RMSprop; the model trained using Adam achieved an accuracy of 89.81% on the test set, demonstrating the feasibility of the approach. Based on the analysis of previous research and the author's own analysis, the author built a VGG16-architecture CNN model to streamline performance with the Adam optimizer, to improve model performance and speed up training. This model was then tested against a backpropagation model with the SGD optimization function, with a case study of the classification of mammals.

Method
In this study, a mammal species classification system was designed to determine accuracy using digital image processing methods. Figure 1 shows the block diagram of the system designed in this study.

Figure 1. General System Block Diagram

In general, the system block diagram shown in Figure 1 consists of animal image collection; preprocessing, with resizing and data augmentation stages; training; and testing.

1 Image Collection
The data used in this research is secondary data. According to the Big Indonesian Dictionary, "animal data" refers to data on animals. According to the World Wildlife Fund, rare animals include Sumatran elephants, Asian elephants,
African elephants, blue whales, hawksbill turtles, orangutans, Javan rhinoceros, dugongs, hippos, turtles, polar bears, penguins, and many others. The data is sourced from Kaggle. The author takes data from Kaggle because of the reliability of its tested datasets. Note that using high-resolution training images can also yield better accuracy.

Figure 2. Animal Pictures

2 Initial Observations
To have a baseline against which to compare how well our model performs, we used a pre-trained VGG-19 with the following structure:

DOI: https://doi.org/10.29207/joseit
Putrama Alkhairi, Agus Perdana Windarto
Journal of Systems Engineering and Information Technology (JOSEIT) Vol. 2 No.

Table 1. CNN Model Hyperparameters
Layer                             Output Shape
Input Layer                       3 x 64 x 64
Global Average Pooling 2D layer   64 x 4 x 4
Dropout                           -
Batch size / crop size values: 3, 8, 3, 1, 1

In all the experiments, the only change made was to replace the optimization function among Adam, RMSProp, and SGD with the appropriate learning rate.

3 Preprocessing
After the image collection process, preprocessing is carried out to optimize image quality and to facilitate and enhance the system's ability to identify objects. Preprocessing is done by resizing and augmenting the data.

4 Training
At the training stage, a learning process is carried out on the images, producing a model that is stored for use in the testing process. Modeling is the process of training on image data so that the model learns to identify objects and categorize them according to their class.

Figure 3. Flowchart of the Training Stages, Referring to VGG16

In this study the method refers to the very popular VGG16 architecture and has been tested using 2 convolutional layers, as shown in Figure 3. The input size is 64 x 64 x 3. The first convolution uses 10 kernels with a 3x3 matrix and valid padding.
ReLU activation is used in this convolution process as the non-linearity. The pooling process, specifically max pooling, uses a size of 2x2. The second convolution stage uses 20 kernels with a 5x5 matrix, ReLU activation, and valid padding. Next, flatten converts the output of the convolution process from a matrix into a vector, which is forwarded to the classification process using an MLP (Multilayer Perceptron) with a predetermined number of neurons in the hidden layer. SGD and Adam optimization are applied to the nodes for weight and bias optimization with the default learning rate. The softmax activation function is used according to the number of classes; in this study there were 5 classes, and each image is classified based on the values of the neurons in the hidden layer via softmax.

5 Testing
Figure 4 shows the flowchart of the system testing phase. The testing phase classifies animal species by running the test image data against the trained model stored in the database and comparing the results. 1000 original images were taken, plus 3000 augmented images. Each image is processed by the CNN algorithm to produce system output in the form of animal class information.

Figure 4. Flowchart of the System Testing Stage

6 Weight Optimization
Figure 5 shows the backpropagation process used to update the weights in the neural network.

Figure 5. Weight and Bias Optimization Process

In Figure 5 it can be seen that the gradient of the model parameters is computed iteratively and propagated backward through the network weights, to find new weights that minimize the classification error.
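As a sanity check on the layer dimensions described above, the spatial output sizes of the two convolution/pooling stages can be computed directly. This is a sketch of the shape arithmetic implied by the text (64x64x3 input, 10 3x3 kernels, 2x2 max pooling, 20 5x5 kernels, 2x2 max pooling, flatten, softmax over 5 classes), not the training code itself.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution; padding=0 corresponds to 'valid'."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, pool=2):
    """Spatial output size after non-overlapping pooling (floor division)."""
    return size // pool

s = 64                  # input: 64 x 64 x 3 RGB image
s = conv_out(s, 3)      # conv1: 10 kernels, 3x3, valid padding -> 62 x 62 x 10
s = pool_out(s)         # max pool 2x2                          -> 31 x 31 x 10
s = conv_out(s, 5)      # conv2: 20 kernels, 5x5, valid padding -> 27 x 27 x 20
s = pool_out(s)         # max pool 2x2 (floor)                  -> 13 x 13 x 20
flattened = s * s * 20  # flatten before the MLP classifier
num_classes = 5         # cats, bulls, elephants, horses, sheep
```

The flattened vector of 3380 values feeds the fully connected hidden layer, whose outputs pass through softmax over the 5 classes.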
The first optimizer considered is Adaptive Moment Estimation (Adam), where at every iteration t Adam updates the model parameters as follows:

m_t = beta1 * m_{t-1} + (1 - beta1) * grad_f(w_{t-1}, x_t)
v_t = beta2 * v_{t-1} + (1 - beta2) * (grad_f(w_{t-1}, x_t))^2
w_t = w_{t-1} - alpha * m_t / (sqrt(v_t) + eps)

where m_t and v_t are the first and second moments of the gradient at iteration t, beta1 and beta2 are the moment coefficients, alpha is the learning rate, eps is a small number to prevent division by zero, and grad_f(w_{t-1}, x_t) is the gradient of the objective function f at point w_{t-1} for the training data sample x_t.

Result and Discussion
Experiments were run on the designed system using the CNN method with an architecture referring to VGG16, compared with backpropagation, for image processing to determine the type of animal. The dataset is divided into five classes: cats, bulls, elephants, horses, and sheep. The test system varies the hyperparameters on the data before and after augmentation. The hyperparameters used are the optimizer type (Adam for the CNN model and SGD for the backpropagation model), the batch size (128), the best learning rate for each model (0.001 for the CNN VGG16 architecture and 0.1 for the backpropagation model), and the number of training iterations, in this case controlled by early stopping.

1 Data Testing and Analysis
The first data tested is the original dataset of 9875 images. The training process used 8138 images, while the testing process used 20% of the total data, or 1737 images. This test uses the optimizers Adam for the CNN with a learning rate of 0.001 and SGD for backpropagation with a learning rate of 0.1, with a batch size of 128. For epochs, early stopping is used: when the model can no longer improve accuracy or reduce loss, the process stops automatically.
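The Adam update rule given in the Method section can be sketched in NumPy. This is an illustrative implementation, not the paper's code; it adds the standard bias-correction terms (m_hat, v_hat) that practical Adam implementations use, and minimizes a toy quadratic rather than a network loss.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update following the moment equations above, with bias correction."""
    m = beta1 * m + (1 - beta1) * grad          # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad**2       # second moment (mean of squares)
    m_hat = m / (1 - beta1**t)                  # bias-corrected first moment
    v_hat = v / (1 - beta2**t)                  # bias-corrected second moment
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps) # parameter update
    return w, m, v

# Toy objective f(w) = w^2, gradient 2w, starting from w = 5.
w, m, v = 5.0, 0.0, 0.0
for t in range(1, 2001):
    w, m, v = adam_step(w, 2 * w, m, v, t)
# After 2000 steps w has moved close to the minimum at 0.
```

The per-parameter scaling by sqrt(v_hat) is what makes Adam an adaptive learning rate method, in contrast to plain SGD, which applies the same learning rate to every weight.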
Figure 6. Graphs of CNN and Backpropagation Training and Testing

In Figure 6 it can be seen that the training and testing curves with a learning rate of 0.001 for the CNN and 0.1 for backpropagation are good: the red graph shows the training process and the blue graph shows the testing process, and the two track each other consistently, which indicates that the model is not overfitting.

Confusion Matrix
Figure 7 shows plots of the confusion matrix results from the training process using the two optimizations. On the left side of each plot are the true labels of the 5 animal classes, the actual animal classes, and at the bottom are the predicted labels, the predictions from the training process.

Figure 7. Confusion Matrix: Adam for CNN and SGD for Backpropagation

Visualizing the Data
To see the prediction visualization, results from the built model are displayed as images. As explained previously, 128 images were taken; the display is set to dimensions of 5 rows and 8 columns, showing only 40 images.

Figure 8. Predicted Image Visualization Using CNN with the Adam Optimizer

From Figure 8 it can be seen that the actual label is compared with the predicted label: if the two show the same class, the image is marked green, in other words a true positive; if they show different classes, it is marked red as a false positive. Prediction errors can occur due to image factors such as unclear test images, background influences such as props, similarity of colors and shapes, and so on.
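The confusion-matrix layout described for Figure 7, and the precision/recall/F1 metrics this paper reports, can be reproduced in outline as follows. The label vectors and the TP/FP/FN counts here are hypothetical illustrations, not the experiment's data; the counts 18/1/2 were chosen only because they happen to reproduce the CNN (Adam) values of precision 0.9474, recall 0.9000, and F1 0.9231 reported in Table 2.

```python
import numpy as np

classes = ["cat", "bull", "elephant", "horse", "sheep"]

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows = true labels, columns = predicted labels (the Figure 7 layout)."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def precision_recall_f1(tp, fp, fn):
    """Standard metrics from raw true-positive / false-positive / false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Toy example: 7 hypothetical test images, 2 of them misclassified.
y_true = [0, 0, 1, 2, 3, 4, 4]
y_pred = [0, 2, 1, 2, 3, 4, 0]
cm = confusion_matrix(y_true, y_pred, len(classes))
accuracy = np.trace(cm) / cm.sum()  # correct predictions lie on the diagonal

# Hypothetical counts that reproduce the CNN (Adam) row of Table 2.
p, r, f1 = precision_recall_f1(tp=18, fp=1, fn=2)
```

Reading accuracy off the diagonal of the confusion matrix, and precision/recall off its columns and rows, is exactly how the per-model metrics in the comparison table are derived.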
For example, in the first row, fourth column, there is an image of a tiger with its head cut off, which the model predicted as an elephant. This can be overcome by adding tiger images without a visible head to the training data. Improvements in accuracy and learning speed can be seen in Table 2, an apples-to-apples comparison using the same dataset. Table 2 below shows the test results of the two models.

Table 2. Comparison of the Two Adaptive Learning Rate Models
Metric        CNN (Adam)    Backpropagation (SGD)
Test Score    0.6885        0.6160
Test Cost     0.8803        1.0946
Best Epoch    -             -
Precision     0.9474        1.0000
Recall        0.9000        0.6118
F1 Score      0.9231        0.5838
Accuracy      91.18%        88.89%

In Table 2, the results of testing on the animal image dataset show an increase in accuracy for the VGG16 CNN architecture with Adam optimization, a significant increase of around 2%. For clearer results, see the graph in Figure 9.

Figure 9. Comparison of the Optimized CNN Model and Standard Backpropagation

In Figure 9 it can be seen that the problems of backpropagation's slow learning and very long convergence time, visible in the test cost, and the local minimum problem that often leaves artificial neural networks (ANNs) stuck, have started to be resolved, and the accuracy value is improved so that this model can be applied. In the model proposed by the author, the accuracy value is improved by using Adam optimization with a learning rate of 0.001, with the highest accuracy value of 91.18%.

Conclusion
In this study, we can recommend a solution model to help future researchers identify and monitor animal species more accurately, with an accuracy of 91.18% for the best model. The solution offered can assist in monitoring animal species, especially protected ones, more cheaply and quickly in the future. This research also proves that, with an appropriate learning rate for each estimation function, Adam is superior to SGD. A suggestion for further research is to compare the effect of the activation function, using the same dataset and optimizers as in this study, to further improve the performance of the model.

Acknowledgements
Thank you to Mr. Agus Perdana Windarto for his support in carrying out this research and for his guidance and continued support.

References