JOIV : Int. Inform. Visualization, 9. - January 2025 23-28
INTERNATIONAL JOURNAL
ON INFORMATICS VISUALIZATION
journal homepage : w. org/index. php/joiv

Classification of Diabetic Retinopathy Based on Fundus Image Using InceptionV3

Agus Eko Minarno a,*, Andhika Dwija Bagaskara a, Fitri Bimantoro b, Wildan Suharso a

a Department of Information Technology, Universitas Muhammadiyah Malang, Malang, Indonesia
b Informatics Engineering Department, University of Mataram, Mataram, Indonesia

Corresponding author: *aguseko@umm.
Abstract: Diabetic Retinopathy (DR) is a progressive eye condition that can lead to blindness, particularly affecting individuals with diabetes. It is commonly diagnosed through the examination of digital retinal images, with fundus photography being recognized as a reliable method for identifying abnormalities in the retina of diabetic patients.
However, manual diagnosis based on these images is time-consuming and labor-intensive, necessitating the development of automated systems to enhance both accuracy and efficiency.
Recent advancements in machine learning, particularly image classification systems, provide a promising avenue for streamlining the diagnostic process.
This study aims to classify DR using Convolutional Neural Networks (CNN), explicitly employing the InceptionV3 architecture to optimize performance.
This research also explores the impact of different preprocessing and data augmentation techniques on classification accuracy, focusing on the APTOS 2019 Blindness Detection dataset.
Data preprocessing and augmentation are crucial steps in deep learning to enhance model generalization and mitigate overfitting.
The study uses preprocessing and data augmentation to train the InceptionV3 model.
Results indicate that the model achieves 86.5% accuracy on training data and 82. accuracy on test data, significantly improving performance compared to models trained without data augmentation.
Additionally, the findings demonstrate that the absence of data augmentation leads to overfitting, as evidenced by performance graphs that show a marked decline in test accuracy relative to training accuracy.
This research highlights the importance of tailored preprocessing and augmentation techniques in improving CNN models' robustness and predictive capability for DR detection.
Keywords: Diabetic retinopathy; fundus image; convolutional neural network; InceptionV3; data augmentation.
Manuscript received 27 Sep.; revised 19 Oct.; accepted 14 Dec. Date of publication 31 Jan.
International Journal on Informatics Visualization is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.
Moderate DR occurs due to swelling and distortion of the blood vessels connected to the retina.
Severe DR occurs because there are too many blockages in the blood vessels connected to the retina.
Proliferative DR occurs when blockage of small retinal blood vessels causes new vessels to form, which can damage the retina.
Fundus image classification of Diabetic Retinopathy using machine learning and deep learning has been studied before.
In the study entitled "Detection of Diabetic Retinopathy using Convolutional Neural Networks", the researcher proposes a digital diabetic retinopathy detection system using Convolutional Neural Networks.
The dataset used is derived from DR1 and MESSIDOR.
The resulting accuracy of the study was 74.04%.
In the research entitled "Diabetic Retinopathy Detection Using Transfer Learning and Deep Learning," the researcher proposes a pre-trained Inception-ResNet-V2 model.
The dataset used in this study was from MESSIDOR-1 and APTOS 2019 Blindness Detection.
The resulting accuracy of

INTRODUCTION
Diabetes is a metabolic disease that causes high blood sugar levels.
Based on WHO data, in 2015, there were 415 million people with diabetes worldwide, and that number is predicted to increase to 642 million people by 2040.
Diabetic Retinopathy is an eye disorder that can cause blindness in diabetics.
This is because some injuries to the eye can damage the retina.
Diabetic Retinopathy can be diagnosed using digital retinal images.
Fundus examination is believed to be an effective method for detecting abnormal signs in the eyes of diabetic patients.
Based on its level, Diabetic Retinopathy is divided into five classes, namely No DR, Mild DR, Moderate DR (Medium), Severe DR, and Proliferative DR.
Mild DR is the initial stage of Diabetic Retinopathy and is caused by minimal swelling of the retinal blood vessels.
Moderate construction, and evaluation.
The dataset employed in this study was sourced from the publicly available APTOS 2019 Blindness Detection dataset on Kaggle, which contains 3,662 fundus images categorized into five classes: No Diabetic Retinopathy (No DR), Mild DR, Moderate DR, Severe DR, and Proliferative DR.
Each class varied in the number of images, with the majority belonging to the No DR category.
To ensure a balanced evaluation of the model, the dataset was divided into 85% for training and 15% for testing using the train_test_split function from sklearn, resulting in 3,112 training images and 550 testing images.
The details of the design are presented in Fig. 1.
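The split arithmetic above can be checked with a short sketch. The study itself used train_test_split from sklearn; the standard-library version below is only an illustration. The function name split_indices, the seed value, and the choice to round the test size up are assumptions made here so that the counts match the reported 3,112 and 550 images.

```python
import math
import random


def split_indices(n_samples, test_fraction, seed=42):
    """Shuffle indices 0..n_samples-1 and split them into train and test lists.

    The test size is rounded up (an assumption made for this sketch).
    """
    n_test = math.ceil(n_samples * test_fraction)
    n_train = n_samples - n_test
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # hypothetical fixed seed for repeatability
    return indices[:n_train], indices[n_train:]


# 3,662 images with a 15% test share, as described in the text.
train_idx, test_idx = split_indices(3662, 0.15)
print(len(train_idx), len(test_idx))
```

Because the class distribution is imbalanced, a stratified split would be a natural variation, but the text does not say whether stratification was used.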
The resulting accuracy for each dataset is 72.33% for the MESSIDOR-1 dataset and 18% for the APTOS 2019 Blindness Detection dataset.
In the study entitled "Automated Diabetic Retinopathy Detection and Classification Using ImageNet Convolutional Neural Network Using Fundus Images", the researchers proposed an automated system to detect and classify diabetic retinopathy using the ImageNet model to achieve higher accuracy. The dataset used is sourced from the Kaggle website and contains 35,126 images.
This research applies the InceptionV3 architecture to extract in-depth information from multi-layer networks.
The resulting accuracy of this research was 8% on train data and 98.5% on validation data.
In the research entitled "Diabetic Retinopathy Grading System Based on Transfer Learning," the researcher proposes a deep learning method with a pre-trained EfficientNet model.
The dataset is sourced from the Indian Diabetic Retinopathy Image Dataset (IDRiD).
The study resulted in an accuracy of 86%.
In the research entitled "Diabetic Retinopathy Identification Using AutoML", the researcher used the Recurrent Neural Network (RNN) method to classify Diabetic Retinopathy.
The dataset used is sourced from Kaggle and was released for the APTOS 2019 Blindness Detection competition. The data used are 3662 fundus images divided into five classes.
In that study, the preprocessing technique resizes the fundus image to 224x224, converts the image to grayscale, applies Gaussian blur with a standard deviation of 10, and augments the data using an Image Data Generator.
That study resulted in an accuracy of 85% for training data and 82% for testing data.
Diabetic retinopathy can be diagnosed manually by an expert. However, the diagnostic process takes quite a long time due to the limited number of experts.
Many sources provide fundus images, which researchers use to create an image classification system.
This research proposes the classification of diabetic retinopathy using the Convolutional Neural Network method with the InceptionV3 model architecture.
The InceptionV3 model was chosen because its computation is more efficient.
The proposed model combines several preprocessing techniques with hyperparameter tuning.
The preprocessing techniques applied are those used in previous research and by Ben Graham.
These preprocessing techniques aim to remove image variations caused by lighting conditions, exposure, and similar factors in order to improve image quality.
Section II, Materials and Method, outlines the foundational steps and methodologies employed in this study, providing a detailed account of the techniques and processes used to classify diabetic retinopathy.
Section III, Results and Discussion, presents the outcomes of the implemented system and provides an in-depth analysis of its performance across various evaluation metrics.
Finally, Section IV presents the conclusions obtained from implementing the system.

Fig. 1 Research Design

Data preprocessing played a pivotal role in standardizing the quality and format of the fundus images, as medical images often exhibit inconsistencies in focus, lighting, and background noise.
Two preprocessing techniques were applied in this study.
The first, derived from the work of V. Harikrishnan, involved resizing images to 224x224 pixels, converting them to grayscale, and applying Gaussian blur with a standard deviation of 10.
The second, based on Ben Graham's methodology, included resizing, reducing the image's average color to 50% gray, applying Gaussian blur, cropping circular regions, and removing the black background. These techniques were selected to enhance the input data's quality and minimize variability caused by external factors such as lighting and exposure.
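As a rough illustration of the two pipelines, the sketch below implements grayscale conversion, a separable Gaussian blur, and a "subtract the local average and map it to 50% gray" step on small nested-list images. It is not the authors' code: resizing and circular cropping are omitted, the luminance weights are the common ITU-R BT.601 values, the kernel radius is a hypothetical truncation parameter, and the gain of 4 with an offset of 128 follows commonly circulated versions of Graham's method rather than anything stated in this paper.

```python
import math


def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to rows of luminance values."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def gaussian_kernel(sigma, radius):
    """Normalized 1D Gaussian weights over [-radius, radius]."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Blur each row with the 1D kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    blurred = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xi]
            new_row.append(acc)
        blurred.append(new_row)
    return blurred


def transpose(image):
    return [list(col) for col in zip(*image)]


def gaussian_blur(image, sigma, radius):
    """Separable Gaussian blur: horizontal pass, then vertical pass."""
    kernel = gaussian_kernel(sigma, radius)
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))


def subtract_local_average(image, sigma, radius, gain=4.0, offset=128.0):
    """Remove the blurred (local average) image and recenter on 50% gray."""
    blurred = gaussian_blur(image, sigma, radius)
    return [[min(255.0, max(0.0, gain * (p - b) + offset))
             for p, b in zip(row, blurred_row)]
            for row, blurred_row in zip(image, blurred)]


# A flat gray patch stays flat after blurring and maps to 50% gray (128).
flat = [[(100, 100, 100)] * 4 for _ in range(4)]
gray = to_grayscale(flat)
smoothed = gaussian_blur(gray, sigma=10, radius=3)
recentered = subtract_local_average(gray, sigma=10, radius=3)
```

In practice an image library would handle resizing and blurring; the pure-Python version is only meant to make each step explicit.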
Data augmentation techniques were employed to further address the risk of overfitting due to an imbalanced dataset.
Augmentation included transformations such as rotation by 90 degrees, zooming, shifting, and flipping, which were applied to increase the diversity of the training data. This step was crucial in improving the model's ability to generalize across diverse samples and reducing sensitivity to variations in orientation and illumination.
Details of these augmentation processes, including specific parameter values, are presented in Table II.
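A minimal sketch of three of the listed transformations (rotation by 90 degrees, horizontal flipping, and shifting) on a small nested-list image is given below. Zooming is omitted, and the function names, the shift amount, and the 50% flip probability are illustrative assumptions rather than the parameter values from Table II.

```python
import random


def rotate_90(image):
    """Rotate a 2D grid 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def shift_right(image, pixels, fill=0):
    """Shift columns right by `pixels` (0 <= pixels <= width), padding with `fill`."""
    return [[fill] * pixels + row[:len(row) - pixels] for row in image]


def augment(image, rng):
    """Apply a random number of 90-degree rotations and an optional flip."""
    out = image
    for _ in range(rng.randrange(4)):
        out = rotate_90(out)
    if rng.random() < 0.5:  # hypothetical flip probability
        out = flip_horizontal(out)
    return out


grid = [[1, 2], [3, 4]]
augmented = augment(grid, random.Random(0))
```

Rotations and flips only rearrange pixels, so the label of a fundus image is unchanged while the model sees it in a new orientation.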
Hyperparameter tuning was also conducted to optimize the model's performance.
This process involved testing various optimizers, including Adam, Adamax, SGD, and RMSProp, along with different dense layer sizes and dropout rates.
The optimal hyperparameters were selected based on their ability to maximize performance on the training data while minimizing overfitting, as indicated by the evaluation metrics.
In this study, the best-performing optimizer was SGD, with a momentum value of 0.9.
II.
MATERIALS AND METHOD
The research design comprised multiple stages, beginning with dataset collection and proceeding through data preprocessing, augmentation, hyperparameter tuning, model construction, and evaluation.
The proposed model architecture was based on InceptionV3, a convolutional neural network designed for efficient computation and robust feature extraction.
The model input layer was configured for 224x224 images, followed by a Global Average Pooling layer to address overfitting. Fully connected layers, including dense and dropout layers, were incorporated to enhance model flexibility and prevent overfitting.
The output layer employed a SoftMax activation function to classify the input images into five classes.
The collected data then entered the preprocessing stage.
The following steps are carried out for each preprocessing technique. The technique of Harikrishnan resizes the image to 224x224 pixels, converts the original image to grayscale, and then applies Gaussian blur to the grayscale image with a standard deviation of 10.
The preprocessing technique used by Ben Graham consists of resizing, reducing the average color of the image, which is then mapped to 50% gray, applying the Gaussian blur technique, cropping circles, and cutting the black part around the fundus image.
Data Augmentation
Data augmentation is done to minimize overfitting on a small amount of data by increasing the number of training data samples.
Combined with data preprocessing, it can make the model more robust to image variations, insufficient lighting, and varied orientations.
Table II contains details of the stages of the data augmentation.

Dataset
The dataset used is secondary data obtained from the open-source Kaggle website with the title APTOS 2019 Blindness Detection.
The dataset contains 3662 fundus images, divided into five classes.
Table I shows the distribution of the amount of data per class.
TABLE I
RESULTS OF ALL PROPOSED SCENARIOS
Class               Amount of data
No DR
Mild DR
Moderate DR
Severe DR
Proliferative DR
Total               3662 images

Fig. 2 shows a sample of data from each class in the APTOS 2019 Blindness Detection dataset.

TABLE II
DETAIL OF DATA AUGMENTATION
Augmentation        Value
Rotation Range
Zoom Range
Shift Range
Flipping            True

Hyperparameter Tuning
Hyperparameter tuning is a technique used to determine the optimal combination of parameters for the built Convolutional Neural Network model.
The use of the hyperparameter tuning technique has a significant impact on the performance of the built model.
In this study, hyperparameter tuning is carried out to find an optimizer that matches the dataset used.
The optimizers to be tested are Adam, Adamax, SGD, and RMSProp.
Table III details the parameters used in the hyperparameter tuning process.
Fig. 2 Sample data for each class

Split Data
The dataset used in this study amounted to 3662 images.
The dataset is divided into two parts: 85% train data and 15% test data.
After splitting, the train data contains 3112 fundus images and the test data contains 550 fundus images.
The dataset is split using the train_test_split function from sklearn.
The train data is used to train the model to recognize the prepared data.
The test data is used to evaluate the trained model.
TABLE III
DETAIL OF THE PARAMETERS USED IN THE HYPERPARAMETER TUNING
Parameter: Dense Layer, Dropout, Optimizer
Comparison Value: 25, 0.; Adam, Adamax, SGD, RMSProp

Sharma previously proposed the optimizer parameter, comparing several optimizers used to train a model on a diabetic retinopathy dataset.
Selecting the right optimizer makes the built model better suited to the data. In addition to finding the right optimizer, this research also looks for the best dense and dropout values in the fully connected layer.
The dense layer and dropout values used were selected based on research conducted by Harikrishnan.
Preprocessing Data
Data preprocessing is a useful step for improving the quality of the images used, because good image quality can improve model performance.
In addition, data preprocessing is intended to obtain a standard image format by processing images in datasets that contain a lot of noise, such as images that are out of focus, have inconsistent exposure, are overexposed, or contain a black background.
In this study, two different preprocessing techniques were applied.
The first preprocessing technique was applied in a previous study conducted by Harikrishnan.
The second is the preprocessing technique used by Graham.
Build Model
A convolutional neural network (CNN) is a deep learning algorithm that accepts image input.
In addition, a CNN can determine aspects or objects in an image that the model can use to recognize images.
A convolutional

These results suggest that combining robust preprocessing techniques with data augmentation significantly enhances the model's capacity to generalize effectively and mitigate overfitting. Conversely, Scenario 2, which utilized Ben Graham's preprocessing method without data augmentation, exhibited a reduced test accuracy of 76.
Although the preprocessing technique effectively addressed image quality issues such as lighting and exposure inconsistencies, the lack of data augmentation resulted in a less robust model.
This limitation was particularly evident in the model's difficulty in accurately classifying minority classes such as Severe DR and Proliferative DR, as reflected in the lower precision and recall values compared to Scenario 1.
In Scenario 3, where no data augmentation was applied, the results revealed significant overfitting, as evidenced by the stark contrast between training accuracy .
49%) and test accuracy .
27%).
The high test loss value of 0.895 further corroborates this finding, indicating the model's diminished generalization capability when exposed to unseen data.
The weighted F1-score of 0.
in this scenario underscores the importance of data augmentation in enhancing model robustness.
At this stage, the InceptionV3 architectural model is implemented and tested on the diabetic retinopathy data by applying the proposed test scenarios. Each scenario is tested using the optimizer, dense layer, and dropout layer parameters obtained from the hyperparameter tuning process.
In the model testing process, the number of epochs is 75, the batch size is 12, the learning rate is 0.001, and the decay is the learning rate divided by the batch size.
The model testing results of each scenario are compared in terms of accuracy, loss, precision, recall, and F1-score.
The scenarios carried out are described in Table IV.
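For reference, the weighted-average metrics named above can be computed as in the following sketch, where each class's precision, recall, and F1-score is weighted by its share of the true labels. The function name and the toy labels are illustrative only and are not taken from the study's results.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / total
    precision = recall = f1 = 0.0
    for label in sorted(support):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        prec = tp / predicted if predicted else 0.0
        rec = tp / support[label]
        f = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        weight = support[label] / total
        precision += weight * prec
        recall += weight * rec
        f1 += weight * f
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy example with three classes; class 2 is a never-predicted minority class.
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 0]
scores = weighted_metrics(y_true, y_pred)
```

Note that the support-weighted recall always equals the overall accuracy, so a gap between weighted precision and recall mainly reflects how errors are distributed across majority and minority classes.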
Compared with previous studies, as summarized in Table VI, the proposed InceptionV3 model outperformed other methods, such as RNN-based models, which achieved a test accuracy of 82% on the same dataset.
While Scenario 1 slightly exceeded this benchmark, more advanced architectures like ImageNet-based models have demonstrated higher validation accuracies in prior research.
This observation suggests that further improvements could be achieved by integrating more sophisticated preprocessing techniques and optimization strategies.
The comparison between scenarios also underscores the critical role of preprocessing and augmentation strategies in addressing challenges related to class imbalance and improving overall model performance.
The results further emphasized key accuracy, loss, precision, recall, and F1-score trends across all scenarios.
Scenario 1 demonstrated a relatively balanced relationship between training and testing metrics, reflecting effective mitigation of overfitting.
In contrast, Scenario 3 revealed a significant divergence between these metrics, highlighting the adverse effects of excluding data augmentation.
The weighted average precision and recall values across scenarios also illuminated the challenges of correctly identifying minority classes, particularly in scenarios with limited or no data augmentation. These findings underscore the importance of a balanced and comprehensive approach to preprocessing and data augmentation in medical image classification tasks.
A Convolutional Neural Network consists of three types of layers: the convolution layer, the pooling layer, and the fully connected layer.
The convolution layer is the first layer in the Convolutional Neural Network architecture.
This layer extracts the features in the image.
The pooling layer is the second layer in the Convolutional Neural Network architecture. This layer reduces the number of parameters and computation time in the Convolutional Neural Network architecture.
The fully connected layer is the last layer in the Convolutional Neural Network architecture; it connects to the output layer, which acts as a classifier.
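The first two layer types can be illustrated with a tiny pure-Python sketch: a "valid" convolution (implemented as cross-correlation, as is usual in deep learning) followed by 2x2 max pooling. The edge-detecting kernel and the 5x5 input are made-up examples, not weights or data from the proposed model.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]


def max_pool_2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 block."""
    return [[max(feature_map[y][x], feature_map[y][x + 1],
                 feature_map[y + 1][x], feature_map[y + 1][x + 1])
             for x in range(0, len(feature_map[0]) - 1, 2)]
            for y in range(0, len(feature_map) - 1, 2)]


# A 5x5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical kernel that responds to left-to-right intensity increases.
kernel = [[-1, 1],
          [-1, 1]]

features = conv2d_valid(image, kernel)  # 4x4 feature map
pooled = max_pool_2x2(features)         # 2x2 map after pooling
```

The convolution highlights where the edge lies, and pooling keeps that response while quartering the number of values passed on to later layers.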
The proposed InceptionV3 architecture model can be seen in Fig. 3.

Fig. 3 Model architectural design

Inception is a development of the Convolutional Neural Network architecture, first introduced by Szegedy in the research entitled "Going Deeper with Convolutions".
In this study, the InceptionV3 model used is a development of the InceptionV1 and InceptionV2 models.
This architectural model uses an input layer with a size of 224x224, which has been adapted to the InceptionV3 model.
The second layer is the InceptionV3 model with 'Imagenet' weights.
The next layer is a Global Average Pooling layer, which aims to overcome the problem of overfitting.
Furthermore, the fully connected part has dense, dropout, batch normalization, and output layers for five classification classes using SoftMax. Hyperparameter tuning is implemented to provide the best parameters for dense layers, dropout layers, and optimizers. In this CNN model build stage, the values of the dense layers and dropout are determined from the hyperparameter tuning results, depending on each layer's location.
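The classification head described above can be sketched without any deep learning framework: Global Average Pooling reduces each feature map to one number, a dense layer forms five class scores, and SoftMax converts them to probabilities. The feature maps, weights, and biases below are invented for illustration; dropout and batch normalization are omitted, and the real model has far more channels.

```python
import math


def global_average_pool(feature_maps):
    """Reduce each 2D feature map (one per channel) to its mean value."""
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in feature_maps]


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Two hypothetical 2x2 feature maps standing in for the backbone output.
feature_maps = [[[1, 3], [5, 7]],
                [[0, 0], [0, 2]]]
pooled = global_average_pool(feature_maps)

# Hypothetical weights for five output classes (No DR ... Proliferative DR).
weights = [[0.1, 0.0], [0.2, 0.0], [0.3, 0.0], [0.0, 1.0], [0.5, 0.0]]
biases = [0.0] * 5
probabilities = softmax(dense(pooled, weights, biases))
predicted_class = probabilities.index(max(probabilities))
```

Because Global Average Pooling has no trainable parameters, it shrinks the head considerably compared with flattening, which is one reason it is associated with reduced overfitting.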
RESULTS AND DISCUSSION
The results obtained from implementing the InceptionV3 architecture demonstrated varying performance levels across the three tested scenarios, highlighting the impact of preprocessing and data augmentation techniques on model generalization and robustness.
In Scenario 1, which employed the preprocessing method developed by V. Harikrishnan in combination with data augmentation, the model achieved the highest test accuracy of 82.
This scenario also yielded a weighted average F1-score of 0.825, indicating a balanced trade-off between precision and recall across all classes.
After all scenarios have been evaluated, their results are recorded.
This information is used to compare results at the next stage.
A recap of the test scenario results can be seen in Table V.
In this study, the tested model was also evaluated against previous studies.
A detailed comparison of the test model's results with those of previous studies can be seen in Table VI.

TABLE IV
DESCRIPTION OF TESTING SCENARIO
Scenario      Description
Scenario 1    The model was trained using data from V. Harikrishnan's preprocessing technique.
Scenario 2    The model was trained using data from Ben Graham's preprocessing technique.
Scenario 3    The best results from model training against the proposed preprocessing technique, retrained without applying data augmentation.

TABLE V
TESTING SCENARIO RESULTS
Scenario | Train Acc | Train Loss | Test Acc | Test Loss | Weighted Avg. Precision | Weighted Avg. Recall | Weighted Avg. F1-Score

TABLE VI
TESTING SCENARIO RESULTS
Method | Train Acc | Train Loss | Test Acc | Test Loss
RNN
Scenario 1, Scenario 2, Scenario 3 (Proposed)

IV.
CONCLUSION
This study proposes the InceptionV3 architectural model combined with several preprocessing techniques to classify diabetic retinopathy in the APTOS 2019 Blindness Detection dataset. In addition, hyperparameter tuning is carried out to find the right combination of parameters for the built model.
The tests were carried out to determine the effect of preprocessing techniques and data augmentation on the built model.
This study found that not all preprocessing techniques affect the proposed model.
Of the three proposed scenarios, V. Harikrishnan's preprocessing technique combined with data augmentation gave better results than the other two scenarios.
Trained using the InceptionV3 architectural model, this scenario resulted in an accuracy of 86.5% for train data and an accuracy of 73% for test data.
The results of the scenario testing exceeded the performance of previous studies.
In addition, this study shows that different preprocessing techniques and data augmentation have different effects.
This can be seen from the results of test scenario 3, where the model built without data augmentation produces a performance graph that shows severe overfitting.
Future research is encouraged to consider other architectures and to apply other preprocessing techniques, such as increasing image size and reducing image noise.
In addition, future work can increase the size of the dense layer and the batch size used.
Callback techniques can be added to minimize overfitting and underfitting in the built model.
The layers used in training are a dense layer with 128 neurons, a dropout layer of 0.5, and the optimizer selected by the hyperparameter tuning technique, namely SGD.
The optimizer uses a momentum value of 0.9 and a decay equal to the learning rate divided by the batch size.
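To make these optimizer settings concrete, the sketch below applies SGD with momentum 0.9, a learning rate of 0.001, and a decay of learning rate divided by batch size to a toy one-parameter problem. The time-based schedule lr / (1 + decay * iteration) is an assumption: it is how the decay argument behaves in legacy Keras optimizers, but the text does not state which schedule was used. The quadratic objective is purely illustrative.

```python
def decayed_learning_rate(initial_lr, decay, iteration):
    """Time-based decay (assumed schedule): lr shrinks as 1 / (1 + decay * iteration)."""
    return initial_lr / (1.0 + decay * iteration)


def sgd_momentum_step(weight, velocity, gradient, lr, momentum=0.9):
    """One SGD update with classical momentum."""
    velocity = momentum * velocity - lr * gradient
    return weight + velocity, velocity


initial_lr = 0.001
batch_size = 12
decay = initial_lr / batch_size  # decay = learning rate / batch size

# Toy objective f(w) = w^2 with gradient 2w, starting from w = 1.
weight, velocity = 1.0, 0.0
for iteration in range(100):
    lr = decayed_learning_rate(initial_lr, decay, iteration)
    weight, velocity = sgd_momentum_step(weight, velocity, 2.0 * weight, lr)

print(weight)  # moves toward the minimum at 0
```

With such a small decay value the learning rate changes very slowly, halving only after about 12,000 update steps under this schedule.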
Graphs of the accuracy and loss results for each test scenario can be seen in Fig. 4, Fig. 5, and Fig. 6.

Fig. 4 Result of testing scenario 1
REFERENCES