International Journal of Artificial Intelligence
p-ISSN: 2407-7275, e-ISSN: 2686-3251
Original Research Paper

Ensemble Stacking Method of Classifying the Stages of Alzheimer's Disease by using MRI Dataset

Nagarathna1, Kusuma2, Harsha Huliyappa3
1 Department of Artificial Intelligence and Machine Learning, BNM Institute of Technology, Bangalore, India.
2 Department of Information Science and Engineering, Dayananda Sagar Academy of Technology & Management, Bangalore, India.
3 Department of Neurosurgeon, Narayana Multispeciality Hospital, Mysuru, India.

Article History: Received: Revised: Accepted:
*Corresponding Author: Nagarathna, Email: binu@gmail.
This is an open access article, licensed under: CC BY-SA

Abstract: Alzheimer's Disease (AD) is a progressive neurological disorder that gradually impairs an individual's memory, reasoning, and ability to perform daily tasks. Early and accurate diagnosis of AD is essential for effective intervention, yet it remains challenging due to its complexity. This study explores the use of an ensemble stacking approach to evaluate the effectiveness of transfer learning techniques in classifying the stages of Alzheimer's disease. Unlike traditional methods that analyze raw brain images directly, this research applies a preprocessing technique based on the Markov Random Field method to extract the brain tissues specifically affected by AD. These segmented brain tissues are then used to train the base models, which consist of three convolutional neural networks (CNNs) with varying configurations. The predictions of these base models are ensembled and further refined by a second-level meta-model to enhance classification accuracy. The proposed ensemble stacking framework was evaluated using an MRI dataset obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI), which contains images categorized into Alzheimer's Disease (AD), Mild Cognitive Impairment (MCI), and Healthy Control (HC) groups.
The meta-model demonstrated superior performance, achieving an average accuracy of 97%, along with high precision, recall, and F1 scores. This study highlights the potential of ensemble learning and transfer learning in advancing AD diagnosis, offering a robust and efficient approach for categorizing its stages from medical imaging data.

Keywords: Alzheimer's Disease, Brain Tissue Segmentation, Classification Accuracy, Ensemble Stacking, Transfer Learning.

2024 | International Journal of Artificial Intelligence | Volume 11 | Issue 2 | 62-69
Nagarathna, Kusuma, Harsha Huliyappa. Ensemble Stacking Method of Classifying the Stages of Alzheimer's Disease by using MRI Dataset. International Journal of Artificial Intelligence, vol. 11, no. 2, pp. 62-69, December 2024. DOI: 10.36079/lamintang.

Introduction

Memory loss is the main feature of Mild Cognitive Impairment (MCI), which is the transitional state between normal brain aging and AD. An individual with amnestic MCI has more memory problems than a normal individual of the same age, but the symptoms are not as severe as those of an individual with AD. AD is a common progressive brain disease affecting elderly people over the age of 65. In its early stage the loss of memory is minor, but in later stages individuals are unable to convey information and fail to respond to their surroundings. Despite its significance, there is currently no cure, and the available treatments cannot stop the progression of AD. Early detection is therefore essential for effective treatment. The sensitivity of biomarkers and the accuracy of the detection techniques play an important part in the accurate diagnosis of AD, so to slow the progression of AD it is important to diagnose it at its earliest stage. A combination of modalities such as EEG, MRI and PET can help to improve the accuracy of the analysis.
Finding multi-modal biomarkers requires further exploration of how these modalities can be combined. There is a need to develop better diagnostic tools, which is what this work addresses. Such tools can be provided efficiently by emerging machine learning techniques, which aim to raise the accuracy of AD prediction so that appropriate treatment can be given to the patient. In this phase of the work, an ensemble stacking model is proposed for the automatic detection of AD using an MRI dataset from Kaggle. The proposed model consists of three base models built from three different convolutional neural networks. Each model is trained on different features of the MRI images that are affected by AD. The main features of the human brain affected by AD are grey matter, white matter and Cerebrospinal Fluid (CSF).

In one study, the grey matter volumes of people with dementia and of participants with AD were examined. The study found that, while grey matter was reduced in dementia compared with cognitively normal individuals, the reduction was noticeably smaller than in those with AD. CSF has also been studied: an expanded panel of CSF biomarkers was evaluated for its diagnostic value in distinguishing AD from a variety of other neurodegenerative dementias.

In this phase of the work, instead of feeding the entire MRI dataset into a model, these three main features affected by the disease are extracted by applying a Gaussian distribution and a Markov Random Field (MRF). The segmented brain image is used to train three different convolutional neural networks.

Stacking is one method of ensemble machine learning. Like bagging and boosting, it combines the predictions of several machine learning models fitted on the same dataset. Stacking addresses the question of which model to rely on by using another machine learning model that learns when to use or trust each model in the ensemble.
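The idea of a second model that learns when to trust each ensemble member can be illustrated with a deliberately small sketch. It is not the CNN meta-model used in this paper: the "meta-model" here is only a lookup table over the base models' predicted labels, and the function names `fit_meta` and `predict_meta`, as well as the held-out predictions, are hypothetical values chosen for illustration.

```python
from collections import Counter, defaultdict


def fit_meta(base_predictions, true_labels):
    """Learn the most frequent true label for each combination of base-model outputs."""
    votes = defaultdict(Counter)
    for preds, label in zip(base_predictions, true_labels):
        votes[tuple(preds)][label] += 1
    return {combo: counts.most_common(1)[0][0] for combo, counts in votes.items()}


def predict_meta(table, preds):
    """Use the learned table; fall back to a majority vote for unseen combinations."""
    combo = tuple(preds)
    if combo in table:
        return table[combo]
    return Counter(combo).most_common(1)[0][0]


# Hypothetical held-out predictions of three base models (one row per image).
held_out_predictions = [
    ("MCI", "HC", "HC"),
    ("MCI", "HC", "HC"),
    ("MCI", "HC", "HC"),
    ("HC", "HC", "HC"),
    ("AD", "AD", "MCI"),
    ("AD", "AD", "AD"),
]
held_out_labels = ["MCI", "MCI", "HC", "HC", "AD", "AD"]

meta_table = fit_meta(held_out_predictions, held_out_labels)

# A plain majority vote follows the two models that say "HC" ...
majority = Counter(("MCI", "HC", "HC")).most_common(1)[0][0]
# ... whereas the meta-level table has learned to trust the first model here.
stacked = predict_meta(meta_table, ("MCI", "HC", "HC"))
print(majority, stacked)
```

In this toy data the first base model is usually right when it alone predicts MCI, so the meta level overrides the majority vote for that pattern. A real meta-model would normally be a trainable classifier fitted on predicted probabilities rather than a lookup table.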
Unlike bagging, the models in stacking are typically different (e.g., not all decision trees) and are fitted on the same dataset rather than on samples of the training dataset. Unlike boosting, stacking uses a single model to learn how best to combine the predictions of the contributing models (instead of, for example, a sequence of models that correct the predictions of earlier models). The architecture of a stacking model consists of two or more base models, often referred to as level-0 models, and a meta-model, also known as a level-1 model, that combines the predictions of the base models.

Level-0 models (base models): models that are fitted to the training data and whose predictions are compiled.
Level-1 model (meta-model): a model that learns the best way to combine the predictions of the base models.

The meta-model is trained on predictions made by the base models on out-of-sample data. In other words, the base models receive data that was not used to train them and make predictions on it; these predictions, together with the expected outputs, form the input and output pairs of the training dataset to which the meta-model is fitted. The outputs of the base models are thus fed into the meta-model. As a result, the meta-learner can yield more accurate results than the individual sub-models. Albright et al. used five CNN architectures as the base models of an ensemble. The accuracy of the CNN models considered was 93%.

Literature Review

Image Preprocessing

This work uses MRI images collected from ADNI. Before the images are fed into the proposed ensemble model, they are pre-processed to improve performance. The dataset preprocessing steps, such as bias correction, normalization and skull stripping, are presented in one of our earlier papers. This section focuses on segmenting the processed brain images into white matter (WM), grey matter (GM) and CSF. A Gaussian distribution and an MRF are employed in the proposed work to extract WM, GM and CSF.

The MRF is a stochastic process that specifies the local properties of an image and is combined with the available data to reconstruct the true image. The MRF of prior contextual information is a powerful technique for modelling spatial continuity and other properties, and even simple modelling of this kind can yield useful information for the segmentation procedure. The MRF is a conditional probability model in which the probability of a voxel depends on its neighbourhood. It is equivalent to a Gibbs joint probability distribution defined by an energy function. MRF models have been included successfully in a variety of brain MRI segmentation techniques to reduce misclassification errors caused by image noise.

Each pixel (or voxel) in an image can be represented by one node in the lattice $P$. Let $m$ be the total number of image elements (for both a 2D and a 3D image), and let $x_i$ represent the intensity value of a single pixel (or voxel) at position $i$ in an image defined over a finite lattice $P$. Let $N = \{N_i, \forall i \in P\}$ denote a neighbourhood system for the lattice $P$, where $N_i$ is a small region surrounding $i$ that excludes $x_i$. The neighbours of node $i$ are the nodes that lie within a distance of $\sqrt{r}$ from it:

$$N_i = \{\, i' \in P \mid [\operatorname{dist}(\text{pixel}_i, \text{pixel}_{i'})]^2 \le r,\ i' \ne i \,\}$$

Here $\operatorname{dist}(a, b)$
is the Euclidean distance between neighboring pixels $a$ and $b$, and $r \in \mathbb{Z}$, $r > 0$, is an integer. The first-order and second-order neighborhoods are the most commonly used neighborhoods in image segmentation. The first-order neighborhood consists of the 4 nearest nodes in a 2D image and the 6 nearest nodes in a 3D image, while the second-order neighborhood consists of the 8 nearest nodes in a 2D image and the 18 nearest nodes in a 3D image. A Markov random field model can be represented by a graph $G = (P, N)$, where $P$ represents the nodes and $N$ determines the links (also called edges) that connect the nodes according to the neighborhood relationship.

Constructing the Training Model

Three distinct CNN architectures were considered for the level-0 base models in this work. The models are designated EM1, EM2 and EM3, and they are the top, middle and bottom networks, respectively. A deep learning network is used as the meta-model. These architectures have good classification performance because they have already been trained on ImageNet. Figure 1 depicts the proposed ensemble stacking model. The proposed approach takes the preprocessed images as input. The preprocessed images are split into a train dataset and a test dataset with a ratio of 0.2. The train dataset is further divided into training and validation subsets. The three base models are each trained independently on the training dataset. The ensemble predictions are then made using the stacking classifier ensemble approach with a meta-learner.

EM1

EM1 is made up of three 2D convolutional layers with 32, 64 and 128 filters of size 2x2. Max-pooling, dropout and batch normalisation layers follow each convolutional layer. To prevent overfitting, a dropout rate of 0.2 is applied, so that 20% of the neurons are dropped.
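Returning briefly to the neighbourhood system $N_i$ defined in the Image Preprocessing subsection, the first-order and second-order neighbour counts quoted there (4 and 8 in 2D, 6 and 18 in 3D) can be checked with a short sketch. It assumes a regular pixel or voxel lattice with unit spacing; the function names `neighbour_offsets` and `neighbours` are illustrative and are not taken from the original implementation.

```python
from itertools import product
from math import isqrt


def neighbour_offsets(dims, r):
    """Return lattice offsets whose squared Euclidean length lies in (0, r]."""
    reach = isqrt(r)  # no coordinate of a valid offset can exceed sqrt(r)
    offsets = []
    for offset in product(range(-reach, reach + 1), repeat=dims):
        squared_distance = sum(c * c for c in offset)
        if 0 < squared_distance <= r:  # the site itself is excluded
            offsets.append(offset)
    return offsets


def neighbours(site, shape, r):
    """Return the in-bounds neighbours N_i of `site` on a lattice of size `shape`."""
    result = []
    for offset in neighbour_offsets(len(shape), r):
        candidate = tuple(s + o for s, o in zip(site, offset))
        if all(0 <= c < n for c, n in zip(candidate, shape)):
            result.append(candidate)
    return result


for dims in (2, 3):
    for r in (1, 2):
        count = len(neighbour_offsets(dims, r))
        print(f"{dims}D, r={r}: {count} neighbours")
```

With r = 1 the sketch yields the first-order neighbourhood and with r = 2 the second-order neighbourhood. Sites on the image border have fewer neighbours, which `neighbours` handles by discarding out-of-bounds candidates.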
The classification part of the network is made up of a flatten layer and a dense layer with four neurons, one for each of the four AD classes: non-demented, very mildly demented, mildly demented and moderately demented. ReLU and sigmoid activations are used in the convolutional and dense layers, respectively.

EM2

EM2 has three 2D convolutional layers with 64, 128 and 256 filters of size 3x3. A max-pooling layer follows each convolutional layer. A dropout layer is introduced at the end of the feature extraction layers. The classification part consists of a flatten layer followed by two dense layers with 500 and 4 neurons. The convolutional and dense layers employ ReLU and Softmax, respectively.

EM3

In EM3, two base convolutional layers with 16 filters are followed by four convolutional blocks with 32, 64, 128 and 256 filters. Each convolutional block is made up of four convolutional layers with a ReLU activation function. Four dense layers with 512, 256, 128 and four neurons and a sigmoid activation function make up the classification component of the model. The layer descriptions of the level-0 base models and the meta-learner are given in Tables 2, 3 and 4.

Table 2. EM1 Layer Description

| Layers | Filter, Stride and Padding |
|---|---|
| Image Input | |
| Convolution2D 1 | 32 filters with stride 1 and padding 1 |
| Max Pooling 1 | 2x2 max pooling with stride 2 and padding 0 |
| Dropout | 0.2 dropout |
| Batch Normalization | Batch Normalization |
| Convolution2D 2 | 64 filters with stride 1 and padding 1 |
| Max Pooling 2 | 2x2 max pooling with stride 2 and padding 0 |
| Dropout | 0.2 dropout |
| Batch Normalization | Batch Normalization |
| Convolution2D 3 | 128 filters with stride 1 and padding 1 |
| Max Pooling 3 | 2x2 max pooling with stride 2 and padding 0 |
| Dropout | 0.2 dropout |
| Batch Normalization | Batch Normalization |

Table 3. EM2 Layer Description

| Layers | Filter, Stride and Padding |
|---|---|
| Image Input | |
| Convolution2D 1 | 64 filters with stride 1 and padding 1 |
| Max Pooling 1 | 2x2 max pooling with stride 2 and padding 0 |
| Convolution2D 2 | 128 filters with stride 1 and padding 1 |
| Max Pooling 2 | 2x2 max pooling with stride 2 and padding 0 |
| Convolution2D 3 | 256 filters with stride 1 and padding 1 |
| Max Pooling 3 | 2x2 max pooling with stride 2 and padding 0 |
| Dropout | 0.2 dropout |

Table 4. EM3 Layer Description

| Layers | Filter, Stride and Padding |
|---|---|
| Image Input | |
| Convolution2D 1 | 32 filters with stride 1 and padding 1 |
| Max Pooling 1 | 2x2 max pooling with stride 2 and padding 0 |
| Convolution2D 2 | 64 filters with stride 1 and padding 1 |
| Max Pooling 2 | 2x2 max pooling with stride 2 and padding 0 |
| Convolution2D 3 | 128 filters with stride 1 and padding 1 |
| Max Pooling 3 | 2x2 max pooling with stride 2 and padding 0 |
| Convolution2D 4 | 256 filters with stride 1 and padding 1 |
| Max Pooling 4 | 2x2 max pooling with stride 1 and padding 0 |
| Dropout | 0.2 dropout |
| Batch Normalization | |
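The stride and padding settings listed in Tables 2 to 4 determine how the spatial size of the feature maps changes from layer to layer. The sketch below applies the standard output-size formula, floor((n + 2p - k) / s) + 1, to a stack of convolution and pooling stages. The 128x128 input size and the 3x3 kernel are assumptions made only for this illustration, and the helper names `conv_output_size` and `trace_sizes` are hypothetical.

```python
def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_sizes(input_size, stages, kernel=3):
    """Follow the feature-map size through `stages` conv + 2x2 max-pool pairs."""
    sizes = [input_size]
    size = input_size
    for _ in range(stages):
        # Convolution with stride 1 and padding 1, as listed in the tables.
        size = conv_output_size(size, kernel, stride=1, padding=1)
        # 2x2 max pooling with stride 2 and padding 0.
        size = conv_output_size(size, 2, stride=2, padding=0)
        sizes.append(size)
    return sizes


# Assumed 128x128 input and three conv/pool stages, as in EM1 and EM2.
print(trace_sizes(128, 3))

# The last pooling layer of Table 4 uses stride 1, which removes only one pixel.
print(conv_output_size(16, 2, stride=1, padding=0))
```

Under these assumptions a 3x3 convolution with stride 1 and padding 1 preserves the spatial size, and each 2x2 pooling with stride 2 halves it. Other kernel sizes would change the intermediate sizes accordingly.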
Meta model

The meta-model of the proposed system is built from four convolutional blocks, each containing three convolutional layers, with 32, 64, 128 and 256 filters respectively. Each convolutional block is followed by a max-pooling layer. The model is run with different optimizers, namely Adam, Adagrad and RMSprop, to reduce the system loss. Adam performs best, giving an accuracy of 97%.

After the models are set up, each base model is fitted to the train dataset. A sample snippet for model fitting is given below:

    EM1.fit(x_traindataset, y_traindataset)
    EM2.fit(x_traindataset, y_traindataset)
    EM3.fit(x_traindataset, y_traindataset)

After training, the base models are evaluated with the following code sample. The predictions of EM1, EM2 and EM3 are called PrEM1, PrEM2 and PrEM3, respectively:

    PrEM1 = EM1.predict(x_test)
    PrEM2 = EM2.predict(x_test)
    PrEM3 = EM3.predict(x_test)

Finally, the average prediction is computed by averaging PrEM1, PrEM2 and PrEM3:

    Avgpred = (PrEM1 + PrEM2 + PrEM3) / 3

The meta-model is then trained on Avgpred and the test portion of the original dataset:

    Meta.fit(Avgpred, y_test)

At last, the meta-model is evaluated using the test dataset:

    pred3 = Meta.predict(Avgpred)

Methodology

The proposed ensemble stacking model is shown in Figure 1. The NIfTI MRI brain images collected from the Alzheimer's Disease Neuroimaging Initiative (ADNI) are preprocessed by bias correction, normalization, skull stripping and segmentation, as discussed in our earlier paper. The dataset consists of three classes of images: AD, MCI and NC. The total number of images belonging to each class is given in Table 1.

Table 1. ADNI MRI Dataset (columns: Classes, ADNI_Dataset_Count, Male, Female, Female < 70 YO, Male < 70 YO)

Figure 1. Proposed Model

The preprocessed, segmented images are then divided into a train dataset and a test dataset. The train dataset is used to train the three base models, which are convolutional neural networks with different configurations. Finally, the results of the three models are fused by the meta-model, which also learns from the test dataset and classifies the input images into AD, MCI and HC. The system performance is then evaluated by computing statistical measures.

Finding and Discussion

This proposed work was carried out using Google Colab. A Kaggle MRI dataset with four distinct classes (non-demented, mildly demented, very mildly demented and moderately demented) was employed in this experiment. The dataset is split into a train dataset and a test dataset. The base models are trained on the train dataset for 10 iterations with a learning rate of 0.0001. The level-0 models are trained on the AD images with the Adam optimizer, and the categorical cross-entropy loss is used to update the network weights. The parameters considered in this experiment are given in Table 5.

Table 5. Parameters

| Parameters | Model-1 |
|---|---|
| Optimizer | Adam, RMSprop, Adagrad |
| Activation Function | ReLU, Softmax and Sigmoid |
| Loss function | Categorical cross entropy |
| Batch size | |
| Dataset | ADNI |
| Epoch | |
| Learning Rate | |
| Normalization | Batch Normalization |
| Pooling | Maxpooling |

The output of the stacking classifier, which stacks the predictions of the three models, is used as input by the meta-learner. The meta-learner is run for ten iterations with various optimizers, activation functions and cross-entropy loss functions so that it learns from the prediction output of the base models and from the test data of the original dataset. According to the performance table, the EM1 model provides 96% accuracy, 97.5% precision, 90.75% recall and a 93.7% F1 score. EM2 has an accuracy of …5%, a precision of 92.4%, a recall of 83.25% and an F1 score of 85%. EM3 displays 95% accuracy, 96.5% precision, 95% recall and a 96% F1 score. The table below contains the performance measures of the proposed ensemble model. The accuracy of the meta-model increased because it was trained on the ensemble models' predictions and the test data, yielding an accuracy of 97%, with 98.25% precision, 96.5% recall and a 97% F1 score. The confusion matrix of the ensemble method is given in Table 6.

Table 6. Proposed Model Confusion Matrix (per-class and average Accuracy, Precision, Recall, F1 Score)

To reduce the loss, the meta-model is iterated for 25 epochs using various optimizers, including Adam, RMSprop and Adagrad. Adam gives the best performance, and Figure 2 shows the accuracy, recall, precision and F1 score values at random epochs for the various optimizers.

Figure 2. Screenshot Generated for Different Optimizers

The loss and accuracy histories of the meta-model are shown in Figure 3 and Figure 4, respectively.

Figure 3. Meta Loss History Graph

Figure 4. Meta Learner Accuracy History Graph

Conclusion

A small number of samples from the MRI dataset was used to test the proposed algorithm. An ensemble stacking model was implemented to improve system performance by extracting features of the input images at two levels of classification. At the first level of prediction, the base model is implemented with three convolutional models, and at the final level a meta-model is implemented for the final prediction.
The model achieves an accuracy of 97% and a precision of 98.25%. The proposed model was implemented in Python, which is openly available. The experiments were run on an Intel Core i7-8750H with 16 GB RAM, a 64-bit operating system and an NVIDIA GPU.

References