International Journal of Electrical and Computer Engineering (IJECE)
Vol. No. October 2025, pp. 4856-4864
ISSN: 2088-8708, DOI: 10.11591/ijece.
Facial image analysis for autism spectrum disorder detection in toddlers using deep learning and transfer learning
Anupam Das, Prasant Kumar Pattnaik, Anjan Bandyopadhyay
School of Computer Engineering, Kalinga Institute of Industrial Technology (KIIT) Deemed to be University, Bhubaneswar, India
Article Info
Article history: Received Dec 26, 2024; Revised Jun 1, 2025; Accepted Jul 3, 2025
Keywords: Autism spectrum disorder; Deep learning; Facial image analysis; Machine learning; Transfer learning; VGG-16
ABSTRACT
Autism spectrum disorder (ASD) is a neurological condition that manifests itself through restricted and repetitive patterns of behavior, interests, or activities and persistent difficulties with social interaction and communication. Better outcomes and early intervention depend on the early identification of people with ASD. Doctors employ a variety of techniques to assess autism, including genetic testing, neuropsychological testing, hearing and vision screenings, and diagnostic interviews. In addition to requiring more time and money, the traditional diagnostic approach can leave parents of children with pervasive developmental disorders reluctant to disclose their child's condition. A tool is therefore needed that can detect autism early with less time and expense. Machine learning methods can meet this need. In this study, deep learning with transfer learning (VGG-16) is used to detect autism from facial images of children and achieves almost 97% accuracy. The suggested model improves accuracy and saves time and money by using facial features in photos of children to identify early signs of autism.
This is an open access article under the CC BY-SA license.
Corresponding Author:
Anupam Das
School of Computer Engineering, Kalinga Institute of Industrial Technology (KIIT) Deemed to be University
Bhubaneswar, Pin-751024, Odisha, India
Email: anukiit23@gmail.
INTRODUCTION
Autism spectrum disorder (ASD) is a neurodevelopmental disorder that affects socialization and communication. Early diagnosis is crucial because ASD can affect the social, academic, and professional aspects of life. Many children show signs within the first year, such as reduced eye contact, lack of interest in caregivers, or delayed response to their names. Some may regress between 18 and 24 months, losing acquired skills. Symptoms vary in severity and in their impact on functioning, which makes assessment complex. Artificial intelligence (AI) and machine learning (ML) are changing ASD diagnosis and treatment, offering faster, more accurate, and more scalable approaches by analyzing large datasets and identifying subtle patterns. Globally, ASD affects about 1 in 100 children and is influenced by genetic and environmental factors. Diagnosis relies on observing behavior and developmental milestones, and specialists can provide reliable assessments by age two. Early intervention significantly improves developmental outcomes, emphasizing the need for prompt treatment to maximize potential.
ASD, which was introduced as a unified diagnostic category in 2013, is a developmental condition characterized by limited and repetitive patterns of behavior, interests, or hobbies, in addition to persistent challenges with social engagement and communication. Key symptoms of autism are shown in Figure 1. The term superseded the earlier nomenclature for disorders such as Asperger's syndrome and autistic disorder, which were considered to be on "the great continuum" of autism.
Journal homepage: http://ijece.
Although autism has probably existed for much longer, Dr. Leo Kanner provided the first clinical description of the condition in 1943. Dr. Kanner, the creator of the nation's first pediatric psychiatric program, diagnosed eleven children, eight boys and three girls, with what he called "autistic disturbances of affective contact". Across the Atlantic, at about the same time, an Austrian pediatrician named Hans Asperger was treating a similar group of children. A milder form of autism was later named "Asperger syndrome" in his honor.
Researchers have not identified a single cause of autism; they believe that multiple genetic and environmental factors play a combined role. The odds of developing autism are higher in cases with genetic abnormalities or a family history of the condition. A diagnosis can be confirmed in early childhood through systematic behavioral assessments supported by a review of developmental history. Behavioral diagnosis relies on several professionals, including pediatricians, psychologists, and speech-language pathologists. Although ASD is incurable, early treatment and therapies such as social skills training, speech therapy, occupational therapy, and behavioral interventions improve the quality of life of affected individuals.
The Centers for Disease Control and Prevention (CDC) published new data on the prevalence of autism among children: 1 in 36 in the United States and 1 in 36 in Arizona. The Autism and Developmental Disabilities Monitoring Network sent the updated data to the CDC on March 23. According to the latest data, 1 in 36 American children aged 8 received a diagnosis of ASD in 2020. Compared with the previously reported prevalence of 1 in 44 in 2018, this number indicates an increase, as shown in Figure 2.
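To make the size of this change concrete, the two reported rates can be compared directly. The following is simple arithmetic on the figures quoted in the text, not an additional result from the CDC data:

```latex
\frac{1}{36} \approx 2.78\%, \qquad
\frac{1}{44} \approx 2.27\%, \qquad
\frac{1/36}{1/44} = \frac{44}{36} \approx 1.22
```

That is, the 2020 estimate is roughly 22% higher in relative terms than the 2018 estimate.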
Figure 1. Key symptoms of autism
Figure 2. Prevalence rate of autism
Current techniques for autism diagnosis are costly and produce subjective results, which delays early intervention.
An automated and cost-effective detection system is needed that identifies ASD in its initial stages. This work addresses that gap by applying VGG-16 deep learning analysis to facial images to develop an efficient, non-contact, and precise model for detecting ASD that removes the requirements of conventional diagnostic techniques. This research presents an AI algorithm based on deep learning (CNN) along with transfer learning (VGG-16) for analyzing pictures of toddler faces as an ASD detection tool. Using a pre-trained VGG-16 model allows the approach to extract ASD-related facial features effectively, which leads to a classification accuracy of 99.50% on the validation data. With this method, healthcare providers gain a faster, more affordable, and non-invasive diagnostic technique. An online ASD screening tool derived from this solution could help detect ASD early for intervention purposes without requiring extensive clinical evaluations.
METHOD
Analyzing ASD is crucial, but diagnosing it can be challenging because there is no definitive clinical test, such as a blood test, to confirm it. To reach a conclusion, professionals consider the developmental history of the child. Analysis is also important because, in the absence of a diagnosis, ASD can cause a great deal of distress and confusion for the undiagnosed person in many aspects of daily life. This may result in disruptive behaviors and social disengagement. In response to this need, the accessible data associated with autism have been analyzed, and a model has been developed to handle questionnaires quickly and easily. To analyze autism through a child's facial picture, a dataset of facial images has been used to train, validate, and test the model. A model was then developed using CNNs together with transfer learning techniques.
Data set
In this research, the data were taken from a publicly available dataset on Kaggle. Two different types of datasets have been used in this analysis. The training set is named "consolidated" and has two subfolders, autistic and non-autistic. Details of the training, validation, and test data are given in Tables 1, 2, and 3, respectively. Figure 3 illustrates the data pre-processing pipeline used in this study. A CNN along with transfer learning is applied to recognize autism from child facial photos.
Table 1. Details of the training data used for classification (columns: Classes, No. of images, Type of images; rows: Autistic, Non-Autistic)
Table 2. Details of the validation data used for classification (columns: Classes, No. of images, Type of images; rows: Autistic, Non-Autistic)
Table 3. Details of the test data used for classification (columns: Classes, No. of images, Type of images; rows: Autistic, Non-Autistic)
Figure 3. Data pre-processing
Int J Elec & Comp Eng, Vol. No. October 2025: 4856-4864
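As an illustration of the class-folder layout described for the data set, the following sketch counts the images in each class subfolder of one split. The function name, the folder names used in the usage note, and the accepted file extensions are assumptions for illustration; the actual Kaggle layout may differ.

```python
from pathlib import Path

# Assumed image extensions; adjust to match the actual dataset files.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images_per_class(split_dir):
    """Return {class_name: image_count} for each class subfolder of split_dir."""
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if not class_dir.is_dir():
            continue  # skip stray files at the split level
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTENSIONS
        )
    return counts
```

Calling `count_images_per_class("consolidated")` on a hypothetical local copy would return one count per class folder, which is a quick way to check that a split is balanced before training.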
Convolutional neural network (CNN)
Neural networks are the foundation of deep learning (DL), a crucial branch of ML. These networks contain an input layer, an output layer, and one or more hidden layers. In each layer, nodes are connected by weights and have thresholds. A node passes data to the next layer when its output surpasses its threshold; otherwise, it stays dormant. A CNN has three primary layer types: convolutional, pooling, and fully connected. The convolutional layer is the central constituent of a CNN, where the main processing is done. To identify particular features within an input image, this layer applies a filter, also called a kernel, which is a small matrix of weights that moves over the receptive field. The pooling layer, which comes after the convolutional layer, is also an essential component. Like the convolutional layer, it sweeps across its input, but its purpose is to reduce the spatial dimensions rather than to detect features. The fully connected layer, which classifies images using the features collected by the prior layers, is crucial to the final stages of a CNN. A layer is fully connected when each of its neurons is connected to every neuron in the preceding layer. Figure 4 depicts the architecture of the CNN used in this study.
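The convolution and pooling operations described above can be sketched in a few lines of plain Python. This is a minimal illustration on a toy 5x5 input with a hand-picked 2x2 kernel, not the layers used in the study; the image values and kernel are assumptions chosen so the result is easy to follow.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding).

    As in most CNN libraries, this is technically cross-correlation:
    the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2, stride=2):
    """Keep the maximum of each size x size window, moving by `stride`."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            max(
                feature_map[i * stride + di][j * stride + dj]
                for di in range(size)
                for dj in range(size)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Kernel that responds to left-to-right intensity increases.
kernel = [[-1, 1], [-1, 1]]

feature_map = conv2d_valid(image, kernel)  # 4x4 map, peaks at the edge
pooled = max_pool2d(feature_map)           # 2x2 map after downsampling
```

Here the feature map is non-zero only in the column where the edge sits, and pooling halves each spatial dimension while keeping that response.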
Figure 4. CNN architecture
Transfer learning
Transfer learning is an ML technique that uses information from one task or dataset to enhance a model's performance on a separate but related task. In other words, it reuses knowledge acquired in one context to improve generalization in another. As shown in Figure 5, applications of transfer learning are numerous and range from deep learning model training to regression problems in data science. It is especially attractive for deep learning, considering the volume of data required to train deep neural networks.
Figure 5. Transfer learning
Advantages
Shortened training time: Transfer learning can save a vast amount of training time by relying on previously acquired knowledge. Instead of starting from scratch, models fine-tune existing weights, saving computational resources and speeding up development.
Improved performance with less data: One of the biggest advantages is improved performance, especially with small datasets. Pre-trained models can generalize better and avoid overfitting because they have already been trained on large datasets.
Lower computational costs: Because the model does not need to learn from the ground up, transfer learning reduces the need for powerful hardware and long training periods, making it more cost-effective.
Effective with limited labelled data: Transfer learning is particularly helpful where labeled data is hard to come by. Knowledge from a related domain helps the model learn useful representations even when the target domain lacks sufficient labeled examples.
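The idea of reusing learned weights can be shown with a deliberately tiny example. The sketch below is not the VGG-16 pipeline of this study: a fixed two-unit layer stands in for a pre-trained feature extractor (its weights, the toy inputs, the learning rate, and the epoch count are all hypothetical), and only a small sigmoid head is trained with binary cross-entropy on a handful of labelled examples.

```python
import math

# Stand-in for a pre-trained feature extractor (hypothetical values).
# A tuple of tuples keeps the weights frozen: they cannot be updated in place.
PRETRAINED_WEIGHTS = ((1.0, -1.0), (0.5, 0.5))


def extract_features(x, weights=PRETRAINED_WEIGHTS):
    """Frozen layer: linear map followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(head, bias, feature):
    return sigmoid(sum(h * f for h, f in zip(head, feature)) + bias)


def bce_loss(head, bias, features, labels, eps=1e-12):
    """Mean binary cross-entropy of the head on the given features."""
    total = 0.0
    for f, y in zip(features, labels):
        p = predict(head, bias, f)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(labels)


def train_head(features, labels, lr=0.5, epochs=200):
    """Batch gradient descent on the new head only."""
    head, bias, n = [0.0] * len(features[0]), 0.0, len(labels)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(head), 0.0
        for f, y in zip(features, labels):
            err = predict(head, bias, f) - y  # d(BCE)/d(pre-activation)
            grad_w = [g + err * fi for g, fi in zip(grad_w, f)]
            grad_b += err
        head = [h - lr * g / n for h, g in zip(head, grad_w)]
        bias -= lr * grad_b / n
    return head, bias


# Small labelled target dataset (toy values).
inputs = [[2.0, 0.0], [1.5, 0.5], [0.0, 2.0], [0.5, 1.5]]
labels = [1, 1, 0, 0]

features = [extract_features(x) for x in inputs]
loss_before = bce_loss([0.0, 0.0], 0.0, features, labels)
head, bias = train_head(features, labels)
loss_after = bce_loss(head, bias, features, labels)
```

Only the two head weights and the bias change during training, which is why this style of transfer learning needs far less data and computation than training every layer from scratch.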
VGG16
VGG16 is a widely adopted deep learning model that achieves about 92.7% top-5 accuracy on ImageNet, a dataset with 1,000 categories. Practitioners choose VGG16 for many deep learning tasks because of its well-defined design, simple implementation, and suitability for transfer learning. The network has 16 weight layers (13 convolutional and 3 fully connected), along with 5 max-pooling layers. It takes 224×224 RGB images as input and uses 3×3 convolutional kernels with a stride of 1 to capture fine-grained features. Spatial dimensions are reduced through 2×2 max-pooling with a stride of 2. The filter count increases systematically from Conv-1 (64) to Conv-2 (128) and Conv-3 (256), and ends at Conv-4 and Conv-5 (512 filters each). The first two fully connected layers have 4096 neurons each and are followed by the final 1000-class softmax layer. Alternating convolution and pooling layers strengthens feature extraction and improves classification accuracy. Transfer learning allows VGG16 to reach domain-specific accuracy with brief additional training. Its high accuracy and adaptable pre-trained weights make VGG16 well suited to medical imaging, as well as object detection, facial recognition, and autonomous systems. It remains a standard benchmark model that balances depth and operational efficiency with high accuracy across computer vision applications.
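The layer counts and sizes of the standard VGG16 architecture can be checked with a short calculation. The sketch below walks through the standard published configuration (filter counts of 64, 128, 256, 512, and 512 per block, with 3x3 kernels and padding of 1 so that convolutions keep the spatial size); it only does bookkeeping and is not the model used in this study, whose classification head differs from the 1000-class original.

```python
# Standard VGG16 layout: numbers are conv filter counts, "M" is 2x2 max-pooling.
VGG16_CONFIG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                512, 512, 512, "M", 512, 512, 512, "M"]
FC_UNITS = [4096, 4096, 1000]


def summarize_vgg16(input_size=224, in_channels=3):
    """Count layers and trainable parameters for the standard VGG16."""
    size, channels, params = input_size, in_channels, 0
    conv_layers = pool_layers = 0
    for item in VGG16_CONFIG:
        if item == "M":
            size //= 2  # 2x2 pooling with stride 2 halves each side
            pool_layers += 1
        else:
            # 3x3 kernel weights plus one bias per filter
            params += 3 * 3 * channels * item + item
            channels = item
            conv_layers += 1
    fan_in = size * size * channels  # flattened feature map
    for units in FC_UNITS:
        params += fan_in * units + units
        fan_in = units
    return {
        "conv_layers": conv_layers,
        "pool_layers": pool_layers,
        "fc_layers": len(FC_UNITS),
        "final_feature_map": (size, size, channels),
        "total_params": params,
    }


summary = summarize_vgg16()
```

The 13 convolutional and 3 fully connected layers give the 16 weight layers in the model's name, and most of the roughly 138 million parameters sit in the fully connected layers, which is one reason transfer learning setups usually replace that part with a smaller head.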
Evaluation of model
The performance of the classification model is assessed using accuracy, precision, and recall, defined as follows. Accuracy is the ratio of correctly predicted results to all predictions made on the test data:
$$\text{Accuracy} = \frac{\text{True Positives} + \text{True Negatives}}{\text{Total Samples}}$$
Precision is the ratio of true positives to all positive predictions. It shows the model's ability to return only valid examples of a particular class:
$$\text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}}$$
Recall is the ratio of true positive predictions to all actual instances of the class. It indicates whether the model identifies all relevant examples of that class:
$$\text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}}$$
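The three metrics can be computed directly from confusion-matrix counts. The following is a minimal sketch; the function name is an assumption, and the counts in the example are hypothetical values chosen for easy arithmetic, not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return (accuracy, precision, recall) from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts (not the paper's confusion matrix).
accuracy, precision, recall = classification_metrics(tp=45, tn=40, fp=10, fn=5)
```

With these counts, accuracy is 85/100, precision is 45/55, and recall is 45/50, which shows how a model can have high recall while precision lags when false positives are more common than false negatives.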
RESULTS AND DISCUSSION
A CNN combined with the transfer learning (TL) method is used: the model is trained on 1,470 facial photos of autistic children and 1,470 photos of non-autistic children. Features were selected to determine the specificity, sensitivity, and accuracy of the model. The CNN uses the VGG16 model pre-trained on ImageNet, a sigmoid activation, the Adam optimizer, and a binary cross-entropy loss function, trained for 8 epochs, as shown in Figure 6.
Figure 6. Epochs
Rectified linear unit function (ReLU)
This is expressed as:
$$\mathrm{ReLU}(x) = \frac{x + |x|}{2}$$
where:
ReLU(x) is the output of the ReLU function.
x is the input variable.
|x| represents the absolute value of x, which ensures that the output is either x (if x is positive) or 0 (if x is negative).
Because of its ease of use and its capacity to add non-linearity to the model, ReLU is frequently used as an activation function in neural networks.
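Since the text defines ReLU through the absolute value of x, a short check can confirm that this form matches the more familiar max(0, x) definition. The sketch assumes the absolute-value form is (x + |x|) / 2; both function names are illustrative.

```python
def relu_abs(x):
    """ReLU via the absolute value: (x + |x|) / 2."""
    return (x + abs(x)) / 2


def relu_max(x):
    """ReLU via the usual definition: max(0, x)."""
    return max(0.0, x)


samples = [-3.0, -0.5, 0.0, 0.5, 3.0]
outputs = [relu_abs(x) for x in samples]
```

For negative inputs the two terms cancel and the output is 0; for positive inputs they add to 2x, so dividing by 2 returns x unchanged.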
Sigmoid function
This function produces values between 0 and 1. Because it compresses input values into this range, it is helpful for binary classification tasks:
$$A = \frac{1}{1 + e^{-x}}$$
where:
A is the output of the function.
e is the base of the natural logarithm.
x is the input variable.
The value of A ranges from 0 to 1: A approaches 0 as x goes to negative infinity and approaches 1 as x goes to positive infinity.
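A small sketch shows how the sigmoid output can be turned into a binary decision. The numerically stable branching and the 0.5 decision threshold are assumptions added for illustration; the text does not state which threshold the study used.

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + e^(-x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # safe for large negative x (underflows to 0.0)
    return z / (1.0 + z)


def predict_label(x, threshold=0.5):
    """Map a raw model output to class 1 or 0 (threshold is an assumption)."""
    return 1 if sigmoid(x) >= threshold else 0
```

An input of 0 maps to exactly 0.5, large positive inputs approach 1, and large negative inputs approach 0, matching the limits described above.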
The outcomes of the prediction model, along with its performance metrics and training behavior, are listed below. Accuracy, losses, and the confusion matrix are shown in Figures 7, 8, and 9, respectively: accuracy: 0.9667, loss: 0.1011, val-accuracy: 0.9950, val-loss: 0. Table 4 displays the classification performance metrics.
Figure 7. Accuracy
Figure 8. Losses
Figure 9. Confusion matrix
Table 4. Classification performance metrics (columns: Classes, Precision, Recall, F1-Score, Support; summary rows: Accuracy, Macro Avg, Weighted Avg)
The proposed model for detecting ASD using facial images with deep learning and transfer learning techniques (VGG-16) demonstrated high efficiency and accuracy.
The model accuracy in training and validation was 96.67% and 99.50%, respectively, with corresponding loss values of 0.1011 and 0. These results highlight the model's strong predictive capability and generalization. The low loss values suggest minimal prediction error, and the high validation accuracy implies resilience to overfitting.
The confusion matrix in Figure 9 shows minimal misclassifications, indicating the model's ability to discern subtle facial feature variations indicative of autism. Leveraging transfer learning through the pre-trained VGG-16 model reduced computational cost and improved accuracy. The model effectively handled the binary classification task using the binary cross-entropy loss function and sigmoid activation, yielding high sensitivity and specificity. Compared with prior research, as shown in Table 5, the model outperformed other approaches in accuracy, sensitivity, and computational efficiency, demonstrating its applicability to real-world scenarios.
This approach reduces the time and cost of traditional diagnostic methods, offering a non-invasive, deployable tool for early detection of ASD.
However, challenges related to dataset diversity remain, and expanding the dataset and incorporating multimodal data could enhance the model's generalizability.
Further research into model interpretability could increase trust among clinicians and families, solidifying its role as a valuable tool for early diagnosis.
Table 5. Comparison with existing work: model training and validation accuracy (columns: Model, Training accuracy, Validation accuracy; rows: Jahanara and Padmanabhan, Proposed model with VGG-16)
CONCLUSION
Early detection of autism is crucial because it can lead to significant positive outcomes for individuals on the autism spectrum.
A machine learning model that can do this quickly and efficiently would be a valuable contribution to the medical field. In this study, an ML classification model that uses a facial image dataset, CNN-based deep learning, and transfer learning (VGG-16) to detect autism has been proposed. In the future, this model could support an online application where parents upload images of their children and learn the likelihood of autism at an early stage.
ACKNOWLEDGMENTS
Our sincere gratitude goes out to the MeitY.
Government of India, for their steadfast assistance through the scholarship offered by the Visvesvaraya Ph.
Scheme.
REFERENCES