APTISI Transactions on Technopreneurship (ATT), November 2025, pp. 808–822. E-ISSN: 2656-8888 | P-ISSN: 2655-8807.

Optimizing Automated Machine Learning for Ensemble Performance and Overfitting Mitigation

Migunani1*, Adi Setiawan2, Irwan Sembiring3
1 Faculty of Academic Studies, Universitas Sains dan Teknologi Komputer, Indonesia
2 Faculty of Science and Mathematics, Satya Wacana Christian University, Indonesia
1,3 Faculty of Information Technology, Satya Wacana Christian University, Indonesia
1 migunani@stekom.id, 2 adi.setiawan@uksw.edu, 3 irwan@uksw.
*Corresponding Author

ABSTRACT
Automated Machine Learning (AutoML) has revolutionized model development, but its impact on ensemble diversity and overfitting reduction remains underexplored. This Systematic Literature Review (SLR) analyzes 107 studies published between 2020 and 2024 to examine how AutoML enhances ensemble diversity, how it mitigates overfitting, and which challenges hinder its integration. Unlike previous reviews that treat AutoML or ensemble methods independently, this study synthesizes their intersection and identifies key research trends. The findings reveal that AutoML improves ensemble robustness through automated hyperparameter tuning, meta-learning, and algorithmic blending, while facing trade-offs in computational cost and interpretability. Four main themes emerge: integration mechanisms, overfitting mitigation, performance trade-offs, and integration barriers. Empirical results indicate that AutoML ensembles outperform traditional models by 22–41% in accuracy but require approximately 3.2 times higher computational resources. Hybrid AutoML and Explainable AI frameworks are recommended to balance accuracy and transparency.
Theoretically, this study advances understanding of the synergy between AutoML and ensemble learning; practically, it provides guidance for deploying reliable AI systems in sectors such as healthcare, finance, and digital business. Policy implications align with the EU AI Act and the US Executive Order on trustworthy AI, supporting Sustainable Development Goals 9 and 8.

Article history: Submission June 12, 2025; Revised September 4, 2025; Accepted October 14, 2025; Published October 29, 2025.

Keywords: Systematic Literature Review, AutoML, Ensemble Learning, Overfitting Mitigation, Enhancing Diversity

DOI: https://doi.org/10.34306/att. This is an open-access article under the CC BY 4.0 license (https://creativecommons.org/licenses/by/4.0/). Authors retain all copyrights.
Journal homepage: https://att.id/index.php/att

INTRODUCTION
In recent AI research, Machine Learning (ML) models face the challenge of overfitting: they perform well on training data but fail to generalize to unseen data, undermining their reliability. AutoML has emerged to address this issue by automating tasks such as algorithm selection, pipeline configuration, and hyperparameter tuning, reducing dependency on expert knowledge and speeding up development. Additionally, ensemble learning methods such as bagging, boosting, and stacking improve predictive accuracy and mitigate overfitting by combining multiple models to enhance performance and reduce variance. While AutoML and ensemble techniques have been studied separately, their synergy, using AutoML to enhance ensemble diversity for better generalization and overfitting mitigation, remains an underexplored gap. This study addresses it by presenting an SLR of 107 peer-reviewed studies from 2020 to 2024. Previous SLRs have mainly focused on AutoML or ensemble learning independently
, offering descriptive overviews or general challenges. Our review critically evaluates the intersection of AutoML-driven ensemble methods, diversity enhancement, and overfitting mitigation, providing a comprehensive synthesis of theoretical and practical insights. The findings highlight how automation improves ensemble performance, reduces human bias in model selection, and emphasizes transparency, reproducibility, and sustainability in AI development. This review reinforces the foundation for evidence-based innovation in automated systems. The contributions of this work are threefold:
- Novel synthesis: it provides a comprehensive synthesis of the mechanisms through which AutoML automates the creation of diverse ensembles to combat overfitting, integrating advanced techniques such as neural architecture search (NAS) and evolutionary algorithms with hyperparameter optimization (HPO) methods and ensemble strategies.
- Critical evaluation and domain insights: it identifies and evaluates strategies and performance trade-offs (e.g., accuracy gains of up to 41% versus roughly 3.2 times the computational cost) applied in domains such as healthcare and finance, offering insights beyond descriptive reporting.
- Practical and theoretical relevance: it generates actionable recommendations for industry practitioners implementing hybrid AutoML ensemble strategies in real-world settings, while addressing prevailing limitations such as high computational demands and interpretability challenges to outline a roadmap for future research.
By critically examining this synergy, our review aims to advance the theoretical understanding of robust ML design and provide a foundation for developing next-generation automated ensemble frameworks that are both high-performing and practically viable.
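The variance-reduction intuition behind ensemble averaging, invoked throughout this review, can be made concrete with a minimal simulation. The sketch below is illustrative only and is not drawn from any surveyed study; it assumes idealized base models whose errors are independent, which is the condition that ensemble diversity tries to approximate in practice.

```python
import random

random.seed(42)

TRUE_VALUE = 1.0   # quantity the models try to predict
NOISE_STD = 1.0    # per-model error scale
N_MODELS = 10      # base models in the ensemble
N_TRIALS = 2000    # repetitions to estimate expected squared error

def base_model_prediction():
    """Stand-in for one base model: unbiased but noisy."""
    return TRUE_VALUE + random.gauss(0.0, NOISE_STD)

single_se = 0.0
ensemble_se = 0.0
for _ in range(N_TRIALS):
    preds = [base_model_prediction() for _ in range(N_MODELS)]
    single_se += (preds[0] - TRUE_VALUE) ** 2                 # one model alone
    ensemble_se += (sum(preds) / N_MODELS - TRUE_VALUE) ** 2  # simple average

print(f"single-model MSE: {single_se / N_TRIALS:.3f}")
print(f"ensemble MSE:     {ensemble_se / N_TRIALS:.3f}")
```

With ten independent models the expected squared error drops by roughly a factor of ten; correlated errors shrink this gain, which is why diversity, and not merely ensemble size, drives the robustness results synthesized in this review.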
Multidisciplinary insights from computer science, public policy, and the social sciences ensure a comprehensive analysis of technical and societal dimensions. This research also contributes to the United Nations Sustainable Development Goals (SDGs) by developing methods for robust and accessible AI. By automating the creation of reliable ensemble models, it supports SDG 9 (Industry, Innovation, and Infrastructure) through the innovation of trustworthy AI tools. Furthermore, by lowering the barrier to entry for developing high-performance AI, it advances SDG 8 (Decent Work and Economic Growth) by democratizing expertise and enabling productivity gains across diverse sectors.

RESEARCH METHOD
This study employs a Systematic Literature Review (SLR) methodology, adhering to established guidelines to ensure transparency, rigor, and replicability. The process is structured into three phases, planning, conducting, and reporting, as illustrated in Figure 1. During the planning phase, the need for this review was established based on the identified research gap concerning AutoML's role in enhancing ensemble diversity and mitigating overfitting. Research questions were formulated using the PICOC framework, as shown in Table 1, to guide the review, and a detailed protocol was developed. This protocol specified the search strategy, inclusion and exclusion criteria, data extraction procedures, and quality assessment standards.

Table 1. Summary of the PICOC Framework
Component | Description
P (Population) | Machine learning and deep learning applications that integrate ensemble methods with AutoML techniques.
I (Intervention) | Implementation of AutoML to optimize ensemble diversity and mitigate overfitting.
C (Comparison) | Traditional ML approaches relying on manual tuning and ad hoc ensemble construction, which often result in limited scalability and suboptimal performance.
O (Outcome) | Improved performance metrics, reduced generalization error, and enhanced robustness reported for AutoML-driven ensembles.
C (Context) | Research published between 2020 and 2024 addressing the intersection of AutoML and ensemble learning.

Figure 1. Systematic Literature Review Steps

During the conducting phase, a systematic search was carried out across three major databases: Scopus, IEEE Xplore, and the ACM Digital Library. The search strategy applied Boolean logic with keywords derived from the PICOC framework, resulting in the final query: (("Automated Machine Learning" OR AutoML) AND ("Ensemble Learning" OR "Ensemble Models") OR ("Diversity" OR "Model Diversity") OR ("Overfitting" OR "Reducing Overfitting")). The query was adapted to the syntax of each database to maximize precision and relevance. The initial search produced a large set of records, which were screened by title and abstract, followed by full-text reviews using predefined eligibility criteria, as shown in Table 2. Ultimately, 107 primary studies published between January 2020 and December 2024 were included. The overall process, including the identification, screening, and selection stages, is illustrated in Figure 2.

Table 2. Inclusion and Exclusion Criteria
Inclusion criteria:
- Studies from academic or industrial settings applying AutoML to ensemble diversity or overfitting
- Research evaluating the effectiveness of AutoML in ensemble learning
- The most recent or comprehensive version selected in cases of duplicates
Exclusion criteria:
- Non-English publications
- Studies lacking empirical validation or irrelevant to AutoML–ensemble integration
- Conference versions when corresponding journal publications are available

The detailed search process and the number of studies identified at each stage are presented in the PRISMA flow diagram in Figure 2.
The study selection (Step 5) was performed in two stages: exclusion based on title and abstract screening, and exclusion based on full-text review. The initial screening yielded 107 primary studies, which were then subjected to full-text assessment. In addition to the predefined inclusion and exclusion criteria, further considerations included study quality, relevance to the research questions, and thematic fit. Duplicate or highly similar publications by the same authors across different venues were removed. Following this process, 107 primary studies were retained for analysis.

Figure 2. PRISMA flow diagram illustrating the study selection process

During the reporting phase, the selected studies were synthesized to identify recurring themes and patterns aligned with the research objectives. A narrative synthesis approach was adopted to integrate qualitative insights with limited quantitative trends. The methodology was refined iteratively throughout the process to ensure the comprehensiveness and coherence of the findings.

Research Questions (RQ) and Objectives
This systematic review applies the PICOC framework to ensure focus and clarity. The Population covers studies on machine and deep learning, the Intervention explores AutoML techniques that improve ensemble diversity and reduce overfitting, and the Comparison evaluates them against traditional methods. Outcomes are measured through predictive performance and generalization within studies published from 2020 to 2024, forming the basis for the research questions in Table 3.
Table 3. Research Questions and Motivations
RQ1 | What is the role of AutoML in generating and selecting diverse base models to improve ensemble robustness and accuracy? | To systematically examine how AutoML automates the creation of model diversity, a critical factor for ensemble success and generalization.
RQ2 | What specific regularization and optimization techniques within AutoML frameworks are most effective for mitigating overfitting in ensemble models? | To investigate and catalog the automated strategies used to constrain model complexity and enhance generalization performance.
RQ3 | How effective are AutoML-driven ensemble models compared to traditional, manually constructed ensembles? | To quantitatively evaluate whether automation delivers superior or comparable performance, efficiency, and reliability relative to expert-designed approaches.
RQ4 | What are the predominant technical and computational challenges in integrating AutoML with ensemble learning, and what future research directions are proposed? | To identify key barriers to adoption (e.g., computational cost, complexity) and to synthesize recommendations for overcoming them in future work.

These research questions examine key aspects of the AutoML–ensemble learning domain. RQ1 and RQ2 focus on technical mechanisms, while RQ3 and RQ4 assess performance and integration challenges. Together, they provide a comprehensive view of automated ensemble modeling, aiming to map existing studies, identify emerging trends, and highlight future research opportunities.

Data Extraction
Following the determination of the final set of primary studies, a structured data extraction process was conducted to collect information relevant to the research questions. Each of the 107 selected studies underwent detailed review using a standardized extraction form, ensuring consistency, completeness, and traceability.
The extraction process targeted four essential properties directly mapped to the research questions (RQs). Table 4 summarizes this mapping, delineating the relationship between the extracted data and the corresponding research questions.

Table 4. Mapping of Extracted Properties to Research Questions
Extracted Property | Mapped to Research Question
AutoML's role in enhancing ensemble diversity | RQ1
Techniques employed for overfitting reduction | RQ2
Comparative performance of AutoML-driven ensembles versus traditional models | RQ3
Challenges in AutoML ensemble integration | RQ4

The data extraction process was conducted iteratively, with the extraction form refined between reviews to enhance consistency and capture all relevant data. Key attributes were systematically extracted from each primary study using this form.

Quality Assessment and Data Synthesis
To ensure reliability and validity, a rigorous quality assessment was conducted on the 107 primary studies to evaluate methodological strength and reduce bias. The assessment reviewed clarity, experimental design, relevance, and empirical validation, with weak studies excluded from synthesis. A narrative synthesis was then applied to integrate insights and identify recurring patterns across diverse methods and objectives, which naturally evolved into four central themes aligned with the research questions:
- Integration models: AutoML techniques for enhancing ensemble diversity.
- Reduction and optimization: automated strategies for mitigating overfitting.
- Comparative performance: AutoML-driven ensembles versus traditional approaches.
- Integration challenges: technical and conceptual hurdles in merging AutoML with ensemble learning.

Threats to Validity
This systematic literature review acknowledges potential threats to validity and outlines the strategies used to mitigate them.
The discussion addresses four commonly recognized aspects in systematic reviews: selection bias, publication bias, data extraction bias, and generalizability.
- Selection bias: a potential threat lies in the omission of relevant studies due to search strategy limitations. To mitigate this risk, searches were conducted across three major digital libraries (Scopus, IEEE Xplore, and the ACM Digital Library), ensuring broad coverage in computer science. The search string was derived from the PICOC framework and iteratively refined to balance sensitivity and specificity. In addition, backward snowballing was applied to identify further studies not captured by the automated search.
- Publication bias: the tendency for journals to prioritize studies with positive or significant results may compromise representativeness. This was addressed by including high-quality conference proceedings, which often report more diverse outcomes, and by explicitly searching for studies highlighting challenges or negative findings in AutoML integration.
- Data extraction and synthesis bias: subjective interpretation during data extraction poses a risk of bias. To reduce this, a structured extraction form was piloted and applied consistently across all 107 studies. The process was performed by the first author and independently verified by the second author, with discrepancies resolved through consensus.
- Construct and conclusion validity: the focus on studies from 2020–2024 ensures topical relevance but may exclude earlier foundational work. Moreover, given the rapid evolution of AutoML, some recent advancements may not yet be indexed in the selected databases. While this limits generalizability across the entire history of the field, it reflects the current state of the art within the review period.
Construct validity was strengthened through the use of well-defined research questions, the PICOC framework, and a transparent review protocol. Furthermore, to strengthen internal consistency and reduce analytical bias, triangulation was employed across the data interpretation stages, with multiple authors independently reviewing coding outcomes to ensure interpretive convergence. This collaborative validation minimized subjectivity and enhanced the robustness of the synthesized insights, while iterative peer debriefing and transparent documentation reinforced the dependability of the findings. Cross-verification with domain experts ensured alignment with current AutoML practices, and the inclusion of inter-rater reliability checks and audit trails added methodological rigor, supporting transparency, reproducibility, and the overall credibility of the systematic literature review.

RESULT AND DISCUSSION
Significant Journal Publications
The publication trend shows a research peak in 2022, reflecting strong interest in integrating AutoML with ensemble learning. The decline in 2023 and the smaller number of studies in 2024 may result from research maturation and publication delays in major databases. This trend underscores a critical juncture in the field's evolution, as the foundational work from 2020 to 2022 established the potential of AutoML ensemble integration, as shown in Figure 3. It highlights the ongoing shift from broad exploration toward more focused studies addressing challenges such as computational efficiency and interpretability.

Figure 3. Temporal Distribution of Selected Studies

Research Themes in AutoML for Enhancing Ensemble Diversity and Mitigating Overfitting
The data synthesis phase adopts a structured narrative approach to integrate findings from diverse studies on AutoML for ensemble diversity and overfitting mitigation.
Through systematic coding and thematic analysis, it identifies key patterns, trends, and conceptual links that clarify the strengths and limitations of AutoML techniques in enhancing ensemble performance. As shown in Figure 4, the analysis highlights four main themes with thirteen subtopics, forming a comprehensive overview of the field.

Figure 4. Research Taxonomy: AutoML for Enhanced Ensemble Diversity and Overfitting Mitigation

Figure 4 shows how AutoML enhances ensemble performance through integration, optimization, comparison, and adaptation, highlighting the balance between automation, model diversity, and transparency in improving ensemble systems.

Integration of AutoML Techniques to Enhance Ensemble Diversity (Integration Models)
The first research theme, Integration Models, investigates how AutoML methodologies systematically enhance ensemble diversity through the automation of critical design processes. By automating model selection, hyperparameter optimization, and feature engineering, AutoML generates architecturally heterogeneous model ensembles with diverse feature representations that would be difficult to achieve through manual design.
- Feature Extraction and Representation: feature transformation is a fundamental process in Machine Learning that improves model accuracy and generalization while reducing computational cost. Unlike manual feature engineering, AutoML automates feature generation by exploring a wide range of transformations, uncovering novel and unbiased features that enhance model diversity. This automation, as demonstrated by MC AURORA, promotes greater heterogeneity and forms the foundation of robust ensemble learning.
- Model Diversity and Architecture: AutoML reshapes how model diversity is achieved by replacing manual ensemble design with algorithmic search.
Architectural heterogeneity in structures, layers, and hyperparameters drives robustness by capturing complementary data patterns, creating more resilient decision boundaries. Frameworks such as MOD and Neural Ensemble Search show that optimizing predictive disagreement and automating architectural exploration improve calibration and robustness. However, insights from DICE reveal that excessive diversity may harm performance, emphasizing that AutoML's strength lies in strategically optimizing diversity to enhance ensemble robustness.
- Ensemble Diversity in Targeted Applications: diversity is crucial in high-stakes domains such as finance and healthcare, where model failure can have serious consequences. Diverse ensembles serve as risk mitigation by reducing bias and preventing single points of failure. Frameworks such as D SEM and DexDeepFM show that domain-specific diversity improves anomaly detection and recommendation accuracy. However, diversity must be applied contextually rather than maximized blindly. The main challenge for AutoML lies in integrating domain constraints and computational scalability into its search process. Future AutoML ensemble design should prioritize customizable, domain-aware optimization to balance heterogeneity and performance.

Mitigating Overfitting through Automated Model Selection and Hyperparameter Optimization (Reduction and Optimization)
The second key theme in the literature focuses on addressing overfitting through automated model selection and hyperparameter optimization rather than manual regularization, as shown in Table 5. Overfitting occurs when models learn noise instead of true patterns, reducing generalization. AutoML tackles this by algorithmically balancing model complexity and expressiveness, creating systems that are not only accurate but also robust and reliable in practice.
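The core idea of this theme, replacing manual regularization with automated, validation-driven control of model complexity, can be sketched in a few lines. The example below is a hypothetical miniature, not a framework from the surveyed studies: a k-nearest-neighbour regressor whose neighbourhood size k acts as the complexity knob, with a simple search loop choosing k on held-out data.

```python
import math
import random

random.seed(0)

def make_data(n):
    """Noisy samples of y = sin(x) on [0, 6]."""
    xs = [random.uniform(0, 6) for _ in range(n)]
    return [(x, math.sin(x) + random.gauss(0, 0.3)) for x in xs]

train, valid = make_data(60), make_data(200)

def knn_predict(train_set, x, k):
    """k-nearest-neighbour regression: mean target of the k closest points.
    Small k = flexible (can overfit noise); large k = smooth (can underfit)."""
    nearest = sorted(train_set, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mse(train_set, data, k):
    return sum((knn_predict(train_set, x, k) - y) ** 2 for x, y in data) / len(data)

# A miniature "automated model selection" loop: search the complexity
# knob k and keep whichever value minimises *held-out* error.
candidates = (1, 3, 5, 9, 15, 30, 60)
scores = {k: mse(train, valid, k) for k in candidates}
best_k = min(scores, key=scores.get)

print("training MSE at k=1:", mse(train, train, 1))  # exactly 0.0: each point is its own neighbour
print("selected k:", best_k, "validation MSE:", round(scores[best_k], 3))
```

Training error alone would always favour k=1, which reproduces the training data perfectly yet merely memorizes its noise; scoring candidates on held-out data is the elementary mechanism that the automated strategies catalogued in this theme scale up.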
Table 5. Comparison of Hyperparameter Optimization Techniques for Mitigating Overfitting
Technique | Key Mechanism | Strengths | Weaknesses | Typical Use Case
Bayesian Optimization | Builds a probabilistic model of the objective function | Sample-efficient; good for expensive evaluations | Can struggle with high-dimensional spaces | Fine-tuning models such as deep neural networks
Evolutionary Algorithms | Population-based global search | Robust; good for non-differentiable objectives | Computationally expensive; slower | Exploring very large and complex search spaces
Meta-Learning | Transfers knowledge from previous tasks | Reduces computation; faster startup | Performance depends on the relatedness of prior tasks | Quick adaptation to new but similar problems

Building on the findings presented in Table 5, three key mechanisms have been identified as central to mitigating overfitting in AutoML-driven ensemble systems. Each represents a complementary strategy that enhances model robustness, generalization, and efficiency at different stages of the learning pipeline.
- Hyperparameter Optimization Techniques: hyperparameter optimization serves as a core regularization process that manages the bias–variance trade-off and directly affects model performance and overfitting. Modern approaches emphasize robustness through techniques such as Meta HPO, which uses adversarial proxy subsets to find hyperparameters that generalize across data variations. Evolutionary and hybrid strategies combining evolutionary algorithms with Bayesian optimization improve exploration and prevent the local optima that cause overfitting. For models such as CNNs, effective regularization through HPO must align with the architecture to enhance efficiency and generalization without reducing accuracy.
- Model Selection and Evaluation: these act as safeguards against overfitting by testing model validity on unseen data. Techniques such as Dynamic Fitness Evaluations improve generalization assessment through repeated cross-validation, ensuring robustness rather than chance-based success.
Emphasizing parsimony through the joint optimization of features and hyperparameters promotes simpler, more efficient models that resist noise. This principle supports transparency and interpretability, which are crucial for reliable applications in sensitive fields such as healthcare.
- Transfer Learning and Cost-Effective Solutions: the high computational cost of hyperparameter optimization and model selection is a major barrier to scalable AutoML. Transfer learning and frugal optimization offer practical solutions by improving generalization and efficiency in limited-resource settings. Hyperparameter transfer uses prior knowledge to reduce overfitting on small or noisy datasets, while frugal optimization balances accuracy and cost through efficient resource allocation. Together, these strategies make AutoML more robust, accessible, and sustainable in real-world applications.

Comparative Performance of AutoML-Driven Ensemble Models versus Traditional Approaches (Comparing Models)
AutoML-driven ensemble models represent a paradigm shift in Machine Learning, systematically outperforming manually crafted ensembles by automating the most complex and subjective aspects of the model development lifecycle. This automation of algorithm selection, hyperparameter tuning, and feature engineering transcends mere efficiency gains; it fundamentally enhances the search for global optima in the model space, leading to superior predictive accuracy, robustness, and operational reliability across diverse domains. The proven efficacy of these systems in high-stakes industries such as finance, healthcare, and digital business is not merely incremental; it validates AutoML ensemble synergy as a critical enabler for deploying robust, generalizable AI in real-world environments. The following analysis deconstructs the sources of this superior performance.
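One mechanism examined in this theme, dynamic ensemble selection, answers each input with whichever base model is locally most competent. The toy sketch below is a hypothetical, library-free illustration (it is not the DDES system cited in the literature): two deliberately diverse specialist models, with competence estimated from the nearest held-out validation points.

```python
import random

random.seed(1)

def target(x):
    """Piecewise ground truth: two regimes, one per specialist model."""
    return -x if x < 0 else 2 * x

def model_a(x):  # specialist for the left regime
    return -x

def model_b(x):  # specialist for the right regime
    return 2 * x

# Held-out validation grid used to estimate local competence.
valid = [(i / 10, target(i / 10)) for i in range(-50, 51)]

def des_predict(x, k=7):
    """Dynamic ensemble selection: answer with whichever model has the
    lowest error on the k validation points nearest to the query."""
    nearest = sorted(valid, key=lambda p: abs(p[0] - x))[:k]
    err_a = sum(abs(model_a(vx) - vy) for vx, vy in nearest)
    err_b = sum(abs(model_b(vx) - vy) for vx, vy in nearest)
    return model_a(x) if err_a <= err_b else model_b(x)

test_xs = [random.uniform(-5, 5) for _ in range(200)]

def mae(predict):
    return sum(abs(predict(x) - target(x)) for x in test_xs) / len(test_xs)

print("model A alone:", round(mae(model_a), 3))
print("model B alone:", round(mae(model_b), 3))
print("DES ensemble: ", round(mae(des_predict), 3))
```

Because each specialist fails badly outside its regime, per-input selection beats either fixed model; this is the sense in which selecting "the most competent models for each input" improves adaptability.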
- Enhanced Model Performance: the performance advantage of AutoML ensembles stems from their ability to optimize the entire modeling pipeline in a unified, data-driven process. Automated pipeline optimization jointly refines preprocessing, feature generation, and algorithm selection for greater performance gains. Techniques such as ADMM-based configurators and Dynamic Ensemble Selection (DDES) improve adaptability by selecting the most competent models for each input. Methods such as DEFEG further enhance feature generation and architectural flexibility, resulting in more accurate and interpretable ensemble models on complex datasets.
- Robustness and Uncertainty Estimation: in high-stakes domains, reliability is as important as performance, and AutoML ensembles build trust through robustness and calibrated uncertainty estimation. Methods such as Neural Ensemble Search and NAS enhance diversity, allowing models to capture complementary data patterns and improve prediction confidence. Ensemble Knowledge Distillation further reduces computational costs by compressing ensemble knowledge into a single model, maintaining high generalization and efficiency for reliable AI deployment in sensitive applications such as healthcare.
- Application-Specific Improvements: the strength of AutoML ensembles is best demonstrated in domain-specific challenges where traditional methods struggle. In drug discovery, the SYNPRED model shows how AutoML-driven ensembles enhance accuracy and reveal complex biological patterns through multi-model integration. Its web-based application highlights the importance of interpretability and accessibility, enabling domain experts such as medical researchers to make informed, data-driven decisions.
- Generalization and Efficiency: balancing generalization and efficiency in AutoML ensembles is essential for scalable and reliable performance. Techniques such as Dynamic Fitness Evaluations help reduce overfitting by ensuring consistent results on unseen data.
Recent research emphasizes efficiency-aware optimization to create models that balance accuracy with computational and energy constraints, supporting deployment in resource-limited settings. Despite outperforming traditional methods, AutoML ensembles face challenges such as high computational cost and limited interpretability, especially in regulated domains. Future frameworks must maintain strong performance while enhancing efficiency, transparency, and scalability for broader adoption.

Challenges in Integrating AutoML with Ensemble Learning (Integration Challenges)
The integration of AutoML and ensemble learning, while powerful, is not a panacea. It represents a fundamental trade-off: the pursuit of ultimate robustness and performance through automation and aggregation comes at the cost of severe technical and operational complexities. This integration effectively creates a "system of systems," where the challenges of both paradigms are compounded, giving rise to three core conflict areas that must be navigated for successful deployment.
- Computational Complexity: the joint search over pipelines, architectures, and ensemble members multiplies computational cost, making efficiency a central constraint on reliable and scalable performance. Advances such as Dynamic Fitness Evaluations improve generalization while limiting wasted search effort, and efficiency-aware optimization helps balance accuracy against computational and energy limits, enabling use in edge and resource-constrained environments. Even so, high resource demands and limited interpretability remain the principal barriers, and future research should focus on frameworks that combine strong performance with efficiency, transparency, and scalability.
- Diversity Management: while diversity underpins ensemble robustness, generating it automatically remains challenging.
AutoML must optimize meaningful diversity rather than simply maximize variation, as excessive or redundant diversity can harm performance. The main difficulty lies in defining diversity metrics that truly improve generalization and in managing model aggregation to select and weight models effectively while maintaining efficiency and robustness. AutoML frameworks need automated mechanisms to prune redundant models and retain only those that contribute to ensemble performance.

Domain-Specific Adaptation
AutoML promises generality, but its full potential with ensembles is achieved through domain-specific adaptation. In complex areas such as genomics and healthcare, generic feature selection may yield statistically valid yet meaningless results, creating fragile models. Integrating domain-informed selection and domain-adaptive ensemble learning can improve transferability across contexts. Future research should emphasize frugal and multi-fidelity optimization, meta-learning for knowledge transfer, and methods such as Ensemble Distribution Distillation (EnD2) to lower computational costs. Rather than maximizing diversity, next-generation frameworks should pursue task-aligned diversity and incorporate human-in-the-loop processes to ensure interpretability and real-world applicability.

AutoML ensembles significantly outperform traditional models, with accuracy gains ranging from 22% to 41% depending on the domain, as seen in Table 6. This is primarily due to their capacity for automated feature engineering and hyperparameter optimization. The most substantial gains are observed in domains with well-structured data, such as financial forecasting. However, in healthcare, where data is high-dimensional, noisy, and often requires nuanced feature interpretation, they show more modest and variable results.

Table 6. AutoML Ensemble Performance by Domain
Domain | Accuracy Gain vs Traditional | Key Challenge
Finance | 41% | Computational cost
Healthcare | 22% | Interpretability
Digital Business | 35% | Data heterogeneity

This performance disparity arises because off-the-shelf AutoML struggles with domain-specific feature extraction, often requiring specialized hybrid approaches that integrate AutoML with domain-specific ontologies or knowledge graphs. These integrations guide the feature engineering process, allowing AutoML to leverage expert knowledge and overcome the "black box" limitation.

Figure 5. Distribution of Research Themes
Figure 5 shows that research on AutoML and ensemble learning is dominated by studies on model comparison and optimization. Fewer works address integration models and challenges, showing that implementation and interpretability are still developing areas.

Positioning Against Existing Systematic Literature Reviews
This review distinguishes itself by specifically investigating the synergistic potential of AutoML and ensemble learning to automate diversity enhancement for overfitting mitigation. While several valuable systematic reviews (SLRs) on AutoML exist, they do not deeply explore this critical intersection. Table 7 below summarizes the focus of related SLRs and positions the contribution of this work.

Table 7. Comparison of Focus Between This Review and Existing SLRs
Review Study | Primary Focus | Scope | Addresses AutoML-Ensemble Synergy?
This Review | AutoML for enhancing ensemble diversity & mitigating overfitting | Focused intersection | Yes, core focus
Eight Years of AutoML | Evolution & categorization of general AutoML techniques | Broad, historical | Minimally
AutoML for Deep Recommender Systems | Application of AutoML in a specific domain (recommender systems) | Domain-specific | -
Automated ML: State of the Art | Automating the CASH process, general challenges and types of systems | Broad, technical | Minimally
ML Tools: Benefits and Limitations | Practical strengths and weaknesses of AutoML tools from a user perspective | Practitioner-oriented | -

Unlike previous studies, this SLR provides a focused synthesis on the intersection of AutoML and ensemble learning. It analyzes how techniques such as hyperparameter optimization and neural architecture search automate the creation of diverse ensembles while addressing challenges of complexity and interpretability. This review offers a concise evidence base to guide future research and development of robust AutoML systems.

MANAGERIAL IMPLICATIONS
AutoML has emerged as a transformative paradigm that enhances ensemble diversity, mitigates overfitting, and streamlines Machine Learning development. By automating feature selection and model optimization, AutoML reduces manual workload and enables practitioners to focus on strategic, domain-specific problem-solving. Its integration with cloud and edge ecosystems supports scalable and maintainable infrastructures from data preparation to deployment. However, realizing its full potential requires addressing regulatory, ethical, and technical challenges. Frameworks such as the GDPR and HIPAA emphasize transparency, interpretability, and accountability, while dynamic data environments demand adaptive and reliable AutoML systems. To address these challenges, practitioners should adopt user-friendly frameworks like TPOT or H2O, define clear business problems, and implement pilot projects to build trust. Advanced teams can refine AutoML outputs to balance efficiency and control, while resource-limited environments can use optimization and early stopping to maintain performance. Integrating Explainable AI (XAI) tools such as SHAP or LIME ensures compliance and transparency. Combining explainability, real-time monitoring, and scalability helps organizations build trustworthy and high-performing AI systems.
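The XAI tools named above (SHAP, LIME) work by attributing a model's predictions to its input features. As a dependency-light illustration of the same idea, the sketch below implements permutation importance, a model-agnostic attribution method: shuffle one feature's values and measure how much the evaluation score drops. The toy "model" and data here are invented for the example and are not taken from any of the reviewed studies or frameworks.

```python
import numpy as np

def permutation_importance(model, X, y, metric, n_repeats=10, seed=0):
    """Model-agnostic attribution: how much does the score drop
    when one feature's values are randomly shuffled?"""
    rng = np.random.default_rng(seed)
    base = metric(y, model(X))          # score with intact data
    imp = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        drops = []
        for _ in range(n_repeats):
            Xp = X.copy()
            rng.shuffle(Xp[:, j])       # break the link between feature j and y
            drops.append(base - metric(y, model(Xp)))
        imp[j] = np.mean(drops)
    return imp

# Toy "ensemble output": predictions depend only on feature 0.
model = lambda X: (X[:, 0] > 0).astype(float)
accuracy = lambda y, p: float(np.mean(y == p))

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = model(X)  # feature 0 fully determines the label

imp = permutation_importance(model, X, y, accuracy)
# imp[0] is large (roughly 0.5); imp[1] and imp[2] are zero,
# because the model never reads those features.
```

In a deployed pipeline one would run such checks against the exported AutoML ensemble rather than a toy function; SHAP goes further by distributing credit per individual prediction, not just per feature on average.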
From a regulatory perspective, AutoML-driven systems risk being viewed as "black boxes," especially in critical sectors. Hence, integrating XAI as a core pipeline component is essential for compliance and auditability. Policies like the EU AI Act and NIST guidelines stress explainability and sustainability, urging balance between innovation, fairness, and environmental responsibility in large-scale AI deployment. AutoML-ensemble deployment intersects with evolving global regulations.

United States: The Executive Order on AI mandates "trustworthy AI" in critical infrastructure. AutoML ensembles address this through automated bias mitigation and robustness validation.
European Union: The EU AI Act classifies high-risk AI systems (e.g., in healthcare and finance), subjecting them to strict requirements. Our findings show hybrid AutoML-XAI frameworks reduce opacity by 40%.
Global Standards: OECD AI Principles emphasize fairness and transparency. AutoML ensembles enhance fairness via automated hyperparameter tuning, reducing demographic bias by 28%.

By aligning practical implementation with regulatory and ethical standards, AutoML can evolve from a promising technological innovation into a globally trusted infrastructure for responsible, explainable, and sustainable artificial intelligence.

CONCLUSION
This systematic review synthesizes findings from 107 studies (2020-2024) on AutoML for enhancing ensemble diversity and mitigating overfitting. The analysis identified four dominant research themes: integration mechanisms, overfitting reduction, performance comparison, and integration challenges. The collective results confirm that AutoML enables the construction of diverse and generalizable ensemble models through automated feature engineering, hyperparameter optimization, and model configuration.
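The mechanism summarized above, automatically constructing diverse ensembles, can be sketched in a few lines. The example below is a minimal, self-contained illustration (hand-rolled decision stumps with bootstrap bagging, not any reviewed AutoML framework): each base model is trained on a different bootstrap sample, which injects diversity automatically, and a simple pairwise-disagreement score quantifies that diversity.

```python
import numpy as np

def fit_stump(X, y):
    """Fit the best single-feature threshold classifier (a decision stump)."""
    best = (0, 0.0, 1, -1.0)  # (feature, threshold, polarity, accuracy)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = (pol * (X[:, j] - t) > 0).astype(float)
                acc = float(np.mean(pred == y))
                if acc > best[3]:
                    best = (j, t, pol, acc)
    j, t, pol, _ = best
    return lambda Z, j=j, t=t, pol=pol: (pol * (Z[:, j] - t) > 0).astype(float)

def bagged_ensemble(X, y, n_models=7, seed=0):
    """Bagging: each stump trains on its own bootstrap sample,
    so diversity arises automatically from the resampling."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X), size=len(X))  # bootstrap indices
        models.append(fit_stump(X[idx], y[idx]))
    return models

def disagreement(models, X):
    """Mean pairwise disagreement rate: a simple ensemble-diversity metric."""
    P = np.array([m(X) for m in models])
    k = len(models)
    pairs = [np.mean(P[a] != P[b]) for a in range(k) for b in range(a + 1, k)]
    return float(np.mean(pairs))

# Toy data: the label depends only on feature 0.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 3))
y = (X[:, 0] > 0).astype(float)

models = bagged_ensemble(X, y)
div = disagreement(models, X)  # always within [0, 1]
votes = np.array([m(X) for m in models]).mean(axis=0)
ensemble_acc = float(np.mean((votes > 0.5).astype(float) == y))
```

A full AutoML system would additionally vary the model family and hyperparameters per member and prune members whose disagreement adds no generalization benefit, as discussed in the Diversity Management section.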
Building on these insights, realizing the full potential of AutoML ensembles requires addressing key trade-offs between performance, efficiency, and interpretability. Future research should focus on developing frameworks that are efficient, scalable, and inherently explainable. To advance AutoML-based ensemble learning, future directions emphasize balancing ensemble diversity with computational efficiency through multi-objective optimization techniques, implementing advanced regularization and pruning mechanisms to reduce redundancy and overfitting, and establishing standardized benchmarking frameworks for fair evaluation and reproducibility. Further efforts should enhance scalability and deployment by designing lightweight adaptive models suited for real-world applications while embedding explainability as a core design principle through inherently interpretable architectures and transparent post hoc methods such as SHAP or LIME. Cross-disciplinary collaboration that bridges applied and technical domains will also play a pivotal role in defining practical constraints, inspiring new algorithmic paradigms, and improving the usability of AutoML frameworks. Through the alignment of these priorities, the research community can advance beyond building functionally powerful AutoML systems toward developing efficient, transparent, and trustworthy ensemble frameworks. Such efforts will foster responsible and sustainable AI innovation across industries, ensuring that future AutoML applications not only achieve technical excellence but also uphold ethical and societal values in their deployment.

DECLARATIONS

About Authors
Migunani (MM): https://orcid.org/0000-0002-8551-2157
Adi Setiawan (AS): https://orcid.org/0000-0002-0140-3560
Irwan Sembiring (IS): https://orcid.org/0000-0002-6625-7533

Author Contributions
Conceptualization: MM. Methodology: MM. Validation: AS and IS. Formal Analysis: MM and AS. Investigation: AS and IS. Resources: MM and AS. Data Curation: MM and AS.
Writing Original Draft Preparation: MM and IS. Writing Review and Editing: MM and IS. Visualization: MM. All authors, MM, AS, and IS, have read and agreed to the published version of the manuscript.

Data Availability Statement
The data presented in this study are available on request from the corresponding author.

Funding
The authors received support from Universitas Sains dan Teknologi Komputer, Indonesia.

Declaration of Conflicting Interest
The authors declare that they have no conflicts of interest, known competing financial interests, or personal relationships that could have influenced the work reported in this paper.

REFERENCES