Advanced Journal of STEM Education, Vol. 3 No. 1 (2025)

https://doi.org/10.31098/ajosed.v3i1.3375
Research Paper

Application of Machine Learning in Ethnomusicology Education under the
Belt and Road Initiative
Li Rong*1, Abu Bakar Modh Sheikh2
1 Jining Normal University, China
2 Universiti Selangor, Malaysia

Received : February 28, 2025

Revised : May 18, 2025

Accepted : May 23, 2025

Online : May 31, 2025

Abstract
The integration of machine learning has transformed the landscape of music education. To address the
challenges of assessment accuracy and personalized instruction in ethnic music education along the "Belt and
Road Initiative" regions, a sophisticated teaching recommendation system was developed, combining
collaborative filtering with advanced clustering techniques. This framework demonstrates adaptability across
diverse cultural contexts while ensuring pedagogical effectiveness in varied learning settings. The system's
recommendation engine synthesizes explicit user feedback with implicit behavioral metrics to generate a holistic
profile of student preferences across cultural boundaries. By integrating Canopy and K-Means clustering
algorithms through an optimized fusion approach, the system achieves precise modeling of musical learning
preferences. This two-tiered clustering strategy optimizes computational efficiency while maintaining
recommendation quality, particularly beneficial in settings with limited technological resources. The platform
continuously calibrates feature importance based on cultural relevance and educational goals, ensuring
culturally sensitive and pedagogically effective recommendations. Empirical testing demonstrates the enhanced
algorithm's superior performance in precision, recall, and coverage compared to conventional approaches. The
system achieves a parallel speedup ratio of 3.326 across four nodes, showing robust scalability for
extensive educational applications. This innovation facilitates personalized learning experiences in ethnic music
education throughout the Belt and Road nations while fostering cultural preservation and exchange. The system
successfully bridges traditional musical heritage with contemporary educational techniques, supporting both
conservation and advancement in ethnic music pedagogy.
Keywords: Machine Learning, Ethnic Music, Educational Application, Collaborative Filtering, Cluster Analysis

INTRODUCTION
The Belt and Road Initiative has created unprecedented opportunities for preserving and
sharing ethnic musical traditions among participating nations. Conventional approaches to ethnic
music education face challenges in delivering personalized instruction and implementing robust
evaluation frameworks. Machine learning technologies emerge as promising solutions to these
educational challenges. For example, Dongfang (2023) demonstrated how machine learning
technologies can be effectively applied in classical music education, offering insights into how
similar approaches may be adapted to support the personalized and evaluative needs of ethnic
music education. By implementing intelligent recommendation systems that analyze user
interactions, educators can now deliver targeted educational content and conduct real-time
assessment of learning outcomes. The field of education has witnessed a surge in artificial
intelligence applications, particularly as machine learning technology continues to advance. In the
context of ethnic music education, these technologies enable the creation of individualized learning
frameworks through comprehensive analysis of student interaction data, substantially enhancing

Copyright Holder:
© Li & Abu. (2025)
Corresponding author’s email: 458242203@qq.com

This Article is Licensed Under:

Adv. J. STEM. Ed.

pedagogical effectiveness. These technological innovations address multiple critical aspects of
traditional music education. They ensure cultural sustainability through digital preservation of
endangered musical practices. They promote intercultural dialogue by highlighting connections
and distinctions between various musical expressions. They enable customized educational
journeys that acknowledge individual learning preferences and cultural heritage. The confluence of
machine learning and ethnomusicological studies yields sophisticated tools for examining musical
patterns across cultural boundaries. Advanced algorithms can identify distinct characteristics
within ethnic compositions, including subtle tonal variations, complex rhythmic patterns, and
culture-specific embellishments. This computational methodology enhances traditional musical
analysis while broadening access to diverse musical heritage. Through the application of
collaborative filtering and clustering methodologies, smart recommendation systems can guide
students toward relevant musical traditions beyond their cultural familiarity, cultivating broader
appreciation for global musical diversity while maintaining cultural integrity. This technological
framework advances the cultural exchange objectives of the Belt and Road Initiative while
safeguarding the unique attributes of each musical tradition.
LITERATURE REVIEW
The literature review serves as the theoretical core of music education research, examining
how other researchers have approached machine learning applications in ethnic music education.
A comprehensive review reveals that while artificial intelligence applications in education continue
to advance, traditional music education faces challenges in delivering personalized instruction and
implementing robust evaluation frameworks. Previous studies have highlighted the potential of
machine learning technologies in creating individualized learning frameworks through a
comprehensive analysis of student interaction data, significantly enhancing pedagogical
effectiveness. In line with this, Cui and Chen (2024) introduced a novel learning framework for
vocal music education that leverages convolutional neural networks and pluralistic learning
strategies, showcasing the effectiveness of AI-driven models in enhancing personalized instruction
and adaptive learning processes in music education. These technological innovations address
multiple critical aspects: ensuring cultural sustainability through digital preservation, promoting
intercultural dialogue, and enabling customized educational journeys.
Building on this literature, the study adopts a theoretical framework that integrates four
strands. Collaborative filtering theory supplies the core recommendation mechanism: content is
personalized by analyzing similarities in user behavior and the preference associations between
users. Cluster analysis theory supports the segmentation of user groups and the identification of
musical features, with a two-layer strategy combining Canopy and K-Means used to refine user
profiles. Cross-cultural learning theory guides the system's adaptive design for the multicultural
setting of the Belt and Road Initiative, so that recommended content is culturally sensitive and
educationally appropriate. Personalized recommendation system theory integrates these elements
into an intelligent, personalized learning support system for ethnic music education.
Recent advancements such as NeuralPMG, a neural polyphonic music generation system
developed by Colafiglio et al. (2024), further demonstrate how machine learning algorithms can be
harnessed to generate culturally rich and musically coherent outputs, reinforcing the role of AI in
supporting personalized and culturally aware music education.
RESEARCH METHOD


The research methodology implements a Browser/Server structural design with three
primary layers: data collection, algorithmic processing, and user interface presentation. The system
operates on a Hadoop distributed computing environment, utilizing the MapReduce framework for
parallel data processing. A specialized acoustic analysis module extracts key musical parameters
including pitch variations, rhythmic patterns, and timbral characteristics. The data collection layer
captures user interaction metrics, while the algorithmic processing layer employs an enhanced
recommendation system combining Canopy and K-Means clustering with collaborative filtering.
The presentation layer serves customized ethnic music educational content based on system
recommendations.
FINDINGS AND DISCUSSION
Based on a literature review and the construction of a theoretical framework, this study
proposes three core hypotheses. The first hypothesis is that the improved collaborative filtering
algorithm, by integrating cultural distance parameters and cross-cultural similarity metrics, is
significantly superior to the traditional collaborative filtering method in terms of the accuracy of
folk music recommendation. The second hypothesis is that the clustering algorithm combined with
cultural factors can better identify the user's cultural background and music preference patterns,
thereby improving the adaptability and recommendation quality of the recommendation system in
a multicultural environment. The third hypothesis is that the hybrid recommendation strategy
combining Random New-N and Most Popular-N can effectively alleviate the cold start problem
faced by new users and provide culturally relevant and educationally valuable folk music
recommendation content for learners who use the system for the first time.
System Architecture Design Based on Machine Learning for Ethnic Music Education
Overall System Architecture
The platform implements a Browser/Server (B/S) structural design, organized into three
primary layers: data collection, algorithmic processing, and user interface presentation. The data
collection layer captures user interaction metrics, including music streaming history, saved
selections, and user evaluations (White, 2025). The algorithmic processing layer employs an
enhanced recommendation system combining Canopy and K-Means clustering with collaborative
filtering, analyzing historical user interactions to generate comprehensive user profiles. The
presentation layer serves customized ethnic music educational content based on the system's
recommendations (Xiao, 2024). The system architecture, depicted in Figure 1, operates on a
Hadoop distributed computing environment, leveraging the MapReduce framework for parallel
data processing at scale. A Timer Task scheduler component manages daily recommendation
updates, ensuring content relevancy. The platform features a specialized acoustic analysis module
for ethnic music education, extracting key musical parameters including pitch variations, rhythmic
patterns, and timbral characteristics from audio data, enriching the recommendation algorithm's
feature set. The system's security framework incorporates layered authentication protocols to
ensure data privacy protection, while load distribution mechanisms maintain operational stability.
The API design follows RESTful principles, enabling smooth integration with external educational
systems for resource optimization. The system's modular architecture supports future expansion
and maintenance efficiency, while its distributed computing framework enables high-performance
processing of extensive educational datasets.


Figure 1. Architecture of the Ethnic Music Education System
The platform's architecture consists of four primary components: data acquisition,
characteristic analysis, recommendation processing, and interface delivery. The data acquisition
component leverages diverse APIs to capture ethnic musical content, spanning instrumental
performances, vocal methodologies, and cultural variations from nations along the Belt and Road.
This component utilizes an automated data harvesting system built with Python's Scrapy
framework to collect musical metadata and audio content from authorized cultural repositories.
The characteristic analysis module implements a CNN-based deep learning architecture to process
audio inputs, detecting cultural-specific elements including rhythmic signatures, melodic
progressions, and tonal characteristics across ethnic traditions.
The recommendation processing component combines collaborative and content-driven
methodologies. It leverages Hadoop's Mahout framework for distributed computation, facilitating
efficient analysis of extensive user interaction datasets. The system implements a hybrid approach
merging K-Means clustering with Canopy preliminary clustering to identify users with comparable
learning behaviors (Zhang et al., 2024). A dynamic optimization system continuously refines
recommendation parameters based on user interaction and educational outcomes. The platform
maintains independent matrices for direct user ratings and behavioral indicators, deriving the
latter from engagement metrics such as playback duration and repetition patterns. The interface
delivery module, constructed using React.js technology, delivers an engaging educational platform
featuring dynamic audio visualization and performance metrics. Users access a configurable
interface displaying learning achievements, suggested practice content, and cultural annotations
for musical selections. An embedded evaluation system processes student input through MIDI
devices and audio recordings, offering immediate guidance on rhythmic precision and tonal
accuracy. The platform incorporates social learning features, enabling performance sharing and
community feedback.
Data Pipeline Planning
The data architecture begins with Apache Kafka, which manages continuous data streams and
processes music interaction metrics and behavioral analytics in real time (Zhu, 2024). This
distributed platform ensures resilient processing of large-scale data flows with reduced latency,
crucial for capturing subtle patterns in user engagement with diverse ethnic musical content.
Custom-developed Kafka connectors are optimized for processing musical data formats and
cultural metadata. Initial data refinement utilizes Spark Streaming, employing median imputation
for missing data points and Interquartile Range methodology for outlier identification. This process
incorporates culturally-aware cleaning protocols to maintain musical authenticity while
eliminating technical noise. Custom audio processing algorithms are implemented for various
traditional instrument categories. Refined data is persisted in a MongoDB cluster, selected for its
adaptability in managing diverse musical metadata and user information. The database
architecture employs nested document structures to maintain cultural connections and musical
heritage across Belt and Road regions.
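The median imputation and Interquartile Range screening described above can be illustrated with a minimal sketch that uses only Python's standard library. The sample playback durations, the function names, and the conventional 1.5 x IQR fence are assumptions made for illustration; the paper does not show its Spark Streaming implementation.

```python
from statistics import median, quantiles


def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return the indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical playback durations in seconds; None marks a missing reading.
playback_seconds = [30.0, 42.0, None, 38.0, 35.0, 900.0, 40.0]
clean = impute_median(playback_seconds)
flagged = iqr_outliers(clean)
```

In this toy series the missing reading is filled with the median of the observed values, and only the 900-second entry is flagged. Whether a flagged value is dropped or kept for review is a policy choice that the source does not specify.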
Data processing follows multiple phases, coordinated through Apache Airflow workflows.
Structured DAGs (Directed Acyclic Graphs) coordinate the sequence of cultural feature extraction,
user behavior analysis, and recommendation computation. User engagement analysis employs
sliding window techniques to identify temporal learning patterns. Sophisticated recognition
systems track skill progression across various musical traditions and learning trajectories. Data
normalization employs z-score methods, while Principal Component Analysis reduces
dimensionality for computational efficiency. Cultural significance factors are preserved during
feature reduction. The platform implements DVC (Data Version Control) for model versioning to
ensure reproducibility (Hou, 2024). This framework maintains consistent performance across
cultural contexts and facilitates systematic algorithmic enhancement.
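As a hedged illustration of two steps named above, the sketch below computes sliding-window averages over a series of engagement values and then z-score normalizes the series. The daily practice minutes, the window length of three, and the function names are hypothetical; Principal Component Analysis and the Airflow orchestration are outside the scope of this sketch.

```python
from statistics import mean, pstdev


def sliding_means(values, window, step=1):
    """Average each consecutive window of `window` values, advancing by `step`."""
    last_start = len(values) - window
    return [mean(values[i:i + window]) for i in range(0, last_start + 1, step)]


def zscore(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical minutes of practice per day for one learner.
daily_minutes = [10, 12, 15, 30, 28, 26, 5]
trend = sliding_means(daily_minutes, window=3)
scaled = zscore(daily_minutes)
```

The windowed averages smooth day-to-day noise so that a sustained rise or drop in practice time is easier to see, while the z-scores place features with different units on a comparable scale before any dimensionality reduction.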
The infrastructure includes specialized ETL processes for cultural metadata, documenting
interconnections between musical traditions and their historical foundations. A Lambda
architecture processes analytics, combining historical batch analysis with real-time stream
processing. Redis caching accelerates frequent data access patterns, enhancing recommendation
delivery. Data integrity is maintained through automated validation systems, including schema
verification and cultural authenticity checks. Data preservation employs incremental S3 backups,
optimizing storage efficiency. Prometheus and Grafana enable continuous performance monitoring,
tracking both pipeline efficiency and processing constraints.
Implementation of an Intelligent Recommendation Algorithm for Ethnic Music Education
Recommendation Mechanism Based on Improved Collaborative Filtering
To meet the specialized needs of ethnic music education recommendations across Belt and
Road nations, the enhanced collaborative filtering system is designed to capture user preferences
within multicultural learning environments. The framework synthesizes various user interaction
metrics, including streaming patterns, saved content selections, and download activities, combining
these behavioral indicators with region-specific cultural elements to generate detailed rating
matrices (Fang, 2025). For calculating similarities between users across different cultural
backgrounds, the system employs the Pearson correlation coefficient, which can be mathematically
represented as:

$$
\mathrm{sim}(x, y) = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i} (x_i - \bar{x})^2} \, \sqrt{\sum_{i} (y_i - \bar{y})^2}}
$$

In this mathematical framework, $x_i$ and $y_i$ denote the individual ratings assigned by users x
and y, representing distinct cultural perspectives, to ethnic music piece i, and $\bar{x}$ and
$\bar{y}$ are the two users' mean ratings. The system calculates predicted user preferences for
particular ethnic music selections through a weighted averaging approach, which is mathematically
expressed as:

$$
r_{ij} = \frac{\sum_{u} \mathrm{sim}_{ui} \, r_{uj}}{\sum_{u} \mathrm{sim}_{ui}}
$$
In this formulation, $\mathrm{sim}_{ui}$ represents the intercultural similarity measure between user u and
reference user i, while $r_{uj}$ indicates user u's evaluation of ethnic music selection j. To tackle the
initialization challenges in ethnic music education, the platform employs an advanced hybrid
recommendation framework that combines Random New-N and Most Popular-N methodologies,
utilizing ethnic music category weightings to generate culturally relevant suggestions for first-time
users. The system further incorporates a cultural distinction parameter λ(0<λ≤1) to adaptively
modify similarity calculation weightings between users from varying cultural backgrounds. The
algorithm integrates a cultural resonance factor α across different ethnic music categories,
balancing the distribution between regional and international ethnic music recommendations,
thereby promoting cross-cultural musical exploration and understanding. The framework's design
incorporates advanced cultural awareness features and self-adjusting learning mechanisms,
optimizing recommendation effectiveness across multiple cultural environments (Mobile
Computing Wireless Communications, 2023). Empirical testing confirms that this enhanced
recommendation approach maintains high accuracy levels while substantially improving user
engagement with diverse cultural musical traditions.
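The Pearson similarity and similarity-weighted prediction described in this section can be sketched in a few lines of standard-library Python. Several choices here are assumptions rather than details from the paper: means are taken over co-rated items only, only positively correlated neighbours contribute to a prediction, and the cultural distinction parameter λ is applied as a simple multiplier on similarities between users with different (hypothetical) culture labels. The user names and ratings are invented for illustration.

```python
from math import sqrt


def pearson(ratings_x, ratings_y):
    """Pearson correlation over the items both users rated (dicts: item -> rating)."""
    common = sorted(set(ratings_x) & set(ratings_y))
    if len(common) < 2:
        return 0.0
    mean_x = sum(ratings_x[i] for i in common) / len(common)
    mean_y = sum(ratings_y[i] for i in common) / len(common)
    num = sum((ratings_x[i] - mean_x) * (ratings_y[i] - mean_y) for i in common)
    den_x = sqrt(sum((ratings_x[i] - mean_x) ** 2 for i in common))
    den_y = sqrt(sum((ratings_y[i] - mean_y) ** 2 for i in common))
    return num / (den_x * den_y) if den_x and den_y else 0.0


def predict(target, item, ratings, cultures, lam=1.0):
    """Similarity-weighted average of neighbours' ratings for `item`.

    `lam` (0 < lam <= 1) down-weights neighbours whose culture label differs
    from the target's; this use of lambda is an assumption of the sketch.
    """
    num = den = 0.0
    for user, user_ratings in ratings.items():
        if user == target or item not in user_ratings:
            continue
        sim = pearson(ratings[target], user_ratings)
        if cultures[user] != cultures[target]:
            sim *= lam
        if sim > 0:  # assumption: ignore uncorrelated and negatively correlated users
            num += sim * user_ratings[item]
            den += sim
    return num / den if den else None


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ana": {"guqin": 5, "dombra": 3, "gamelan": 4},
    "bo": {"guqin": 4, "dombra": 2, "gamelan": 3, "oud": 5},
    "cem": {"guqin": 2, "dombra": 5, "gamelan": 1, "oud": 2},
}
cultures = {"ana": "A", "bo": "B", "cem": "A"}
predicted_oud = predict("ana", "oud", ratings, cultures, lam=0.5)
```

In this toy data, "bo" rates the shared pieces in the same relative order as "ana" while "cem" rates them in the opposite order, so the prediction for "ana" on the unrated piece is driven by "bo" alone.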
Feature Extraction Based on Cluster Analysis
To accommodate the varied nature of ethnic music traditions across Belt and Road nations,
the platform employs a combined clustering strategy utilizing both Canopy and K-Means algorithms
for musical feature detection. The detection framework places particular emphasis on traditional
elements such as modal frameworks, rhythm signatures, and instrumental ensembles unique to
each cultural tradition. The process begins by applying the Canopy algorithm for initial cluster
formation, defining preliminary cluster centers using culturally-informed dual threshold
parameters T1>T2. Elements are assigned to a canopy when their cultural distance from its center
falls under T1, and are removed from the pool of candidate centers when this distance is also below T2. The
framework incorporates an ethnic music characteristic weighting vector W=(w1,w2,...,wn), with
individual components reflecting the educational relevance of specific musical attributes in
multicultural instruction. The K-Means clustering refinement process operates through the
optimization of the following weighted objective function:

$$
J = \sum_{i=1}^{k} \sum_{p \in C_i} W \, \lVert p - \mu_i \rVert^2
$$
In this framework, $\mu_i$ identifies the central point of the ith ethnic music grouping, while $C_i$
encompasses the complete set of the ith cluster. The feature detection system incorporates
chronological elements through a temporal weighting mechanism f(t), identifying patterns of
cultural integration in musical development. The system utilizes a dynamic feature extraction
framework calibrated for local musical traditions, with feature importance automatically modified
according to regional cultural specifications. This dual-stage clustering methodology,
incorporating cultural heritage factors, both enhances the precision of user characteristic
identification across varied cultural contexts and establishes a methodological basis for structuring
musical educational content (Tabrez et al., 2022). The platform's structure incorporates flexible
cultural variables and progressive musical traditions, maintaining consistent effectiveness in
multicultural learning environments.
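The Canopy pre-clustering step described in this section can be sketched as follows. This is a generic version of the algorithm under stated assumptions: plain Euclidean distance stands in for the paper's cultural distance measure, which is not specified, and the two-dimensional feature points are invented. The thresholds 0.8 and 0.3 are the values the paper reports for user segmentation.

```python
from math import dist


def canopy(points, t1, t2):
    """Group points into possibly overlapping canopies with thresholds t1 > t2.

    Returns a list of (center_index, member_indices) pairs.
    """
    if not t1 > t2:
        raise ValueError("t1 must be greater than t2")
    candidates = list(range(len(points)))
    canopies = []
    while candidates:
        center = candidates[0]
        # Loose threshold: every point within t1 of the center joins this canopy.
        members = [i for i in range(len(points))
                   if dist(points[center], points[i]) < t1]
        canopies.append((center, members))
        # Tight threshold: points within t2 can no longer become centers.
        candidates = [i for i in candidates
                      if dist(points[center], points[i]) >= t2]
    return canopies


# Hypothetical two-dimensional feature vectors.
points = [(0.0, 0.0), (0.1, 0.1), (0.5, 0.0), (2.0, 2.0), (2.1, 2.0)]
canopies = canopy(points, t1=0.8, t2=0.3)
centers = [center for center, _ in canopies]
```

Because a point may fall within T1 of several centers, canopies can overlap. The pass is cheap, and its centers give a data-driven estimate of the number of clusters and their starting positions for a later K-Means refinement.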
Clustering Optimization Based on Canopy and K-Means
In response to scalability concerns and sparse data distributions within ethnic music
educational platforms, a unified clustering enhancement strategy integrating Canopy and K-Means
approaches has been developed. The framework initiates with the Canopy algorithm application
for broad-scale user segmentation, utilizing an upper threshold T1=0.8 and lower threshold T2=0.3,
consolidating users exhibiting comparable musical preferences into collective Canopy groupings
(Waghmare & Sonkamble, 2020). These initial clustering outcomes provide optimized starting
points for subsequent K-Means processing, substantially reducing the convergence limitations
typically encountered with conventional K-Means random initialization methods. During
operational deployment, the platform establishes detailed user-music engagement frameworks,
constructing a comprehensive user-music evaluation matrix R. The calculation of user similarities
employs the Pearson correlation coefficient, which can be mathematically defined as:

$$
\mathrm{sim}(u, v) = \frac{\sum_{i} (r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i} (r_{u,i} - \bar{r}_u)^2} \times \sqrt{\sum_{i} (r_{v,i} - \bar{r}_v)^2}}
$$

Within this framework, $r_{u,i}$ signifies the evaluation given by user u for musical piece i, while
$\bar{r}_u$ represents the mean evaluation value across all music selections by user u. Utilizing this
similarity framework, the system executes multilevel clustering of user groups. The computational
process implements continuous center refinement through recursive optimization, simultaneously
reducing within-cluster variance while enhancing between-cluster distinction. To accommodate
the distinctive aspects of ethnic music learning environments, the framework incorporates
weighted components spanning musical genres, geographical features, and performance
techniques into similarity assessments, ensuring cluster formations accurately capture students'
ethnic musical preferences. This advanced optimization strategy preserves clustering effectiveness
while substantially decreasing computational demands, delivering a reliable technical
infrastructure for customized ethnic music educational resource distribution.
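To illustrate how Canopy output can seed K-Means, the following sketch runs a small K-Means loop from fixed starting centers instead of random ones. It is a simplified illustration, not the paper's Hadoop/Mahout implementation. The seeds are hypothetical canopy centers, and the per-feature weights are one possible reading of the weighting vector W in the objective function, applied inside the squared distance.

```python
def weighted_sq_dist(p, q, weights):
    """Per-feature weighted squared Euclidean distance."""
    return sum(w * (a - b) ** 2 for a, b, w in zip(p, q, weights))


def kmeans(points, seeds, weights, max_iter=100):
    """K-Means started from the given seeds (for example, canopy centers)."""
    centers = [tuple(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(len(centers)),
                key=lambda c: weighted_sq_dist(p, centers[c], weights))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not move
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    objective = sum(weighted_sq_dist(p, centers[lab], weights)
                    for p, lab in zip(points, labels))
    return centers, labels, objective


# Hypothetical feature vectors and canopy-derived seeds.
points = [(0.0, 0.0), (0.1, 0.1), (0.5, 0.0), (2.0, 2.0), (2.1, 2.0)]
seeds = [(0.0, 0.0), (2.0, 2.0)]
centers, labels, objective = kmeans(points, seeds, weights=(1.0, 1.0))
```

Starting from seeds that already sit inside dense regions usually shortens convergence and avoids the poor local optima that random initialization can produce, which is the motivation the text gives for the two-stage design.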
Performance Evaluation Metrics Framework
To evaluate the effectiveness of ethnic music education recommendation platforms, a
comprehensive assessment framework incorporating multiple cultural parameters has been
developed. Extending beyond conventional accuracy measurements, the platform implements
cultural alignment evaluation methodologies, with precision determined through the following
formula:

$$
\mathrm{precision} = \frac{\sum_{u} |R(u) \cap T(u)| \, M_c}{\sum_{u} |R(u)|}
$$
In this formula, $M_c$ indicates the cultural correspondence factor. The recall measurement
similarly integrates cultural components:

$$
\mathrm{recall} = \frac{\sum_{u} |R(u) \cap T(u)| \, M_c}{\sum_{u} |T(u)|}
$$


Here, R(u) signifies the collection of ethnic music selections recommended to users, while
T(u) represents the set of music genuinely favored by users. The system integrates a cultural
diversity measurement, DC, to assess recommendation cultural range, alongside a cross-cultural
receptivity metric, Ca, evaluating recommendation acceptance across various cultural groups. For
a comprehensive assessment, the framework employs a learning efficacy parameter, Le, measuring
students' understanding and proficiency in music from diverse cultural backgrounds. To
accommodate large-scale deployment requirements within the Belt and Road initiative context, the
system implements processing efficiency metrics to evaluate parallel computation performance:
speedup = Ts/Tp. A holistic evaluation framework incorporating precision, diversity, cultural
adaptability, and operational efficiency has been developed:

$$
S = w_1 P + w_2 R + w_3 D + w_4 E
$$
In this formula, $w_1$ through $w_4$ represent the corresponding weighting factors. The
assessment methodology incorporates advanced cultural awareness indicators and comprehensive
performance measurements, enabling thorough evaluation of the platform's effectiveness in
multicultural music education environments. The framework balances quantitative efficiency
metrics with qualitative cultural considerations, delivering a comprehensive evaluation
methodology for ethnic music recommendation platforms. This integrated approach ensures that
both technical performance and cultural sensitivity are appropriately weighted in the final
assessment, while maintaining the system's ability to adapt to diverse educational contexts across
different regions. The methodology's flexibility allows for dynamic adjustment of evaluation
parameters based on specific cultural requirements and educational objectives, thereby ensuring
consistent and meaningful assessment across various implementation scenarios within the Belt and
Road initiative's educational framework.
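The metrics in this section can be worked through with a short sketch. Several points are assumptions: the cultural correspondence factor is treated as a single constant, sums run over all users in both numerator and denominator, R in the composite score is read as recall, and the recommendation sets, diversity value, and weights are hypothetical. The speedup of 3.326 on four nodes is the figure reported in the abstract.

```python
def culture_weighted_precision_recall(recommended, relevant, m_c=1.0):
    """Precision and recall over all users, scaled by a constant factor m_c."""
    users = list(recommended)
    hits = m_c * sum(len(recommended[u] & relevant.get(u, set())) for u in users)
    n_recommended = sum(len(recommended[u]) for u in users)
    n_relevant = sum(len(relevant.get(u, set())) for u in users)
    precision = hits / n_recommended if n_recommended else 0.0
    recall = hits / n_relevant if n_relevant else 0.0
    return precision, recall


def parallel_efficiency(speedup, nodes):
    """Efficiency is speedup (Ts / Tp) divided by the number of nodes."""
    return speedup / nodes


def composite_score(p, r, d, e, weights):
    """S = w1*P + w2*R + w3*D + w4*E."""
    return sum(w * m for w, m in zip(weights, (p, r, d, e)))


# Hypothetical recommendation and ground-truth sets for two users.
recommended = {"u1": {"a", "b", "c"}, "u2": {"d", "e"}}
relevant = {"u1": {"a", "c", "x"}, "u2": {"e"}}
precision, recall = culture_weighted_precision_recall(recommended, relevant, m_c=0.9)

efficiency = parallel_efficiency(3.326, nodes=4)  # speedup taken from the abstract
score = composite_score(precision, recall, 0.5, efficiency,
                        weights=(0.4, 0.3, 0.2, 0.1))
```

A speedup of 3.326 on four nodes corresponds to a per-node efficiency of about 0.83, meaning roughly 17% of the ideal linear gain is lost to coordination and communication overhead.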
Feature Extraction and Data Processing
Music Feature Vector Construction
The construction of musical characteristic vectors employs a comprehensive analytical
framework for ethnic music properties. The platform leverages the librosa library for audio signal
processing, extracting Mel-frequency cepstral coefficients (MFCCs) with 2048-sample analysis
frames and a 512-sample hop length, capturing the subtle tonal and rhythmic
characteristics of traditional instruments. The system operates within a 128-dimensional feature
space, encompassing fundamental acoustic properties and sophisticated musical attributes. Modal
analysis employs 12-bin chromagram processing to identify tonal structures unique to various
ethnic musical forms (Hou, 2023). The framework implements constant-Q transformation with
flexible window dimensions, accommodating the frequency ranges of traditional instruments from
the Belt and Road regions. Temporal characteristic extraction utilizes onset detection algorithms
optimized for traditional percussion instruments, implementing dynamic thresholds based on
localized signal intensity.
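The framing parameters above determine the time resolution of every frame-based feature. The sketch below works out the arithmetic for a 2048-sample frame and a 512-sample step between frames. The 22,050 Hz sample rate and the ten-second clip are assumptions, since the paper does not state them, and the frame count assumes no padding at the clip edges, which differs from the centered framing that some audio libraries apply by default.

```python
def frame_grid(n_samples, sample_rate, frame_length=2048, hop_length=512):
    """Frame count and timing for fixed-length, overlapping analysis frames."""
    if n_samples < frame_length:
        n_frames = 0
    else:
        n_frames = 1 + (n_samples - frame_length) // hop_length
    return {
        "n_frames": n_frames,
        "frame_ms": 1000 * frame_length / sample_rate,  # duration of one frame
        "hop_ms": 1000 * hop_length / sample_rate,      # step between frame starts
        "overlap": 1 - hop_length / frame_length,       # fraction shared by neighbours
    }


# Assumed sample rate and clip length.
sample_rate = 22050
grid = frame_grid(n_samples=10 * sample_rate, sample_rate=sample_rate)
```

Under these assumptions each frame spans about 93 ms and consecutive frames overlap by 75%, so a new feature vector is produced roughly every 23 ms. A shorter hop would resolve fast percussive onsets more finely at the cost of more frames to process.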
The feature detection system incorporates cultural context through specialized
ethnomusicological parameters. These include dedicated metrics for microtonal elements,
rhythmic structures, and ornamental features specific to distinct musical traditions. CNN-LSTM
neural networks process these characteristics, generating representative vectors that maintain
temporal progression and cultural significance. The framework utilizes a structured feature
repository, categorizing musical elements according to regional heritage, instrumental categories,
and performance methodologies. The vector generation component employs knowledge transfer
methodologies, leveraging pre-trained models from extensive ethnic music databases to enhance
feature detection accuracy for uncommon musical forms. An adaptive feature selection system
modifies the parameter set according to the specific requirements of different musical traditions,
ensuring culturally authentic representation. The platform supports instantaneous feature
extraction for live performance evaluation, providing real-time feedback during practice activities.
User Behavior Data Analysis
The user behavior analysis module implements comprehensive tracking of learning
interactions through a custom event logging system. Using Elasticsearch for real-time event
processing, the system captures detailed interaction metrics including practice session duration,
repetition patterns, tempo adjustments, and error correction behaviors. This granular tracking
enables fine-grained analysis of how learners from different cultural backgrounds approach diverse
ethnic musical traditions, with specialized event types designed for capturing culture-specific
learning behaviors. The implementation features a multi-level aggregation pipeline that processes
both explicit user actions and implicit engagement indicators. Advanced sentiment analysis
algorithms assess emotional responses to different musical styles, providing insights into
cross-cultural music appreciation patterns and potential barriers to engagement with unfamiliar
traditions. Session-based analysis employs sequence modeling techniques to identify learning
patterns and difficulties. The system utilizes LSTM networks trained on user interaction sequences
to predict learning trajectories and potential challenges. These neural networks incorporate
cultural embeddings that contextualize user interactions within their respective musical traditions,
allowing for culturally sensitive interpretation of learning behaviors. Practice patterns are analyzed
through a custom-developed difficulty assessment algorithm that considers both technical
complexity and cultural familiarity factors. This algorithm dynamically calibrates difficulty ratings
based on a learner's cultural background, recognizing that certain musical elements may be
intuitive in one tradition but challenging in another. The analysis framework incorporates cultural
context by weighting interaction metrics based on the student's background and prior exposure to
specific musical traditions. A sophisticated cultural distance calculator quantifies the conceptual
gap between a learner's native musical framework and target traditions, enabling adaptive learning
pathways that build appropriate cultural bridges (Zhang et al., 2024).
The behavior analysis system implements collaborative filtering techniques at multiple
granularity levels, from individual notes to complete musical phrases. This multi-scale approach
enables identification of micro-patterns in learning behavior across diverse ethnic musical
traditions while maintaining awareness of broader cultural frameworks. A custom-designed
engagement scoring algorithm evaluates user progress across different musical dimensions,
including rhythm accuracy, pitch control, and stylistic authenticity (Yin, 2024). The system employs
reinforcement learning to optimize practice recommendations based on historical learning
outcomes. Cultural exploration paths are optimized through adaptive reward functions that balance
comfort with challenge when introducing unfamiliar musical traditions. Real-time analytics
processes generate dynamic learning profiles, adapting to changes in user proficiency and learning
preferences (Zhu, 2024). These profiles incorporate cultural competence metrics across different
musical traditions, tracking growth in cross-cultural musical understanding. The system
implements A/B testing frameworks to evaluate the effectiveness of different learning strategies
and recommendation approaches. Interactive visualizations map student journeys across the
diverse musical landscapes of Belt and Road countries, highlighting connections between traditions
and individual learning pathways.

Data Preprocessing Methods
The data preprocessing pipeline implements comprehensive cleansing and standardization
processes designed for diverse music education datasets. Audio signal preprocessing employs
adaptive noise elimination algorithms with spectral subtraction, specifically calibrated for various
traditional instruments. The framework utilizes a multi-phase filtering approach that maintains
culturally important acoustic elements while eliminating recording artifacts and background noise.
Missing data management incorporates field-specific estimation techniques based on musical
context. For continuous attributes such as tempo and dynamics, the system applies Gaussian
Process Regression to predict missing values while preserving musical coherence. Categorical
attributes related to cultural elements are processed through a specially developed hierarchical
estimation framework that accounts for regional and stylistic connections.
The preprocessing module contains dedicated routines for managing multilingual metadata,
applying Unicode standardization, and culture-specific text handling rules. Feature normalization
methods are selectively chosen based on data distribution properties, with robust scaling
implemented for attributes exhibiting heavy-tailed distributions. The framework employs
automated quality verification mechanisms, including outlier identification and validation against
a curated repository of cultural music characteristics. Data augmentation techniques are
applied to mitigate class imbalance in underrepresented musical traditions. The
system applies culturally informed transformations, including tempo modification, pitch
adjustment, and instrumental timbre alteration, while maintaining authenticity. A distributed
preprocessing workflow leverages Apache Spark for effective handling of extensive music datasets,
with specific optimization for both batch and streaming data processing contexts.
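As an illustration of the robust scaling mentioned above, the following minimal sketch centers a feature on its median and divides by the interquartile range, so that a single extreme value does not dominate the scale. The function name `robust_scale` and the sample tempo values are hypothetical and are not taken from the system described in this paper.

```python
from statistics import median, quantiles
from typing import List, Sequence


def robust_scale(values: Sequence[float]) -> List[float]:
    """Scale values as (x - median) / IQR, which outliers barely influence."""
    # quantiles() with n=4 returns the three quartile cut points.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    center = median(values)
    if iqr == 0:
        # Degenerate case: the middle half of the data is constant.
        return [0.0 for _ in values]
    return [(v - center) / iqr for v in values]


# Hypothetical tempo values (beats per minute) with one extreme recording.
tempos = [60, 70, 80, 90, 400]
print(robust_scale(tempos))  # [-1.0, -0.5, 0.0, 0.5, 16.0]
```

The outlier at 400 is still visible after scaling, but it no longer compresses the remaining values toward zero as min-max scaling would.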
System Implementation and Experimental Analysis
Experimental Environment and Dataset
The experimental environment deploys a distributed architecture, utilizing four virtual
machine servers (each equipped with a dual-core processor and 8GB RAM) hosted on a
physical PC workstation (featuring an eight-core processor and 64GB RAM, running macOS). The
software stack consists of the Ubuntu 20.04 64-bit operating system, with Hadoop 2.6.0-cdh5.7.0
functioning as the distributed computing platform, supported by apache-mahout-distribution-0.11.2
for algorithmic execution, while JDK 8u241 provides the runtime environment.
To reflect the attributes of Belt and Road ethnic music education, a representative
sample with distinctive ethnic music characteristics was selected from the Yahoo! R3 Music dataset,
comprising 15,400 individual users and 1,000 musical compositions, for a total of 311,704 rating
entries. The experimental dataset uses a tab-delimited triplet format consisting of a user
identifier (userid), a music identifier (songid), and a rating value (rates), with integer ratings
ranging from 1 to 5.
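The triplet format described above can be read with a few lines of code. The sketch below is illustrative only: the `Rating` type, the `parse_ratings` function, and the sample rows are assumptions made for this example, and the field names simply mirror the userid, songid, and rates columns named in the text.

```python
import csv
import io
from typing import Iterable, List, NamedTuple


class Rating(NamedTuple):
    userid: str
    songid: str
    rate: int


def parse_ratings(lines: Iterable[str]) -> List[Rating]:
    """Parse tab-delimited (userid, songid, rate) rows, skipping malformed ones."""
    ratings = []
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) != 3:
            continue  # blank or malformed line
        userid, songid, raw_rate = row
        rate = int(raw_rate)
        if not 1 <= rate <= 5:
            raise ValueError(f"rating out of range 1-5: {rate}")
        ratings.append(Rating(userid, songid, rate))
    return ratings


# Hypothetical sample rows in the same layout as the experimental dataset.
sample = io.StringIO("1\t14\t5\n1\t35\t1\n\n2\t14\t3\n")
for rating in parse_ratings(sample):
    print(rating)
```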
To guarantee experimental validity and statistical significance, the dataset was divided into
an 80% training component and a 20% testing component, with distributed storage and processing
enabled through HDFS implementation via hadoop fs commands. Throughout the experimental
procedure, the system collects extensive user behavioral information, including music playback
histories, download records, and collection preferences, maintained respectively in playrecord,
downloadrecord, and collectionrecord database tables. These behavioral metrics undergo
weighted computational conversion into unified rating indicators, providing the algorithm with
improved user preference data for more precise recommendations.
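One possible form of the weighted conversion from behavioral records to unified ratings is sketched below. The paper does not report the weights, so the values in `BEHAVIOR_WEIGHTS`, the clamping to the 1 to 5 range, and the function name `implicit_ratings` are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical weights: collecting a song is treated as a stronger signal
# than downloading it, which in turn is stronger than a single playback.
BEHAVIOR_WEIGHTS = {"play": 1.0, "download": 2.0, "collect": 3.0}


def implicit_ratings(
    events: Iterable[Tuple[str, str, str]], max_rating: int = 5
) -> Dict[Tuple[str, str], int]:
    """Sum weighted behavior events per (userid, songid), clamped to 1..max_rating."""
    scores: Dict[Tuple[str, str], float] = defaultdict(float)
    for userid, songid, behavior in events:
        scores[(userid, songid)] += BEHAVIOR_WEIGHTS[behavior]
    return {
        pair: max(1, min(max_rating, round(score)))
        for pair, score in scores.items()
    }


# Hypothetical events drawn from the play, download, and collection tables.
events = [
    ("u1", "s1", "play"),
    ("u1", "s1", "collect"),
    ("u1", "s2", "play"),
    ("u2", "s1", "download"),
    ("u2", "s1", "collect"),
    ("u2", "s1", "play"),
]
print(implicit_ratings(events))
```

Clamping keeps the derived values on the same 1 to 5 scale as the explicit ratings, so both sources can feed the same similarity computation.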
To ensure the reliability of the experimental data, this study verified the validity and
reliability of the Yahoo! R3 Music dataset. First, the content validity of the
ethnic music feature annotations was verified through expert evaluation to ensure the accuracy of
the cultural attribute classification. Second, the stability of the dataset division was
verified by repeated sampling: across multiple random divisions, the Pearson correlation
coefficients all exceeded 0.95. Finally, cross-validation was used to test the consistency of the
algorithm's performance. The standard deviation of the five-fold cross-validation results was
less than 0.02, indicating that the experimental data had good internal consistency and
reproducibility and could provide a reliable basis for subsequent algorithm performance
evaluation.
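The repeated random divisions and five-fold partitioning described above could be produced along the following lines. This is a sketch under stated assumptions: the function names `split_80_20` and `k_folds` and the use of a seeded shuffle are illustrative choices, not the procedure reported by the authors.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_80_20(
    records: Sequence[T], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle with a fixed seed and return (training, testing) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_folds(records: Sequence[T], k: int = 5, seed: int = 0) -> List[List[T]]:
    """Partition records into k folds whose sizes differ by at most one."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


# Repeating the split with different seeds gives the "multiple random divisions".
for seed in range(3):
    train, test = split_80_20(list(range(100)), seed=seed)
    print(seed, len(train), len(test))
```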
Performance Test Results Analysis
System performance assessment concentrates on the recommendation quality of the enhanced
collaborative filtering method and on its parallel processing capability.
The evaluation uses precision, recall, and coverage as the core indicators. For rating
prediction, the system computes a weighted average of neighbors' ratings, with user
similarities as the weights. As shown in Table 1, the enhanced algorithm achieves higher
precision and recall than the conventional methods:
Table 1. Comparative Analysis of Recommendation Algorithm Performance
Algorithm Method         Precision   Recall   Coverage
UserBasedCF-1            0.4274      0.5347   0.4516
UserBasedCF-2            0.4217      0.5326   0.4673
ImprovedUserBasedCF-1    0.4301      0.5353   0.4692
ImprovedUserBasedCF-2    0.4386      0.5389   0.4636
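For reference, the three indicators reported in Table 1 are commonly computed as in the sketch below. The paper does not spell out its exact definitions, so the formulas here (hits over recommended items, hits over relevant items, and the share of the catalog that appears in any recommendation list) are the usual textbook forms and should be read as an assumption. The function name `evaluate` and the sample data are hypothetical.

```python
from typing import Dict, List, Set


def evaluate(
    recommended: Dict[str, List[str]],
    relevant: Dict[str, Set[str]],
    catalog_size: int,
) -> Dict[str, float]:
    """Compute precision, recall, and coverage over all users with recommendations."""
    hits = n_recommended = n_relevant = 0
    seen_items: Set[str] = set()
    for user, items in recommended.items():
        truth = relevant.get(user, set())
        hits += len(set(items) & truth)
        n_recommended += len(items)
        n_relevant += len(truth)
        seen_items.update(items)
    return {
        "precision": hits / n_recommended if n_recommended else 0.0,
        "recall": hits / n_relevant if n_relevant else 0.0,
        "coverage": len(seen_items) / catalog_size if catalog_size else 0.0,
    }


# Hypothetical recommendation lists and held-out test items.
recommended = {"u1": ["a", "b"], "u2": ["b", "c"]}
relevant = {"u1": {"a", "d"}, "u2": {"c"}}
print(evaluate(recommended, relevant, catalog_size=5))
```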

The experimental outcomes indicate that the ImprovedUserBasedCF-2 algorithm attains
precision and recall of 0.4386 and 0.5389, respectively, compared with 0.4217 and 0.5326 for
UserBasedCF-2, although its coverage (0.4636) is slightly lower than that of UserBasedCF-2
(0.4673). Particularly significant is the improved performance observed when utilizing the
Pearson correlation coefficient for user similarity calculations. Additionally, the incorporation
of Canopy and K-Means clustering improves recommendation accuracy. With respect to processing
efficiency, the enhanced method shows clear advantages on large-scale datasets, especially within
parallel computing frameworks, where computational performance improves substantially. The
implementation scales well, addressing the challenges of processing extensive music education
data in distributed computing environments.
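The Pearson-based similarity and the similarity-weighted rating prediction discussed in this section can be sketched as follows. This is a simplified single-machine illustration, not the Mahout implementation used in the experiments. Restricting the neighborhood to positively correlated users is an assumption of this sketch, and the names `pearson` and `predict` are hypothetical.

```python
from math import sqrt
from typing import Dict, Optional

Ratings = Dict[str, Dict[str, float]]  # userid -> {songid: rating}


def pearson(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Pearson correlation over the songs that both users have rated."""
    common = a.keys() & b.keys()
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[s] for s in common) / len(common)
    mean_b = sum(b[s] for s in common) / len(common)
    cov = sum((a[s] - mean_a) * (b[s] - mean_b) for s in common)
    var_a = sum((a[s] - mean_a) ** 2 for s in common)
    var_b = sum((b[s] - mean_b) ** 2 for s in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def predict(user: str, song: str, ratings: Ratings) -> Optional[float]:
    """Similarity-weighted average of neighbors' ratings for the song."""
    numerator = denominator = 0.0
    for other, other_ratings in ratings.items():
        if other == user or song not in other_ratings:
            continue
        sim = pearson(ratings[user], other_ratings)
        if sim <= 0:
            continue  # assumption: ignore dissimilar users
        numerator += sim * other_ratings[song]
        denominator += sim
    return numerator / denominator if denominator else None


# Hypothetical ratings: u2 agrees with u1, u3 has the opposite taste.
ratings = {
    "u1": {"a": 5, "b": 3, "c": 1},
    "u2": {"a": 5, "b": 3, "c": 1, "d": 4},
    "u3": {"a": 1, "b": 3, "c": 5, "d": 2},
}
print(predict("u1", "d", ratings))  # 4.0
```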
System Application Effect Evaluation
Within the framework of Belt and Road ethnic music education deployment, system efficacy
is assessed through two primary dimensions: parallel speedup ratio and recommendation
performance metrics. Table 2 displays comparative acceleration data across various node
configurations:
Table 2. Comparative Analysis of Algorithm Speedup Experimental Results
Number of Nodes          1       2       3       4
UserBasedCF-2            1       1.608   2.337   2.984
ImprovedUserBasedCF-2    1       1.861   2.606   3.326
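The speedup ratios in Table 2 can be related to parallel efficiency, that is, the speedup divided by the number of nodes. The short sketch below uses the standard definitions (speedup as single-node time over multi-node time). The function names are hypothetical, and the raw running times are not reported in the paper, so only the published ratios are used.

```python
from typing import Dict


def speedup(single_node_time: float, multi_node_time: float) -> float:
    """Standard speedup: running time on one node over time on several nodes."""
    return single_node_time / multi_node_time


def parallel_efficiency(speedup_ratio: float, nodes: int) -> float:
    """Fraction of ideal linear speedup that is actually achieved."""
    return speedup_ratio / nodes


# Speedup ratios as reported in Table 2 (node count -> ratio).
TABLE_2: Dict[str, Dict[int, float]] = {
    "UserBasedCF-2": {1: 1.0, 2: 1.608, 3: 2.337, 4: 2.984},
    "ImprovedUserBasedCF-2": {1: 1.0, 2: 1.861, 3: 2.606, 4: 3.326},
}

for name, row in TABLE_2.items():
    for nodes, ratio in row.items():
        eff = parallel_efficiency(ratio, nodes)
        print(f"{name:<22} nodes={nodes} speedup={ratio:.3f} efficiency={eff:.3f}")
```

On four nodes this gives an efficiency of about 0.83 for the improved algorithm and about 0.75 for the baseline.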

Experimental results reveal that in a four-node configuration, the enhanced algorithm
achieves a speedup ratio of 3.326, exceeding the traditional method's 2.984. To tackle
the cold-start issue in ethnic music education, the system employs a hybrid
recommendation approach combining Random New-N and Most Popular-N strategies. For new
users, the system either randomly selects N pieces from the m most recent ethnic music
compositions or suggests the N most popular ethnic musical works, considerably improving the
initial user experience. Furthermore, the system enhances recommendation diversity by actively
incorporating new ethnic music compositions into recommendation lists, thereby increasing music
library utilization efficiency. In comparison to traditional ethnic music teaching methods, this
implementation not only boosts learning resource matching efficiency and learner engagement but
also promotes cross-cultural transmission and exchange of ethnic music traditions. Practical
application demonstrates that the system delivers strong technical support for ethnic music
education within the Belt and Road initiative context, effectively facilitating cultural exchange and
educational innovation through advanced technological integration and sophisticated
recommendation mechanisms.
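The two cold-start strategies named above can be sketched in a few lines. The paper does not state how recency or popularity is measured, so this example assumes that songs arrive already ordered from newest to oldest and that popularity is the number of ratings a song has received. The function names `random_new_n` and `most_popular_n` are hypothetical.

```python
import random
from collections import Counter
from typing import Iterable, List, Optional, Sequence


def random_new_n(
    songs_newest_first: Sequence[str],
    m: int,
    n: int,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Random New-N: sample n songs at random from the m most recent ones."""
    rng = rng or random.Random()
    pool = list(songs_newest_first[:m])
    return rng.sample(pool, min(n, len(pool)))


def most_popular_n(rated_songids: Iterable[str], n: int) -> List[str]:
    """Most Popular-N: the n songs that appear most often in the rating log."""
    return [song for song, _ in Counter(rated_songids).most_common(n)]


# Hypothetical catalog (newest first) and rating log.
newest_first = ["s9", "s8", "s7", "s6", "s5"]
rating_log = ["a", "b", "a", "c", "a", "b"]
print(random_new_n(newest_first, m=3, n=2, rng=random.Random(1)))
print(most_popular_n(rating_log, n=2))  # ['a', 'b']
```

Sampling from the newest items also serves the diversity goal described in the text, since recently added compositions would otherwise have no ratings and never surface through collaborative filtering.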
CONCLUSIONS
The practical application of machine learning technologies in ethnic music education across
Belt and Road nations illustrates that algorithmic refinement and system integration effectively
improve pedagogical outcomes and learning experiences. Experimental evidence confirms
substantial enhancements in both recommendation precision and user satisfaction metrics with the
improved recommendation system. This intelligent educational framework not only helps
instructors understand students' learning progress but also offers learners individualized
educational pathways. The feature extraction approach based on clustering analysis effectively
captures the unique characteristics of various ethnic musical elements, ensuring recommendations
correspond more accurately with pedagogical needs. The system's collaborative filtering
mechanism evaluates learner preferences effectively, providing strong
support for the intelligent distribution of ethnic music educational resources. The implementation
of advanced algorithms enables accurate identification of cultural patterns and learning
preferences, considerably enhancing the educational experience. Looking ahead, as algorithms
further develop and application contexts broaden, machine learning technologies will assume an
increasingly crucial role in ethnic music education, encouraging deeper cultural and educational
integration among Belt and Road countries. This technological progress promotes the
modernization of ethnic music education, supporting cross-cultural understanding
and appreciation through innovative teaching approaches. The system's adaptive learning
capabilities and cultural sensitivity mechanisms ensure sustainable development in multicultural
music education settings.

LIMITATION & FURTHER RESEARCH
The current system faces several notable limitations in its practical implementation. The
primary data source relies heavily on the Yahoo! R3 Music dataset, which may not fully represent
the ethnic musical characteristics of nations along the Belt and Road Initiative. The system's
computational efficiency in processing large-scale data requires improvement, with the current
parallel speedup ratio reaching only 3.326 across four nodes. The cold-start problem
for new users remains partially unresolved despite the hybrid recommendation approach
combining Random New-N and Most Popular-N strategies.
Looking ahead, future research should focus on multiple directions. The feature extraction
algorithms need enhancement to improve the recognition accuracy of diverse ethnic musical
characteristics. The development of more efficient parallel processing solutions would boost the
system's capability in handling extensive datasets. Cultural adaptation requires deeper
investigation into musical traditions along the Belt and Road regions, with improved cultural
similarity calculation models. In educational applications, research should explore automated
generation of personalized learning pathways and better integration of traditional teaching
methods with intelligent recommendation systems. Additionally, incorporating advanced deep
learning models and real-time feedback mechanisms would significantly enhance the overall
system performance and learning outcome assessment.

REFERENCES
Colafiglio, T., Ardito, C., Sorino, P., Lofù, D., Festa, F., Di Noia, T., & Di Sciascio, E. (2024). NeuralPMG:
A neural polyphonic music generation system based on machine learning algorithms.
Cognitive Computation, 16(5), 2779–2802.
Cui, X., & Chen, M. (2024). A novel learning framework for vocal music education: An exploration of
convolutional neural networks and pluralistic learning approaches. Soft Computing, 28(4),
3533–3553.
Dongfang, W. (2023). Application of machine learning technology in classical music education.
International Journal of Web-Based Learning and Teaching Technologies, 18(2), 1–15.
Fang, J. (2025). Artificial intelligence robots based on machine learning and visual algorithms for
interactive experience assistance in music classrooms. Entertainment Computing, 52,
100779.
Hou, D. (2023). Music photonic signal analysis based health monitoring system using classification
by quantum machine learning techniques. Optical and Quantum Electronics, 56(3).
Hou, D. (2024). Retraction note: Music photonic signal analysis based health monitoring system
using classification by quantum machine learning techniques. Optical and Quantum
Electronics, 56(12), 1953.
Wireless Communications and Mobile Computing. (2023). Retracted: Application of internet of
things technology in vocal music teaching recording equipment assisted by machine learning.
Wireless Communications and Mobile Computing, 2023.
Tabrez, Q. M., Alkhammash, E. H., Khan, M. A., & Hadjouni, M. (2022). Retraction note:
Emotion-based music recommendation and classification using machine learning with IoT framework.
Soft Computing, 27(5), 2755.
Waghmare, K. C., & Sonkamble, B. A. (2020). Machine learning algorithms for Indian music
classification based on raga framework. International Journal of Innovative Technology and
Exploring Engineering, 9(11), 130–134.
White, C. W. (2025). The AI music problem: Why machine learning conflicts with musical creativity.
https://doi.org/10.4324/9781003587415
Xiao, H. (2024). Machine learning and music analysis: A new method for automated recognition of
music style and emotions. International Journal of High Speed Electronics and Systems.
(Prepublish).
Yin, J. (2024). Impact of music teaching on student mental health using IoT, recurrent neural
networks, and big data analytics. Mobile Networks and Applications, (prepublish), 1–20.
Zhang, J., Yu, S., Liu, R., Xie, G. X., & Zurawicki, L. (2024). Unveiling the melodic matrix: Exploring
genre-and-audio dynamics in the digital music popularity using machine learning techniques.
Marketing Intelligence & Planning, 42(8), 1333–1352.
Zhang, T., Liu, X., Guo, Z., & Tian, Y. (2024). Adaptive music recommendation: Applying machine
learning algorithms using low computing device. Journal of Software Engineering and
Applications, 17(11), 817–831.
Zhu, J. (2024). Quantum photonics based health monitoring system using music data analysis by
machine learning models. Optical and Quantum Electronics, 56(4).
Zhu, J. (2024). Retraction note: Quantum photonics based health monitoring system using music
data analysis by machine learning models. Optical and Quantum Electronics, 56(12), 1981.
