Jurnal Telematika vol.
20 no.
e-ISSN: 2579-3772 doi: 10.
61769/telematika.
Supervised Classification of Industrial Fan Sound Anomalies Using Neural Networks and Engineered Acoustic Features
Angelina Pramana Thenata1, Ranny2, Bhustomy Hakim3, Fergie Joanda Kaunang4,*
1,2,3,4 Faculty of Design and Technology,
Universitas Bunda Mulia Jl.
Lodan Raya No.
Ancol.
Jakarta.
Indonesia 1athenata@bundamulia.
fransiska@gmail.
3bhakim@bundamulia.
*Correspondence: fkaunang@bundamulia.
_____________________________________________________________________________________
Abstract: This study uses a supervised learning approach based on neural networks for anomaly detection in industrial fan systems.
Using a subset of the FAN data from the MIMII (malfunctioning industrial machine investigation and inspection) dataset with 530 labelled recordings (383 normal and 147 abnormal), this study extracts acoustic features including mel-frequency cepstral coefficients (MFCC), spectral descriptors (centroid, roll-off), and temporal measures (zero-crossing rate, autocorrelation).
Univariate statistical tests reveal that several MFCC coefficients and time-domain features differ significantly between classes (p
< 0.
A feedforward neural network with two hidden layers of 64 units (ReLU activation) and dropout regularisation was trained using stratified 5-fold cross-validation, resulting in an average F1 score of 89.9%.
Evaluation at several threshold values (τ ∈ [0.3, 0.7]) confirmed the robustness of the model: on the test data, the selected threshold τ = 0.5 achieved precision = 100%, recall = 93.10%, F1 = 96.43%, and accuracy = 98.11% (identical results were obtained at τ = 0.6–0.7, while τ = 0.3 provided higher recall).
The model also produced an AUC-ROC value of 0.9978, which is close to ideal and demonstrates excellent cross-threshold discrimination.
These findings demonstrate that combining interpretable acoustic features with a compact neural classifier enables accurate non-invasive anomaly detection for Industry 4.0 applications with minimal hardware requirements.
Keywords: industrial fan, anomaly detection, acoustic features, neural network, supervised learning
Abstract (translated from the Indonesian version): This research uses a supervised learning approach based on neural networks for anomaly detection in industrial fan systems.
Using a subset of the FAN data from the MIMII (malfunctioning industrial machine investigation and inspection) dataset with 530 labelled recordings (383 normal and 147 abnormal), this research extracts acoustic features comprising mel-frequency cepstral coefficients (MFCC), spectral descriptors (centroid, roll-off), and temporal measures (zero-crossing rate, autocorrelation).
Univariate statistical tests show that a number of MFCC coefficients and time-domain features differ significantly between classes (p
< 0,.
A feed-forward neural network with two hidden layers of 64 units (ReLU activation) and dropout regularisation was trained using stratified 5-fold cross-validation, yielding an average F1 score of 89.9%.
The use of several threshold values (τ ∈ [0.3, 0.7]) confirms the robustness of the model, as seen in the test-data results at the selected threshold τ = 0.5, which achieves a precision of 100%, recall = 93.10%, F1 = 96.43%, and accuracy = 98.11% (identical results are obtained at τ = 0.6–0.7, while τ = 0.3 gives higher recall).
The model also yields an AUC-ROC value of 0.9978, which is close to ideal and indicates very good cross-threshold discriminative power.
These findings show that combining interpretable acoustic features with a compact neural classifier enables accurate non-invasive anomaly detection for Industry 4.0 applications with minimal hardware requirements.
Keywords (translated from the Indonesian version): industrial machine fan, sound anomaly detection, acoustic features, artificial neural network, supervised learning
____________________________________________________________________________________
I.
INTRODUCTION
Industrial fans are essential components in numerous manufacturing processes and industrial facilities, serving critical functions such as ventilation, cooling, airflow regulation, and environmental temperature control. Any operational malfunction in these fans can trigger cascading failures, resulting in production inefficiencies, increased maintenance costs, potential equipment damage, and safety hazards.
Consequently, reliable condition monitoring systems and proactive maintenance strategies are crucial for ensuring safe, efficient, and uninterrupted industrial operations.
Traditionally, the health of industrial machinery is monitored through manual inspection routines or by deploying vibration and temperature sensors. Skilled technicians rely on their sensory perceptions (such as vision, hearing, smell, and touch) combined with professional expertise to identify irregularities.
While these conventional approaches offer adaptability and immediate interpretation, they are inherently limited by ambiguous diagnostic standards, inconsistencies in human judgment, and challenges in scaling workforce availability to balance operational demands and costs.
With the growing adoption of artificial intelligence (AI), particularly machine learning (ML), data-driven methods have emerged as powerful tools for automating fault detection and classification, accelerating the transition from manual inspection to automated anomaly detection systems.
Machine learning techniques have been widely applied across diverse domains, including web-based applications for recognizing handwritten Javanese script using convolutional neural networks (CNN), music genre clustering through unsupervised learning, and financial forecasting using hybrid models such as temporal convolutional networks and generative adversarial networks (TCN-GAN).
These innovations demonstrate the flexibility and robustness of ML in capturing complex patterns across various types of data.
In the context of industrial maintenance, ML algorithms are capable of learning normal operational behaviour from historical sensor data and identifying subtle deviations that may indicate potential faults.
Specifically, in acoustic monitoring applications, machine-generated sound signals can be digitally analyzed to capture imperceptible variations that may escape human detection.
This capability allows ML-based diagnostic systems to achieve consistent, repeatable, and scalable monitoring performance that is not limited by human subjectivity.
Among the various sensing modalities, acoustic-based monitoring stands out due to its non-invasive nature, cost-effectiveness, and capacity to capture early mechanical faults through sound signatures before they manifest physically.
Machine learning models are well-suited for interpreting audio data by extracting representative features such as mel-frequency cepstral coefficients (MFCC), spectral centroid, spectral roll-off, and zero-crossing rate.
These features encapsulate important temporal and frequency domain information from audio signals, enabling the training of classifiers that can discriminate between normal and faulty machine states.
While classical machine learning algorithms such as support vector machines (SVM) and random forests have demonstrated success in this domain, deep learning models like convolutional neural networks (CNN) and autoencoders have shown superior performance due to their ability to learn complex, nonlinear patterns from high-dimensional data.
Recent studies have explored audio-based anomaly detection across a range of industrial machines.
For example, one study introduces a novel method for sound-based anomaly detection aimed at overcoming the limitations of traditional, costly sensor-based systems and the challenge of industrial noise.
Likewise, another proposes an intelligent fault diagnosis system for hydraulic piston pumps using a novel PSO-LeNet model.
Applied to acoustic signals, the model successfully identifies five common pump faults.
Comparative analysis against several deeper CNNs, including AlexNet and VGG models, showed that the PSO-LeNet achieved superior stability and the highest accuracy.
Although unsupervised approaches are advantageous in settings with limited labeled data, they frequently suffer from high false-positive rates because they assume that all deviations from the learned normal pattern signify anomalies.
In contrast, supervised learning approaches are capable of more precise and robust classification when labeled data for both normal and anomalous classes is available.
Despite the abundance of studies focusing on pumps, valves, and motors, limited research has specifically addressed anomaly classification in industrial fan systems, which often have unique acoustic characteristics and operate in environments with high background noise.
Additionally, fan faults such as bearing wear, imbalance, and misalignment can manifest subtly in audio signals, posing a challenge for generalized models.
To address this gap, the present study proposes a supervised neural network model for the binary classification of industrial fan sounds using engineered acoustic features.
This approach integrates domain expertise in audio signal processing with the learning capabilities of neural networks to effectively distinguish between normal and abnormal fan operating conditions. The study leverages the publicly available malfunctioning industrial machine investigation and inspection (MIMII) dataset, which contains labeled recordings of machine sounds under real-world factory conditions, including controlled faults and normal operations.
An accurate and lightweight acoustic classifier has significant implications for Industry 4.0 and predictive maintenance frameworks, enabling scalable, real-time monitoring solutions without the need for expensive or invasive sensor systems.
The model presented in this research aims to contribute toward the development of intelligent diagnostic systems suitable for integration into modern industrial infrastructure.
The remainder of this paper is organized as follows: Section II describes the dataset, feature extraction, and proposed methodology; Section III presents the experimental setup and results; and Section IV concludes the study and outlines future work directions.
II.
METHODOLOGY
Fig. 1 shows the flow of this study, which employed a supervised learning approach to classify industrial fan sound anomalies using a neural network model.
The methodology consists of six stages: data collection, data preprocessing, exploratory data analysis, feature extraction, model development, and model evaluation.
Data Collection: This research begins with the acquisition of industrial machine sound data, sourced from the MIMII (malfunctioning industrial machine investigation and inspection) dataset.
The subset used in this study contains audio recordings from industrial fan machines under two primary operating conditions: normal and anomalous, where the latter is caused by mechanical faults or malfunctions.
In this study, a total of 530 audio samples were utilized, consisting of 383 normal and 147 abnormal recordings.
Fig. 1 Research flow.
Data Preprocessing: Audio samples were sourced from the MIMII dataset, comprising 530 recordings (383 normal, 147 abnormal) in .wav format.
Each file was loaded using the Librosa library, extracting the time-series waveform and sampling rate.
Binary labels (0 for normal, 1 for abnormal) were assigned to facilitate supervised learning.
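The labelling step can be sketched as follows. This is only an illustration: it assumes each recording sits in a directory named `normal` or `abnormal` (as in the public MIMII release), and the helper name `label_from_path` and the example paths are hypothetical rather than taken from the study.

```python
from pathlib import PurePosixPath

# Assumed layout: .../fan/<machine_id>/<normal|abnormal>/<file>.wav
LABELS = {"normal": 0, "abnormal": 1}

def label_from_path(path: str) -> int:
    """Return 0 for a normal recording and 1 for an abnormal one."""
    parent = PurePosixPath(path).parent.name
    if parent not in LABELS:
        raise ValueError(f"cannot infer label from {path!r}")
    return LABELS[parent]

# Hypothetical example paths, used only to exercise the helper.
example_paths = [
    "fan/id_00/normal/00000000.wav",
    "fan/id_00/abnormal/00000000.wav",
]
labels = [label_from_path(p) for p in example_paths]
```

Deriving the label from the directory name keeps the label assignment independent of the audio loading itself, so a mislabelled or misplaced file raises an error instead of silently entering the training set.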
Exploratory Data Analysis (EDA): An exploratory data analysis was performed to characterize normal vs. abnormal fan acoustics and assess the discriminative value of the extracted descriptors.
Distributional analyses (histograms and kernel density estimates) revealed clear shifts for most mel-frequency cepstral coefficients (MFCC), spectral centroid, spectral roll-off, and zero-crossing rate between classes, indicating potential separability.
A correlation heatmap was used to reveal strong correlations among features.
Descriptive statistics (mean and standard deviation) were computed per class, and independent two-sample t-tests were applied feature-wise to evaluate the statistical significance of mean differences.
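For one feature, the independent two-sample t statistic can be sketched as below. The text does not say whether the pooled-variance (Student) or unequal-variance (Welch) form was used, so the pooled form here is an assumption, and the sample values are toy numbers rather than study data.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(a, b):
    """Student's two-sample t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled sample variance, weighting each class by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / standard_error, df

# Toy feature values for the two classes (illustrative only).
normal_vals = [1.0, 2.0, 3.0, 4.0, 5.0]
abnormal_vals = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = pooled_t_statistic(normal_vals, abnormal_vals)
```

The p-value then follows from the t distribution with `dof` degrees of freedom; in practice a statistics library would usually be used for that step.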
Feature Extraction: To convert raw audio signals into machine-interpretable representations, we extracted a comprehensive set of engineered acoustic features.
Mel-frequency cepstral coefficients (MFCC) were computed as 13-dimensional vectors capturing spectral envelope characteristics, with mean values calculated across time frames to represent timbral properties.
Spectral features included three key descriptors: spectral centroid (quantifying signal brightness), spectral roll-off (measuring frequency spread), and spectral contrast (encoding peak-valley dynamics in the frequency spectrum).
Temporal characteristics were represented through zero-crossing rate (ZCR), which indicates signal noisiness, and autocorrelation for periodicity analysis.
All features were normalized and concatenated into a unified feature vector for each audio sample, forming the input space for subsequent classification tasks.
This multi-domain feature ensemble effectively encapsulated both spectral and temporal patterns indicative of fan operational states.
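The two temporal descriptors and the normalization step can be illustrated with a dependency-free sketch. It is not the Librosa implementation used in the study: the lag, the framing, and the choice of z-score normalization are all assumptions, since the text only says the features were "normalized".

```python
import math

def zero_crossing_rate(x):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(x, x[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(x) - 1)

def autocorrelation(x, lag=1):
    """Normalized autocorrelation of the mean-removed signal at one lag."""
    m = sum(x) / len(x)
    d = [v - m for v in x]
    energy = sum(v * v for v in d)
    return sum(d[i] * d[i + lag] for i in range(len(d) - lag)) / energy

def standardize(column):
    """Z-score one feature column (assumed normalization scheme)."""
    m = sum(column) / len(column)
    sd = math.sqrt(sum((v - m) ** 2 for v in column) / len(column))
    return [(v - m) / sd for v in column]

# A toy signal that flips sign at every sample.
square = [1.0, -1.0] * 4
zcr = zero_crossing_rate(square)
ac1 = autocorrelation(square, lag=1)
scaled = standardize([1.0, 2.0, 3.0])
```

A signal that changes sign at every sample has the maximum zero-crossing rate and a strongly negative lag-1 autocorrelation, which matches the interpretation of ZCR as a noisiness indicator and autocorrelation as a periodicity measure.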
Model Development: The classification model was implemented as a feedforward neural network using TensorFlow Keras.
The architecture consisted of an input layer matching the feature dimensions, followed by two dense hidden layers with 64 neurons each, employing ReLU activation functions to introduce nonlinearity.
Dropout layers .
= 0.
were incorporated between hidden layers to prevent overfitting through regularization.
The output layer utilized a single neuron with sigmoid activation for binary classification.
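To give a sense of how compact this architecture is, the trainable parameter count can be worked out by hand. The input width of 18 is an assumption (13 MFCC means plus five scalar descriptors, matching the features listed in Table I); dropout layers add no parameters.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

N_FEATURES = 18  # assumption: 13 MFCC means + 5 scalar descriptors
layer_sizes = [N_FEATURES, 64, 64, 1]

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)
```

Under that assumption the three dense layers hold 1,216, 4,160, and 65 parameters, about 5.4 thousand in total, which is consistent with the claim of minimal hardware requirements.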
The model was trained using the Adam optimizer with a binary cross-entropy loss function, selected for its effectiveness in probabilistic classification tasks.
To ensure robust evaluation, we employed stratified 5-fold cross-validation, maintaining class distribution across folds.
Training proceeded for 100 epochs with a batch size of 32, with early stopping used to prevent overfitting.
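Stratification can be sketched in a few lines of plain Python. This is an illustrative round-robin assignment, not the routine used in the study, and it assumes the folds are drawn from all 530 recordings, which the text does not state explicitly.

```python
import random

def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, class by class."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # round-robin keeps class ratios even
    return folds

# Class counts reported for the fan subset: 383 normal, 147 abnormal.
labels = [0] * 383 + [1] * 147
folds = stratified_folds(labels)
fold_sizes = [len(f) for f in folds]
abnormal_per_fold = [sum(labels[i] for i in f) for f in folds]
```

Each fold ends up with 29 or 30 abnormal recordings, so every validation split sees roughly the same 28% anomaly rate as the full set.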
Model Evaluation: The proposed classification model was evaluated through a multi-stage process to ensure reliable and unbiased performance assessment.
Initially, stratified 5-fold cross-validation was employed to address class imbalance and evaluate generalization.
In each fold, the model was trained and validated, and its performance was measured using the F1 score, a balanced metric well-suited for imbalanced binary classification.
The average F1 score across folds provided a robust estimate of the model's performance. Subsequently, final testing was performed on a separate, unseen test set to assess real-world performance. Probabilistic outputs were evaluated across multiple thresholds (0.3 to 0.7), and metrics including accuracy, precision, recall, and F1 score were computed.
This multi-threshold analysis enabled examination of the model's behaviour under varying decision boundaries, crucial for applications where false positives or false negatives carry different costs.
Finally, the area under the receiver operating characteristic curve (AUC-ROC) was calculated to measure threshold-independent discriminatory ability.
A high AUC-ROC value confirmed the model's strong capability in distinguishing between normal and abnormal instances across all decision thresholds.
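AUC-ROC has a simple rank interpretation: it is the probability that a randomly chosen abnormal sample receives a higher score than a randomly chosen normal one. The sketch below computes it that way on toy scores (not study data); the function name `auc_roc` is illustrative.

```python
def auc_roc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Toy example: 3 of the 4 positive/negative pairs are ordered correctly.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
auc = auc_roc(y_true, y_score)
```

Because only the ordering of scores matters, the value does not depend on any particular decision threshold.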
III.
RESULTS AND DISCUSSION
Feature Analysis
The evaluation of the proposed classification model was conducted in a multi-stage process to ensure a thorough and unbiased assessment of its performance.
Initially, the model was validated using stratified 5-fold cross-validation.
To gain deeper insight into the discriminative potential of the extracted acoustic features, a series of visualizations was generated to compare the distributions between normal and abnormal machine sound samples, as illustrated in Fig. 2.
The figure presents histograms and kernel density estimations for 13 mel-frequency cepstral coefficients (MFCC), along with key spectral and temporal features including spectral centroid, spectral roll-off, spectral contrast, zero crossing rate, and autocorrelation.
Fig. 2 Distribution plots of MFCCs and other acoustic features for normal and abnormal classes
Each feature was plotted for both the normal class (in blue) and the abnormal class (in red) to visually assess the separation between categories.
The MFCC features, which are widely used to capture perceptual aspects of sound, demonstrate varying levels of separability.
Specifically, MFCC_1 to MFCC_6 show clear differences in distribution shape and central tendency between normal and abnormal samples, suggesting strong discriminative power.
In contrast, MFCC_11 to MFCC_13 exhibit considerable overlap, indicating limited contribution to classification performance.
These results are consistent with statistical t-test outcomes, where only MFCCs with significant p-values .
< 0.
were deemed informative.
Spectral centroid and spectral roll-off reveal substantial distributional shifts between the two classes.
Abnormal sounds generally exhibit lower centroid and roll-off values, indicating energy concentration in lower frequency bands, which may result from mechanical degradation or frictional anomalies.
Spectral contrast, however, presents overlapping distributions for both classes, aligning with its non-significant statistical difference and limited relevance for the classification task.
The zero crossing rate demonstrates distinct behaviour for abnormal samples, showing a higher frequency of sign changes, which can be attributed to increased randomness or mechanical instability.
Autocorrelation also displays a notable difference in distribution peaks, reflecting changes in signal periodicity often caused by anomalies.
To further understand the relationships among the extracted acoustic features, a Pearson correlation heatmap was constructed, as shown in Fig. 3.
We quantify pairwise linear associations using the Pearson correlation coefficient r:
\[
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
\]
This analysis provides valuable insights into the redundancy, complementarity, and potential multicollinearity between features, which are critical considerations in the development of a robust classification model.
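The Pearson coefficient behind each heatmap cell can be computed directly from its definition. The sketch below uses toy feature columns; the variable names are illustrative and the values are not from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equally long feature columns."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy))
    return numerator / denominator

# Toy columns: one perfectly increasing with the reference, one perfectly decreasing.
reference = [1.0, 2.0, 3.0, 4.0]
rising = [2.0, 4.0, 6.0, 8.0]
falling = [8.0, 6.0, 4.0, 2.0]
r_up = pearson_r(reference, rising)
r_down = pearson_r(reference, falling)
```

Values near +1 or -1 flag feature pairs that carry largely redundant information, which is what the multicollinearity discussion relies on.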
The results indicate that the spectral centroid shows a strong positive correlation with MFCC_1 .
= 0.
, which is consistent with the interpretation of MFCC_1 as representing the overall energy or brightness of a sound signal, properties that are also encapsulated by the spectral centroid.
Fig. 3 Pearson correlation heatmap of extracted acoustic features. Blue indicates negative correlation, red indicates positive correlation, with values ranging from −1 to 1.
Additionally, the spectral centroid demonstrates moderate positive correlations with MFCC_4 .
= .
MFCC_5 .
= 0.
MFCC_6 .
= 0.
, and MFCC_8 .
= 0.
, suggesting shared information content across these frequency-related features.
In contrast, the spectral roll-off exhibits a strong negative correlation with both MFCC_1 .
= −0.
and the spectral centroid itself .
= −0.
This inverse relationship is expected, as signals with higher frequency content (i.e., higher centroid values)
Furthermore, spectral roll-off displays moderate negative correlations with other MFCCs, reinforcing its inverse association with frequency-distributed energy characteristics.
The spectral contrast demonstrates generally weak correlations with most MFCCs and other spectral features, suggesting that it captures complementary information related to the variations between spectral peaks and valleys.
Notably, it shows a moderate positive correlation with MFCC_4 .
= 0.
and a moderate negative correlation with MFCC_1 .
= −0.
, highlighting its unique contribution to the feature space.
Regarding temporal features, the zero crossing rate (ZCR) shows strong positive correlations with MFCC_1 .
= 0.
, spectral centroid .
= 0.
, and spectral roll-off .
= 0.
These results are consistent with the idea that audio signals with greater high-frequency energy content tend to have more frequent zero crossings.
Therefore, ZCR acts as an effective indicator of spectral sharpness.
The autocorrelation feature, which captures periodicity in the time domain, presents moderate negative correlations with MFCC_1 .
= −0.
, spectral centroid .
= −0.
, and spectral roll-off .
= −0.
This suggests that signals with more harmonic or regular structures (i.e., higher autocorrelation values) are likely associated with lower-frequency energy distributions.
Additionally, autocorrelation shows moderate positive correlations with MFCC_5 .
= 0.
and MFCC_6 .
= 0.
, indicating potential relationships with mid-frequency cepstral dynamics.
The statistical analysis of extracted acoustic features, comprising mel-frequency cepstral coefficients (MFCC), spectral, and temporal features, reveals meaningful distinctions between normal and abnormal sound classes as shown in Table I.
These differences provide quantitative evidence that supports the suitability of these features for anomaly classification tasks.
For the MFCC features, which characterize the spectral envelope of audio signals, significant variations in mean values were observed between the two classes. MFCC_1, which typically represents overall signal energy, shows a lower mean in normal samples (−369.
compared to abnormal samples (−360.
, suggesting higher average energy or brightness in anomalous signals.
Similarly, MFCC_2 and MFCC_3 display slightly elevated means in abnormal samples, indicating subtle shifts in spectral shape.
The most pronounced difference occurs in MFCC_5, where the mean in the abnormal class (−1.
is considerably lower than in the normal class (−0.
, with a notably higher standard deviation .
22 vs.
, indicating greater variability in anomalous signals.
MFCCs 6 through 10 also exhibit small yet consistent differences in means, generally trending higher for abnormal sounds, reinforcing the hypothesis of spectral distortion due to mechanical anomalies.
MFCC_9 and MFCC_10, for instance, both show higher means in the abnormal class .
04 vs.
07 and 61 vs.
18, respectivel.
, along with increased standard deviations, pointing to instability in the spectral structure under faulty conditions.
Although MFCC_11 to MFCC_13 show more modest variations, their distributions still reflect subtle class-dependent differences.
Spectral features further highlight differences in frequency characteristics.
The spectral centroid, which indicates the center of mass of the frequency spectrum, is higher in normal sounds .
97 H.
than in abnormal ones .
60 H.
, suggesting that normal sounds tend to be sharper or contain more high-frequency components.
A similar pattern is observed in the spectral roll-off, which defines the frequency below which 85% of the signal energy is contained.
Normal samples have a higher roll-off mean .
33 H.
than abnormal samples .
77 H.
, supporting the notion of broader spectral content in normal operation.
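For reference, the two spectral descriptors discussed here are commonly defined per frame as follows, where \(|X_k|\) is the spectral magnitude in frequency bin \(k\) with centre frequency \(f_k\). Whether magnitude or energy is accumulated, and how frames are aggregated into one value per recording, depends on the implementation, so this is a sketch of the usual convention rather than the exact computation used in the study.

```latex
\[
C = \frac{\sum_{k} f_k \, |X_k|}{\sum_{k} |X_k|},
\qquad
f_{\mathrm{roll}} = \min \Bigl\{ f_K : \sum_{k \le K} |X_k| \;\ge\; 0.85 \sum_{k} |X_k| \Bigr\}
\]
```

Both quantities increase when spectral content shifts toward higher frequencies, which is why their class means move in the same direction in the comparison above.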
The spectral contrast, which measures the difference between spectral peaks and valleys, shows a slightly higher mean for abnormal sounds .
98 vs.
, along with a much larger standard deviation .
36 vs.
, indicating irregularities and greater variation in spectral dynamics under fault conditions.
Temporal features contribute valuable insights for anomaly classification.
The zero crossing rate (ZCR) is slightly higher in normal sounds .
compared to abnormal ones .
, with greater variability observed in the abnormal class.
This suggests more consistent high-frequency content in normal signals and increased temporal irregularity in abnormal ones.
While autocorrelation yields near-zero mean and standard deviation, statistical testing revealed it to be highly significant .
OO 3.
95eOe.
, highlighting its utility in capturing differences in signal periodicity.
Independent t-tests confirmed that several MFCCs (MFCC_1, 3, 5, 8, 9, 11, 12, and 13)
significantly differ between normal and abnormal sounds .
< 0.
, indicating strong discriminative potential.
In contrast, MFCC_2, 4, 6, 7, and 10 did not exhibit significant differences.
Among spectral features, only spectral contrast showed statistical significance .
= 0.
, while spectral centroid and roll-off did not, likely due to higher variance in the abnormal data.
The model's performance was evaluated using stratified 5-fold cross-validation, yielding F1 scores between 0.8929 and 0.9123, with an average of approximately 0.90 (89.9%).
This narrow range demonstrates the model's robustness and stability across data splits.
An average F1 score near 0.90 confirms a well-balanced precision-recall trade-off, making the model reliable for real-world anomaly detection in industrial fan systems.
TABLE I
MEAN AND STANDARD DEVIATION OF EXTRACTED FEATURES FOR NORMAL AND ABNORMAL CLASSES
Columns: Feature, Normal Mean, Normal Std Dev, Abnormal Mean, Abnormal Std Dev
Rows: MFCC_1 to MFCC_13, spectral centroid, spectral roll-off, spectral contrast, zero crossing rate, autocorrelation
Fig. 4 illustrates the training process of the proposed neural network model over 100 epochs, showing the progression of accuracy and loss.
The left plot depicts the model's accuracy, which demonstrates a consistent upward trend, rising from approximately 0.64 to above 0.
This steady increase indicates effective learning during the training phase, with the model progressively capturing discriminative patterns in the feature space.
The right plot displays the corresponding training loss, which exhibits a sharp decline from an initial value of around 0.65 to below 0.1, stabilizing after approximately 80 epochs.
The continual reduction in loss values, without abrupt oscillations or divergence, suggests that the learning rate and model complexity were appropriately configured for stable optimization during the training phase. The combination of increasing accuracy and decreasing loss suggests successful convergence of the training process.
Moreover, the relatively smooth learning curves indicate stable fitting on the training set.
Because training curves alone cannot establish generalization, performance on a separate test set and cross-validation scores are reported as well.
In binary classification for anomaly detection, the default probability cutoff (τ = 0.5) may not be optimal. We therefore evaluate the classifier across multiple thresholds (τ ∈ {0.3, 0.4, 0.5, 0.6, 0.7}) to characterize precision-recall trade-offs.
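The threshold sweep itself is a small operation on the sigmoid outputs. The sketch below uses hypothetical probabilities (not the study's predictions) and assumes a sample is flagged abnormal when its probability is greater than or equal to τ; the comparison convention is not stated in the text.

```python
def predict(probs, tau):
    """Label a sample abnormal (1) when its probability reaches the threshold."""
    return [1 if p >= tau else 0 for p in probs]

# Hypothetical sigmoid outputs, chosen so that none falls between 0.5 and 0.7.
probs = [0.02, 0.35, 0.45, 0.48, 0.91, 0.97]
thresholds = (0.3, 0.4, 0.5, 0.6, 0.7)
by_tau = {tau: predict(probs, tau) for tau in thresholds}
flagged = {tau: sum(preds) for tau, preds in by_tau.items()}
```

Lowering τ flags more samples as abnormal, trading precision for recall, while two thresholds give identical predictions whenever no output probability lies between them.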
As summarized in Table II, accuracy is consistently high and peaks at 98.11% for τ ≥ 0.5.
At τ = 0.3, the model attains higher recall = 96.55%, capturing more true anomalies, at the cost of lower precision = 93.33% (F1 = 94.92%, accuracy 97.17%), reflecting a small increase in false positives. At τ = 0.4, precision = 96.43% and recall = 93.10% (F1 = 94.74%, accuracy 97.17%), offering a slightly more conservative operating point.
For τ ∈ {0.5, 0.6, 0.7}, precision reaches 100% (no false positives in the test set) with recall = 93.10%, yielding the highest F1 = 96.43% and accuracy = 98.11%; the identical metrics across these thresholds indicate that no predicted probabilities lie between 0.5 and 0.7, hence the same class assignments. Considering this balance and the elimination of false alarms, we adopt τ = 0.5 as the primary operating point with TP = 27, FP = 0, FN = 2, TN = 77, as shown in Fig. 5.
Finally, the model's AUC-ROC = 0.9978, indicating excellent threshold-independent discriminative ability.
Fig. 4 Training accuracy and loss curves of the proposed neural network model over 100 epochs.
TABLE II
PERFORMANCE METRICS AT VARIOUS CLASSIFICATION THRESHOLDS
Threshold  Accuracy  Precision  Recall   F1 Score
0.3        97.17%    93.33%     96.55%   94.92%
0.4        97.17%    96.43%     93.10%   94.74%
0.5        98.11%    100%       93.10%   96.43%
0.6        98.11%    100%       93.10%   96.43%
0.7        98.11%    100%       93.10%   96.43%
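The reported metrics at τ = 0.5 can be checked directly from the confusion-matrix counts given in the text (TP = 27, FP = 0, FN = 2, TN = 77). The helper name `binary_metrics` is illustrative.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary metrics with the abnormal class as positive."""
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }

# Confusion-matrix counts stated for the selected operating point.
counts = {"tp": 27, "fp": 0, "fn": 2, "tn": 77}
metrics = binary_metrics(**counts)
as_percent = {name: round(100 * value, 2) for name, value in metrics.items()}
test_size = sum(counts.values())
```

The counts reproduce the stated precision of 100%, recall of 93.10%, F1 of 96.43%, and accuracy of 98.11%, and they sum to 106 test recordings, which would correspond to a 20% hold-out of the 530 recordings, although the split ratio is not stated explicitly.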
IV.
CONCLUSIONS
This study presented a supervised neural-network classifier for acoustic anomaly detection in industrial fan systems using engineered features.
Exploratory and statistical analyses showed that MFCC_1 to MFCC_6, spectral centroid, spectral roll-off, and autocorrelation provide the strongest discrimination between normal and abnormal conditions .
< 0.
, whereas MFCC_11 to MFCC_13 and spectral contrast contribute limited additional value.
A compact feed-forward network (two 64-unit ReLU layers with dropout) trained with stratified 5-fold cross-validation achieved a mean F1 of 89.9%.
Threshold sweeping (τ ∈ {0.3, 0.4, 0.5, 0.6, 0.7}) on the held-out test set confirmed robust behavior: the selected operating point τ = 0.5 attains precision = 100%, recall = 93.10%, F1 = 96.43%, and accuracy = 98.11% (identical metrics for τ = 0.6–0.7), while τ = 0.3 yields higher recall at the expense of precision.
The model's AUC-ROC = 0.9978 indicates near-perfect threshold-independent separability.
These findings demonstrate that combining interpretable acoustic features with a lightweight neural classifier enables accurate, non-invasive anomaly detection suitable for predictive-maintenance workflows with minimal hardware overhead.
This study contributes to the advancement of predictive maintenance frameworks, providing a cost-effective and non-invasive alternative to traditional sensor-based monitoring.
Future work could explore the generalization of this approach to other industrial machinery and the incorporation of unsupervised learning techniques to further enhance anomaly detection in scenarios with limited labeled data.
Fig. 5 Confusion matrix on the test set at the selected operating threshold τ = 0.5 (positive class: Abnormal).
REFERENCES