Journal of the Civil Engineering Forum.
May 2026, 12.
:229-241 DOI 10.
22146/jcef.
Available Online at https://jurnal.
id/v3/jcef/issue/archive
Data-Driven and Physics-Informed Neural Networks for Structural Health Monitoring of the Z24 Bridge
Abdellah Riyahi*, Mohammed Mestari, Bouchra Bouihi
2IACS Laboratory.
ENSET Mohammedia.
Hassan II University of Casablanca.
Mohammedia.
MOROCCO *Corresponding author: a.
riyahi@enset-media.
SUBMITTED 28 August 2025 REVISED 16 November 2025 ACCEPTED 28 December 2025
ABSTRACT
Structural Health Monitoring (SHM) is crucial for maintaining the sustainability and safety of civil infrastructure.
The Z24 Bridge in Switzerland remains one of the benchmark datasets used to validate vibration-based damage detection methods.
Traditional approaches based exclusively on modal parameters are frequently limited by data scarcity and environmental variability.
Recent advances in artificial intelligence have enabled data-driven neural networks to learn discriminative features directly from raw signals. Meanwhile, hybrid methods such as Physics-Informed Neural Networks (PINNs) incorporate governing physical laws into the learning process.
This study presents a comparative analysis of three successive artificial neural network models (NN V1-V3) and one Physics-Informed Neural Network (PINN V1), all applied to the Z24 Bridge dataset.
The NN models progressively improve in depth, optimization strategy, and regularization, achieving ≈97.7% validation accuracy and a macro AUC ≈1.00 with NN V3.
However, they remain completely dependent on the quality and quantity of training data.
In contrast, the PINN incorporates the differential equation of a damped oscillator into its loss function, balancing a data-driven term with a physics-based residual.
This approach enables more stable learning with limited labeled data and ensures consistency with structural dynamics.
Experimental results highlight the trade-off between accuracy and robustness: while NN V3 yields the highest predictive performance (≈97.7% validation accuracy, macro AUC ≈1.00), PINN V1 achieves slightly lower accuracy (≈92%) but offers improved stability and interpretability.
This dual perspective demonstrates that hybrid physics-informed models provide a more reliable basis for decision-making in SHM.
The findings underscore the potential of combining machine learning with physical knowledge, paving the way for future developments such as hybrid PINNs (HPINNs), multi-sensor integration, and high-performance computing deployment.
KEYWORDS Structural Health Monitoring; Z24 Bridge; Neural Networks; Physics-Informed Neural Networks; Vibration-Based Damage Detection; Hybrid Modeling.
© The Author(s). This article is distributed under a Creative Commons Attribution-ShareAlike 4.0 International license.
1 INTRODUCTION
In modern civil engineering, Structural Health Monitoring (SHM) has emerged as a crucial challenge.
Bridges and viaducts are major examples of aging infrastructure that are continuously subjected to traffic loads, environmental conditions, and material fatigue.
This context has led to the development of advanced methods that enable early damage detection, as ensuring user safety and optimizing preventive maintenance are top priorities (Alla and Asadi, 2.
The Z24 Bridge in Switzerland remains one of the most iconic case studies and a major benchmark.
This bridge generated a comprehensive dataset that has become an important reference for validating vibration-based SHM approaches, as it was instrumented and subjected to controlled damage scenarios prior to its demolition (Maeck and De Roeck, 2.
When artificial neural networks are applied to SHM, they demonstrate significant potential for state classification, damage detection, and the prediction of nonlinear behaviors.
Recent studies underline the urgency of reliable bridge monitoring and model credibility in civil infrastructure, reinforcing the relevance of this study (Seventekidis and Giagopoulos, 2021; Absor et al.
, 2.
Traditionally, SHM methods have relied on modal analysis techniques, including mode shapes, damping ratios, and natural frequency extraction.
Although robust, these methods have limitations under real-world conditions due to measurement noise, environmental variability, and the difficulty of explicitly correlating modal changes with structural damage (Avci et al.
Consequently, these limitations have paved the way for the rise of artificial intelligence (AI), particularly neural networks, which can automatically extract discriminative features from either preprocessed or raw signals (Plevris and Papazafeiropoulos, 2.
However, data-driven approaches depend entirely on the quality and availability of training datasets.
Failure data are scarce in many cases, creating risks of overfitting and limited generalization capacity (Zhang et al., 2.
To overcome these limitations, hybrid approaches have emerged that integrate governing physical laws into the learning process, namely Physics-Informed Neural Networks (PINNs) (Raissi et al., 2019; Pawar et al., 2021; Oddiraju et al., 2.
By combining the rigor of partial differential equations that describe structural dynamics with the predictive power of machine learning, PINNs represent a significant breakthrough.
Unlike classical neural networks, whose loss functions depend solely on the error between predictions and observations, PINNs incorporate additional terms that enforce the governing equations of motion.
This integration of physical constraints improves stability, reduces reliance on labeled data, and enhances interpretability (Karniadakis et al.
This approach is especially promising in SHM because structural dynamic equations (velocities, displacements, and accelerations) are directly incorporated into the learning process (Absor et al.
, 2.
Recent advances in vibration-based SHM combine fast data-driven classifiers with physics-guided learning.
Compact time-series methods (e.g., convolutional or transformer-based approaches such as MiniROCKET) have demonstrated strong accuracy and efficiency on sensor streams, while PINNs have progressed from proof-of-concept studies to applications in structural dynamics, where enforcing governing equations improves interpretability and out-of-distribution robustness (Dempster et al., 2020; Karniadakis et al., 2021; Sun et al.
, 2.
In parallel, recent civil engineering studies emphasize the urgency of reliable bridge monitoring and highlight the need to evaluate not only predictive accuracy but also operational robustness and physical credibility (Purba et al., 2023; Absor et al.
Beyond single-degree-of-freedom (SDOF) settings, recent studies have extended PINNs to multi-degree-of-freedom (MDOF) structural dynamics, combining physics constraints with parametric or state inference based on multi-modal responses.
These works report that physics-guided regularization can reduce overfitting and improve robustness under sparse labels, albeit at higher computational cost.
Our study is complementary: we deliberately analyze the SDOF-regularized case on Z24 to isolate the benefits and limitations of a first-mode prior and to motivate future MDOF PINN developments (Felahi, 2.
This study addresses that gap by:
• Conducting a systematic comparison between three lightweight data-driven MLP variants (NN V1-V3) and a physics-informed model (PINN V1) on the same Z24 dataset.
• Enforcing identical data handling and a fixed train/validation/test split across models to ensure fair comparability.
• Reporting multi-criteria metrics (AUC, precision/recall, per-class confusion) together with physics diagnostics (e.g., residuals of the governing equation, power spectral density) and discussing the practical trade-offs between data-rich performance and physics-based interpretability.
Therefore, we study in a controlled and reproducible setting: (i) the progressive design choices in NN V1-V3 (depth, optimization, and regularization) and (ii) a PINN that explicitly integrates the motion law of a damped oscillator as a simplified prior for bridge dynamics, thereby coupling data-driven learning with physical constraints.
The detailed quantitative findings and comparisons are presented in Sections 4-5.
The remainder of this paper is organized as follows:
Section 1 reviews related work in SHM and PINNs.
Section 2 describes the Z24 dataset.
Section 3 details the developed models (NN and PINN) and the training protocol; Sections 4 and 5 present and discuss the experimental results; and Section 6 concludes with perspectives for future research.
We evaluate three purely data-driven classifiers (NN V1, NN V2, and NN V3), which represent successive improvements in depth, optimization, and regularization, alongside a physics-informed model (PINN V1) that embeds a damped-oscillator law consistent with bridge dynamics.
The architectural and training details are provided in Section 3.
A consolidated comparison of validation metrics and physics diagnostics is reported in Table 4 (Section .
, where we discuss accuracy, macro-F1, macro AUC, false negatives, and PINN residual/consistency indicators.
All models are trained and evaluated on windows extracted from the same vertical accelerometer (acc_09) to ensure like-for-like comparability across NN and PINN.
2 CASE STUDY: THE Z24 BRIDGE DATASET
One of the most studied experimental benchmarks in Structural Health Monitoring (SHM) is the Z24 Bridge in Switzerland.
This post-tensioned concrete bridge, located near Koppigen on the Swiss highway A1, was 30 m long and 11.8 m wide, with three spans supported by two piers and abutments at each end.
The bridge was extensively instrumented by the Swiss Federal Institute of Technology (ETH Zürich) and subjected to a sequence of controlled damage scenarios prior to its planned demolition in 1998, making it a unique large-scale testbed for SHM research (Maeck and De Roeck.
Koenders and Papagiannakis, 2.
During the monitoring campaign, which lasted over one year, the bridge was equipped with various sensors, including temperature sensors, strain gauges, accelerometers, and displacement transducers.
Measurements were collected under artificially induced damage states such as tendon rupture, pier settlement, and concrete cracking, as well as under varying operational and environmental conditions.
This produced a wide range of structural responses, from healthy to progressively damaged states, making the dataset particularly valuable for evaluating machine learning-based damage detection algorithms (Avci et al., 2021;
Civera and Surace, 2.
The resulting dataset, distributed in MATLAB (.mat) format, is commonly referred to as the Z24 EMS dataset (Environmental and Modal Survey).
For reproducibility, the EMS manifest ID used in this study is 4b8b4d63c00f3a81. The dataset is organized into several configurations, each corresponding to a distinct test condition.
Our research focuses on vibration response signals arranged into 17 different setups, each containing acceleration time series captured under intact or damaged structural states.
The dataset includes both ambient vibration measurements and controlled excitations, enabling the evaluation of supervised and unsupervised learning frameworks (Avci et al., 2021; Eltouny et al.
, 2.
A key feature of the Z24 dataset is its combination of progressive artificial damage introduction and long-term monitoring data.
This dual nature allows researchers to address two complementary SHM challenges: (i) determining the degree of damage in specific scenarios (e.g., pier settlement) and (ii) distinguishing natural fluctuations (e.g., due to temperature and humidity changes) from actual structural deterioration.
Consequently, Z24 has been widely adopted as a reference dataset in studies using deep learning, Physics-Informed Neural Networks (PINNs), and time-series classification methods such as WaveNet and MiniRocket (Karniadakis et al., 2021; Dabbous et al.
One objective of this work is to exploit the vibration-based setups of the Z24 dataset to evaluate both conventional neural networks (NN V1-V3) and hybrid physics-informed approaches (PINN V1).
Each setup is represented by multiple .mat files containing acceleration signals, with durations ranging from a few seconds to several minutes, depending on the testing configuration.
A typical file structure includes metadata such as excitation type and structural state, along with multichannel acceleration data sampled at high frequency.
The systematic organization of the dataset enables the creation of supervised classification tasks in which a structural state label is assigned to each input time series.
To evaluate both classification accuracy and physical interpretability, we selected representative subsets that balance healthy and damaged conditions.
As a result, the Z24 dataset serves both as a historical benchmark and as a valuable resource for contemporary AI-driven SHM frameworks.
An overview of the bridge geometry is provided in Figure 1.
A concise summary of the 17 setups (conditions, duration, and file count) is provided in Table 1.
The class mapping and representative files used in this study are listed in Table 2.
Figure 1. Front elevation and plan view of the Z24 Bridge. Redrawn based on structural descriptions from the SIMCES project at KU Leuven (KU Leuven, 2.
Table 1. Description of the Z24 setups (examples).
Columns: Setup ID | Condition Type | Description (damage/health) | Duration | # Files
Healthy | Baseline, no damage | ~120
Environmental | Temperature/humidity variation | ~150
Damage A | Pier settlement | ~200
Severe | Progressive tendon failure | ~180
Table 2. Summary of the dataset used.
Columns: Class | Structural Condition | # of Setups | Example Files | Notes
Structural conditions: Healthy; Pier; Tendon; Concrete; Combined
Notes: Reference undamaged state; Progressive; Simulated cable rupture; Surface; Mixed
This study focuses on the progressive damage subset of the Z24 dataset and uses a single vertical accelerometer channel (acc_09, near mid-span) for all models to ensure direct comparability.
While this choice simplifies the learning task and reduces variance introduced by sensor heterogeneity, it also limits the ability to capture spatially distributed effects.
This limitation is discussed later when assessing generalizability and operational applicability.
3 METHODOLOGY
3.1 Overview of the Modeling Approach
In this study, the adopted methodology compares data-driven neural networks (NN V1, NN V2, NN V3) with a physics-informed neural network (PINN V1) applied to the Z24 Bridge dataset (Switzerland).
Key hyperparameters for each model are summarized in Figure 2 and detailed in Table 3.
While the PINN enables the integration of governing equations of motion into the training process, the three NN models represent progressive improvements in architectural depth, optimization strategies, and regularization techniques.
This dual strategy provides insights into both numerical performance and physical interpretability (Alzubaidi et al., 2023; Moradi et al., 2023; Zhuang et al.
, 2.
Figure 2. Overview of the data preprocessing and model development workflow for the Z24 Bridge dataset, including NN V1, NN V2, NN V3, and PINN V1.
The raw Z24 dataset, stored in MATLAB (.mat) files, contains acceleration responses and related metadata collected under various structural conditions.
A systematic preprocessing pipeline was applied prior to training the neural network and physics-informed models to ensure consistency, comparability, and suitability for machine learning algorithms.
Signal selection: Only vertical acceleration signals from the most reliable sensors were retained, consistent with previous studies on the Z24 benchmark.
This choice preserves sensitivity to structural damage while reducing redundancy.
Sensor selection: A single vertical accelerometer (sensor ID: acc_09, near mid-span) was retained, selected through a quantitative reliability screening based on: (i) <1% missing samples across the healthy 01 class; (ii) stable low-frequency PSD peaks within ±5% across 01 setups; (iii) signal-to-noise ratio > 20 dB under ambient conditions.
The same sensor (acc_09) was used consistently for NN V1-V3 and PINN V1 to ensure a fair comparison.
Normalization: Each signal was standardized using a z-score transformation (zero mean, unit variance).
This step improves the stability of gradient-based optimization and reduces scale disparities.
Unified preprocessing: To enable fair comparison, the same preprocessing pipeline was applied to all models: fixed window length and hop size, z-score normalization fitted on the training set only, and frozen label encoding.
The exact split indices were kept constant across NN V1-V3 and PINN V1.
Segmentation: Long recordings were partitioned into fixed-length sequences (windows of N = 65,536 samples, corresponding to several seconds of vibration data).
Overlapping sliding windows were used to improve class balance and increase the number of training samples.
Labeling: Based on the progressive damage test, windows were labeled as "01" for the healthy state and "03-17" for increasing damage levels, resulting in a multi-class classification task.
Dataset split: Using stratified sampling to preserve class distributions, the dataset was divided into training, validation, and test subsets.
In our experiments, 70% of the windows were used for training, 15% for validation, and 15% for testing.
For reproducibility, splits were fixed using a random seed.
Caching and reproducibility: Preprocessed datasets were saved in compressed .npz format and indexed by JSON manifests.
This ensured repeatability across experiments and efficient reloading for multiple models (NN V1-V3, PINN V1).
The EMS manifest ID indexing the processed files and frozen split used throughout the experiments is 4b8b4d63c00f3a81.
By providing uniform and balanced inputs to all models, the preprocessing pipeline enabled an equitable comparison between data-driven and physics-informed approaches (Cheng et al., 2021; Yucesan et al.
, 2.
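The preprocessing steps described above (overlapping windows, a stratified 70/15/15 split with a fixed seed, and z-score statistics estimated on the training windows only) can be sketched as follows. This is a minimal, standard-library illustration rather than the authors' pipeline: the function names, the toy window length of 8 samples (the study uses N = 65,536), the hop size, and the synthetic signals are all assumptions introduced for the example.

```python
import random
from statistics import mean, pstdev

def sliding_windows(signal, length, hop):
    """Cut a 1-D sequence into fixed-length windows (overlapping when hop < length)."""
    return [signal[s:s + length] for s in range(0, len(signal) - length + 1, hop)]

def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists that preserve class proportions."""
    rng = random.Random(seed)            # fixed seed -> reproducible split
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    train, val, test = [], [], []
    for lab in sorted(by_class):
        idxs = by_class[lab][:]
        rng.shuffle(idxs)
        n_train = round(fractions[0] * len(idxs))
        n_val = round(fractions[1] * len(idxs))
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]   # remainder goes to the test subset
    return train, val, test

def fit_zscore(windows):
    """Estimate mean and standard deviation from the given (training) windows."""
    flat = [x for w in windows for x in w]
    mu, sigma = mean(flat), pstdev(flat)
    return mu, (sigma if sigma > 0 else 1.0)

def apply_zscore(windows, mu, sigma):
    """Standardize windows with previously fitted statistics."""
    return [[(x - mu) / sigma for x in w] for w in windows]

# Toy data: two synthetic "recordings", one per class label (hypothetical values).
rng = random.Random(42)
windows, labels = [], []
for lab, scale in (("01", 1.0), ("03", 2.0)):
    recording = [scale * rng.gauss(0.0, 1.0) for _ in range(84)]
    w = sliding_windows(recording, length=8, hop=4)   # 50% overlap
    windows += w
    labels += [lab] * len(w)

train_idx, val_idx, test_idx = stratified_split(labels, seed=0)
mu, sigma = fit_zscore([windows[i] for i in train_idx])   # training data only
train_x = apply_zscore([windows[i] for i in train_idx], mu, sigma)
val_x = apply_zscore([windows[i] for i in val_idx], mu, sigma)
```

Fitting the statistics on the training subset and reusing them for validation and test windows is what keeps information from the held-out subsets out of the normalization step.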
3.2 Neural Network Architectures (NN V1-V3)
3.2.1 NN V1: Baseline MLP
The first architecture (NN V1) consists of a simple multilayer perceptron (MLP) with two hidden layers.
Rectified Linear Unit (ReLU) activation functions were used, and the Adam optimizer was employed.
Training was conducted for 100 epochs with early stopping to prevent overfitting.
This model serves as a baseline reference for subsequent developments (Bachmann et al.
3.2.2 NN V2: Deeper MLP with Regularization
NN V2 introduces a significantly increased number of hidden layers and neurons per layer, improving the model's representational capacity.
To mitigate overfitting, dropout layers and L2 regularization were added.
The training horizon was extended to 200 epochs, and learning rate decay was applied to improve convergence stability.
This version is expected to generalize better under noisy conditions (Park and Jo,
3.2.3 NN V3: Optimized MLP with Advanced Training
NN V3 integrates optimization strategies such as batch normalization, adaptive learning rate schedulers, and larger batch sizes.
The architecture consists of six layers with progressively decreasing neuron counts (starting 512 → 256).
Training was conducted for 250 epochs using warm restarts in the learning rate schedule.
Compared to V1 and V2, this model demonstrates significantly improved convergence and robustness (Liu et al., 2.
3.3 The Physics-Informed Neural Network (PINN V1)
In contrast to purely data-driven neural networks, the Physics-Informed Neural Network (PINN V1) integrates governing physical laws of structural dynamics directly into the learning process.
This hybrid formulation reduces the model's reliance on large, labeled datasets while maintaining consistency with the mechanical behavior of the structure.
The PINN loss function is expressed as a weighted sum of two terms:
$$L_{\text{total}} = w_{\text{data}}\, L_{\text{data}} + w_{\text{phys}}\, L_{\text{physics}}$$
• $L_{\text{data}}$ quantifies the prediction error with respect to the measured displacements.
• $L_{\text{physics}}$ penalizes the violation of the governing differential equation.
• $w_{\text{data}}$ and $w_{\text{phys}}$ are weighting coefficients that balance data fidelity against physical consistency.
For the structural dynamics of the Z24 Bridge, the physics residual is derived from the equation of motion of a damped single-degree-of-freedom (SDOF) oscillator:
$$m\,\ddot{u}(t) + c\,\dot{u}(t) + k\,u(t) = F(t)$$
where $m$, $c$, and $k$ represent the equivalent mass, damping, and stiffness of the structural system, and $F(t)$ denotes external excitation.
Under ambient or operational excitation, the Z24 Bridge response is dominated by the first bending mode in the low-frequency range.
Therefore, it is standard to approximate the input-output dynamics using an equivalent damped SDOF model around the fundamental mode, capturing the primary stiffness-damping interaction while avoiding over-parameterization.
This assumption aligns with vibration-based SHM practice on Z24 and similar spans and is widely adopted in physics-guided identification when the first mode carries most of the energy.
We explicitly leverage this proxy to regularize learning and preserve interpretability in terms of modal quantities (natural frequency and damping) (Felahi, 2.
The neural network is trained to approximate the displacement response $u(t)$.
Automatic differentiation (AD) is used to compute $\dot{u}(t)$ and $\ddot{u}(t)$ with respect to normalized time, ensuring smooth and exact gradients.
The residual of the governing equation defines the physics-based component of the loss:
$$L_{\text{physics}} = \sum_{i=1}^{N_{\text{coll}}} \left( \ddot{u}(t_i) + 2\zeta\omega_0\,\dot{u}(t_i) + \omega_0^2\,u(t_i) \right)^2$$
where $\zeta$ is the damping ratio, $\omega_0$ the natural frequency, and $N_{\text{coll}}$ the number of collocation points.
In the present implementation, we considered a pure physics configuration ($w_{\text{data}} = 0.0$, $w_{\text{phys}} = 1.0$) to emphasize physical consistency.
However, the framework allows the introduction of data-driven supervision by adjusting $w_{\text{data}} > 0$ when sufficient high-quality measurements are available.
This adaptability highlights the main advantage of PINNs: they can function either as hybrid data-physics models or as physics-dominant solvers, depending on application needs (Bowman et al., 2023; Farea et al., 2024; Torres et al., 2025; Anon., 2025a,
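To make the weighted loss and the SDOF residual concrete, the sketch below evaluates the free-vibration residual of the damped oscillator on a set of collocation points. It is an illustration under stated assumptions, not the authors' TensorFlow implementation: central finite differences stand in for automatic differentiation, the candidate response is a closed-form function instead of a neural network, and the values of the damping ratio, natural frequency, and collocation grid are chosen only for the example.

```python
import math

def sdof_residual(u, t, zeta, omega0, h=1e-4):
    """r(t) = u'' + 2*zeta*omega0*u' + omega0**2*u.

    Derivatives are approximated by central differences here; a PINN would
    obtain them by automatic differentiation of the network output instead.
    """
    du = (u(t + h) - u(t - h)) / (2.0 * h)
    d2u = (u(t + h) - 2.0 * u(t) + u(t - h)) / (h * h)
    return d2u + 2.0 * zeta * omega0 * du + omega0 ** 2 * u(t)

def physics_loss(u, t_coll, zeta, omega0):
    """Sum of squared residuals over the collocation points."""
    return sum(sdof_residual(u, t, zeta, omega0) ** 2 for t in t_coll)

def total_loss(l_data, l_physics, w_data=0.0, w_phys=1.0):
    """Weighted sum of data and physics terms (defaults: pure-physics setting)."""
    return w_data * l_data + w_phys * l_physics

def free_decay(zeta, omega0):
    """Closed-form free response of an underdamped SDOF oscillator."""
    omega_d = omega0 * math.sqrt(1.0 - zeta ** 2)
    return lambda t: math.exp(-zeta * omega0 * t) * math.cos(omega_d * t)

# Illustrative parameters (assumed): 2% damping, 4 Hz expressed in rad/s.
zeta, omega0 = 0.02, 2.0 * math.pi * 4.0
t_coll = [i * 0.01 for i in range(100)]          # 100 collocation points

# A response that satisfies the governing equation gives a near-zero loss,
# while a response with the wrong frequency is heavily penalized.
loss_exact = physics_loss(free_decay(zeta, omega0), t_coll, zeta, omega0)
loss_wrong = physics_loss(free_decay(zeta, 1.5 * omega0), t_coll, zeta, omega0)
```

The contrast between `loss_exact` and `loss_wrong` is the mechanism by which the physics term steers training toward responses compatible with the assumed first-mode dynamics; setting `w_data > 0` in `total_loss` would add measurement supervision on top of it.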
3.4 Training Protocol
All models (NN V1-V3 and PINN V1) use the same sensor (acc_09), identical preprocessing, and the same stratified train/validation/test split with fixed random seeds. No data augmentation was applied.
Hyperparameters were tuned once per model family using the validation split.
This design isolates the family effect (purely data-driven vs. physics-informed), preventing configuration artifacts from influencing the trade-off. All models were implemented in TensorFlow/Keras.
The following training protocol was applied:
• Optimizer: Adam (initial lr = 0.001 for NN, 0. for PINN).
• Batch size: 64 for NN V1-V2, 128 for NN V3, and 64 for PINN.
• Epochs: 100 (NN V1), 200 (NN V2), 250 (NN V3), 150 (PINN).
• Regularization: Dropout (0.2-0.), L2 penalty ($10^{-3}$).
• Early stopping: patience = 20 epochs on validation.
• Normalization: StandardScaler fitted on training data only.
• Loss functions: Categorical cross-entropy (NN), hybrid loss (PINN).
Random seeds were fixed, and results were averaged over three runs to ensure reproducibility.
3.5 Hyperparameters Summary
The main hyperparameters for each model are summarized in Table 3.
Table 3. Hyperparameters of NN and PINN models.
Model | Layers / Units | Activation | Optimizer | Learning Rate | Epochs | Batch Size | Notes
NN V1 | 2 hidden | ReLU | Adam | | 100 | 64 | Baseline
NN V2 | 3 hidden (middle layer: 64) | ReLU | Adam | | 200 | 64 | Deeper
NN V3 | 4 hidden (inner layers: 128, 64) | ReLU | Adam | | 250 | 128 | Improved convergence
PINN V1 | 3 hidden (middle layer: 64) | Tanh | Adam | 1e-3 → 1e-5 | 150 | 64 | Physics-informed loss function (data term optional); w_data / w_phys = 0 / 1 (pure physics), tunable
3.6 Implementation Details
• Hardware: Models were trained on HPCMARWAN, using CPUs and GPUs when available.
• Software: Python 3, TensorFlow 2, SciPy, scikit-learn, Matplotlib.
• Runtime: Training NN V3 required 1.5 h on GPU, while PINN V1 took 2.3 h due to the physics loss.
• Evaluation metrics: Accuracy, loss, ROC-AUC, and physics diagnostics (L2 and L∞ norms of the residual).
MiniRocket and WaveNet are discussed as literature benchmarks. Because performance is highly sensitive to preprocessing and split strategy, we do not claim strict numerical superiority over external reports.
Instead, we provide a controlled side-by-side comparison of NN V1-V3 and PINN V1 under a common preprocessing pipeline and dataset split, reporting confusion matrices, macro precision/recall/F1, and macro AUC to emphasize false negatives and robustness beyond overall accuracy.
4 RESULTS
The performance of the developed models was evaluated using the Z24 Bridge dataset.
Three purely data-driven neural networks (NN V1-V3) and one physics-informed neural network (PINN V1) were trained and validated on the same experimental data.
This section presents the quantitative metrics, training behavior, physical diagnostics, and comparative analyses among the different model versions.
4.1 Training and Validation Behavior
The evolution of training and validation losses illustrates the progressive improvement across the three NN variants.
NN V1 (shallow multilayer perceptron, 100 epochs) exhibited comparatively high variance in validation accuracy.
Although convergence was achieved after approximately 60 epochs, noticeable fluctuations in accuracy and loss were observed (see Figure 3).
NN V2 (deeper architecture, 200 epochs) demonstrated smoother convergence, with validation accuracy stabilizing around 97.25% (see Figure 4).
NN V3 (optimized architecture with advanced regularization, 250 epochs) showed the most stable behavior, reaching a validation accuracy of 97.71% (see Figure 5) with reduced variance.
Figures 3-5 are explicitly discussed in the text to highlight the distinct convergence profiles of NN V1-V3.
Each figure is referenced where its behavior is analyzed to avoid presenting visuals without narrative context.
Figure 3. Accuracy/Loss curves for NN V1.
Figure 4. Accuracy/Loss curves for NN V2.
The consolidated validation metrics in Table 4 confirm that all NN variants achieve macro-level performance above 97% accuracy, with NN V3 attaining a perfect macro-AUC (1.00) on the validation split.
Because the physics residual term was added to the loss function, PINN V1 exhibited slightly slower convergence, as expected.
However, its validation performance remained stable (approximately 93%) with minimal oscillations.
After analyzing the training and validation behavior, we summarize the main validation results for NN V1-V3 and PINN V1.
Table 4 reports macro-level metrics used throughout the discussion.
Figures 3-5 display the training and validation trajectories for NN V1, NN V2, and NN V3, respectively (distinct colors for training versus validation, larger fonts, and thicker lines for readability in a two-column layout).
All three networks converge stably, with NN V3 exhibiting the smoothest validation curve and fastest convergence. The gap between training and validation curves remains modest for NN V1-V3, indicating controlled overfitting under the shared split and preprocessing pipeline.
Tables 4 and 5 together consolidate the validation-set evidence for both model families.
Table 4 reports macro-level classification metrics for NN V1-V3, while Table 5 summarizes physics diagnostics for PINN V1.
Figure 5.
Accuracy/Loss curves for NN V3.
Table 4. Validation metrics (accuracy, macro-F1, macro ROC-AUC, FN) for NN V1-V3 on the Z24 validation split. "-" = not applicable.
Columns: Metric | NN V1 | NN V2 | NN V3
Rows: Val. Accuracy (%); Macro-F1 (%); Macro ROC-AUC; False Negatives
Table 5. Physics diagnostics for PINN V1 on Z24 (classification metrics not directly comparable).
Columns: Metric | Value
Rows: $|r|$; $P_{95}(|r|)$; $\omega_0$; $\omega_{\text{est}}$; $|\omega_{\text{est}} - \omega_0|/\omega_0$ (%)
NN V3 delivers the strongest macro-level performance (97.71% accuracy, macro-F1 97.60%, macro ROC-AUC 1.00) with zero false negatives, while NN V1 and NN V2 remain close (97.71% and 97.25% accuracy, respectively).
For PINN V1, low residuals ($|r| \approx 1.205 \times 10^{-3}$, $P_{95}(|r|) \approx 6.66 \times 10^{-2}$) and a coherent low-frequency signature ($\omega_0 \approx 4$ Hz, $\omega_{\text{est}} \approx 6.9$ Hz) indicate physically plausible trajectories, although classification metrics are not directly comparable.
In Section 5.2, we complement these findings with test-set results, and in Section 5.4, we discuss when physics-informed regularization may be preferable to purely data-driven learning.
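For readers who want to reproduce the macro-level quantities named in Table 4, the sketch below derives accuracy, macro precision/recall/F1, and a false-negative count from a multi-class confusion matrix. It is illustrative only: the confusion matrix is a made-up toy example (not the paper's results), and counting a false negative as "a damaged window predicted as healthy" is an assumption, since the text does not spell out its definition.

```python
def macro_metrics(cm):
    """Accuracy and macro-averaged precision/recall/F1.

    cm[i][j] = number of samples whose true class is i and predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = cm[k][k]
        predicted_k = sum(cm[i][k] for i in range(n))   # column sum
        actual_k = sum(cm[k])                           # row sum
        p = tp / predicted_k if predicted_k else 0.0
        r = tp / actual_k if actual_k else 0.0
        f1 = 2.0 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f1)
    return {
        "accuracy": accuracy,
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }

def missed_damage(cm, healthy=0):
    """Assumed FN definition: damaged samples (true class != healthy) predicted healthy."""
    return sum(cm[i][healthy] for i in range(len(cm)) if i != healthy)

# Hypothetical 3-class example: class 0 = healthy, classes 1-2 = damage states.
cm = [
    [48, 2, 0],
    [1, 45, 4],
    [0, 3, 47],
]
metrics = macro_metrics(cm)
fn_count = missed_damage(cm)
perfect = macro_metrics([[5, 0], [0, 5]])
```

Macro averaging weights every class equally, so a rare damage state that is poorly recognized lowers the score even when overall accuracy stays high, which is why it is reported alongside accuracy.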
4.2 Quantitative Performance Metrics
On the held-out test set, NN V3 achieves 99.54% accuracy with a macro ROC-AUC of 1.000, confirming the validation ranking
ee Tables 5 and .
Nevertheless, PINN V1 offers complementary advantages: although slightly less accurate, its predictions maintain physical consistency and are less prone to overfitting.
Notably, the number of false negatives is zero across NN V1-V3 on the validation set, reducing the risk of missed damage at the macro level.
4.3 Physics-Informed Diagnostics
For PINN V1, physics residuals remain low ($|r| \approx 1.205 \times 10^{-3}$, $P_{95}(|r|) \approx 6.66 \times 10^{-2}$), and both $\omega_0 \approx 4$ Hz and $\omega_{\text{est}} \approx 6.9$ Hz lie within the low-frequency band of interest (see Table 5).
These findings indicate that PINN V1 respects the system's physical dynamics, ensuring that the learned representation reproduces structural vibration behavior in addition to classifying states.
Both plots confirm that the residual energy is primarily concentrated at low frequencies (<10 Hz), consistent with experimental observations and corresponding to the dominant vibration modes of the Z24 Bridge.
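The residual diagnostics quoted above (mean absolute residual, its 95th percentile, and the relative gap between reference and estimated frequency) can be computed along the following lines. This is a hedged sketch: the function names, the linear-interpolation percentile rule, and the demonstration residuals are assumptions made for illustration; only the two frequencies passed to `relative_frequency_gap` are the rounded values quoted in the text.

```python
import math

def residual_summary(residuals):
    """Return (mean |r|, 95th percentile of |r|) for a list of residual values."""
    abs_r = sorted(abs(r) for r in residuals)
    mean_abs = sum(abs_r) / len(abs_r)
    # 95th percentile by linear interpolation between neighboring order statistics
    pos = 0.95 * (len(abs_r) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    p95 = abs_r[lo] + (abs_r[hi] - abs_r[lo]) * (pos - lo)
    return mean_abs, p95

def relative_frequency_gap(f_ref, f_est):
    """|f_est - f_ref| / f_ref, expressed in percent."""
    return abs(f_est - f_ref) / f_ref * 100.0

# Synthetic residuals with alternating sign (hypothetical, for demonstration only).
demo_residuals = [(-1) ** i * i / 100 for i in range(101)]
mean_abs, p95 = residual_summary(demo_residuals)

# Rounded frequencies quoted in the text: reference 4 Hz, estimated 6.9 Hz.
gap_percent = relative_frequency_gap(4.0, 6.9)
```

With those rounded inputs the relative gap evaluates to roughly 72.5%, which gives a sense of how far the estimated frequency sits from the first-mode reference even though both lie in the low-frequency band.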
4.4 Comparative Analysis
The comparison between the NN and PINN families highlights two complementary strengths:
NN V3 outperforms earlier versions and approaches state-of-the-art data-driven methods such as MiniRocket and WaveNet (literature benchmarks)
, achieving the highest raw classification performance .
Table .
Although slightly less accurate, PINN V1 provides the distinct advantage of residual diagnostics and physical interpretability, both of which are essential in structural engineering practice.
Thus, while purely data-driven models maximize predictive performance, hybrid approaches offer robustness, interpretability, and greater confidence for engineering decision-making.
5 Key Findings A From NN V1 Ie NN V3, there is a noticeable increase in accuracy and stability.
A PINN V1 maintains consistency with physical dynamics while achieving comparable accuracy.
A The physical coherence of PINN residuals is confirmed by the diagnostic plots .
Figure .
A Hybridization .
ata physic.
offers a balanced solution, particularly suitable for SHM applications.
5 DISCUSSION
5.1 Performance of data-driven models (NN V1–V3)
The three generations of neural networks (NN V1, NN V2, and NN V3) demonstrate progressively improved performance as architectural complexity and hyperparameter tuning advance.
Under the unified split and preprocessing pipeline, NN V1–V3 achieve validation accuracies of 97.71%, 97.25%, and 97.71%, respectively (see the corresponding results table), with variance decreasing from V1 to V3 due to increased depth and stronger regularization.
Nevertheless, despite their high accuracy, neural networks remain highly dependent on the quality and quantity of available data, and their interpretability is limited.
Performance can also be sensitive to training configurations, particularly in earlier variants, where higher variance and less stable convergence are observed.
In practice, compact transform-based methods such as MiniRocket are attractive when computational efficiency and deployment simplicity are priorities, whereas WaveNet-style dilated convolutions are well suited for longer receptive fields at the expense of heavier tuning.
NN V3 achieves comparable macro-level metrics using a simpler MLP architecture, while PINN V1 sacrifices some raw accuracy in exchange for improved physical coherence, an important feature when operational decisions depend on interpretable structural dynamics rather than purely black-box predictions.
5.2 Contributions and limitations of PINN V1
The PINN V1 model adopts a distinct philosophy.
Rather than relying solely on data, it explicitly incorporates the differential equations governing structural dynamics.
The results show that PINN V1 achieves accuracy comparable to NN V2 (≈92%) with a similar AUC.
However, its primary contribution lies in robustness: even under sparse or noisy data conditions, predictions remain dynamically consistent by adhering to the physical constraints of a damped oscillator representative of the Z24 Bridge.
This is particularly important in SHM, where real damage data are often scarce.
Because the loss function includes physics-based terms and relies on automatic differentiation, the approach entails higher computational cost.
Moreover, selecting the weighting coefficients for the physics component (w_phys) and the data-driven component (w_data) remains a critical design choice that must be calibrated according to the specific case study.
Figure 6. Residual distribution and power spectral density (PSD) of the PINN V1 residuals.
While the SDOF prior stabilizes learning and enhances interpretability, it inevitably abstracts higher-mode and coupling effects.
Two implications follow.
First, physics residuals reflect consistency relative to dominant-mode dynamics rather than full MDOF behavior.
Second, spectral mismatches at off-resonant frequencies may arise when additional modes are excited.
This helps explain the gap between the reference frequency ω_0 and the estimated ω_est in our diagnostics table and limits the claims for PINN V1 to operating regimes dominated by the first mode.
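To make the structure of such a physics-informed loss concrete, the sketch below combines a data-fit term with the residual of a damped SDOF oscillator, x'' + 2ζω_0 x' + ω_0² x = 0, weighted by w_data and w_phys. It is an illustration under stated assumptions rather than the paper's implementation: derivatives are approximated with central finite differences instead of automatic differentiation, the signals are synthetic free-decay responses, and the 4 Hz frequency, damping ratio, default weights, and function names are hypothetical.

```python
import math


def sdof_response(t, omega0, zeta, x0=1.0):
    """Free-decay displacement of an underdamped SDOF oscillator (zeta < 1),
    with x(0) = x0 and x'(0) = 0."""
    omega_d = omega0 * math.sqrt(1.0 - zeta ** 2)
    envelope = x0 * math.exp(-zeta * omega0 * t)
    return envelope * (
        math.cos(omega_d * t) + (zeta * omega0 / omega_d) * math.sin(omega_d * t)
    )


def physics_residuals(x, dt, omega0, zeta):
    """Residual r = x'' + 2*zeta*omega0*x' + omega0^2*x at interior samples,
    using central finite differences (a stand-in for automatic differentiation)."""
    res = []
    for i in range(1, len(x) - 1):
        vel = (x[i + 1] - x[i - 1]) / (2.0 * dt)
        acc = (x[i + 1] - 2.0 * x[i] + x[i - 1]) / dt ** 2
        res.append(acc + 2.0 * zeta * omega0 * vel + omega0 ** 2 * x[i])
    return res


def mse(values):
    return sum(v * v for v in values) / len(values)


def composite_loss(pred, target, dt, omega0, zeta, w_data=1.0, w_phys=0.1):
    """Return (total, data_term, phys_term) with total = w_data*data + w_phys*phys."""
    data_term = mse([p - y for p, y in zip(pred, target)])
    phys_term = mse(physics_residuals(pred, dt, omega0, zeta))
    return w_data * data_term + w_phys * phys_term, data_term, phys_term


# Illustrative values only (not taken from the paper).
omega0 = 2.0 * math.pi * 4.0  # assumed 4 Hz natural frequency
zeta = 0.02                   # assumed damping ratio
dt = 1e-3
t = [i * dt for i in range(1000)]
target = [sdof_response(ti, omega0, zeta) for ti in t]
detuned = [sdof_response(ti, 1.2 * omega0, zeta) for ti in t]  # wrong frequency

loss_ok, data_ok, phys_ok = composite_loss(target, target, dt, omega0, zeta)
loss_bad, data_bad, phys_bad = composite_loss(detuned, target, dt, omega0, zeta)
print(f"consistent signal: data={data_ok:.3e}, physics={phys_ok:.3e}")
print(f"detuned signal:    data={data_bad:.3e}, physics={phys_bad:.3e}")
```

In this toy setting the physics term is several orders of magnitude larger than the data term for the detuned signal, which illustrates why the relative scale of w_phys and w_data has to be calibrated per case study.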
5.3 Comparison with recent methods (MiniRocket and WaveNet)
Placing our results in the context of existing literature provides useful perspective.
In time-series classification, MiniRocket (Dempster, Schmidt and Webb, 2021) has demonstrated strong performance, achieving accuracies of approximately 91–94% with exceptional computational efficiency.
However, its interpretability in SHM is limited because it relies solely on statistical transformations and does not incorporate physical constraints.
WaveNet-based approaches adapted for vibration analysis have reported accuracies between 90% and 93% (Wang and Liu, 2022).
Although their dilated convolutional architecture is well suited for sequential data, performance remains sensitive to hyperparameter tuning and does not guarantee physical consistency.
Reported figures for MiniRocket and WaveNet are indicative, as their absolute performance is highly sensitive to preprocessing, windowing strategy, and class distribution.
Direct re-implementations under a unified split are planned as future work; here, literature values are used only to contextualize trends.
MiniRocket and WaveNet (Table 6) are included as literature baselines to contextualize model selection.
Because reported performance is sensitive to preprocessing, windowing, and split strategy, we do not claim strict numerical superiority over external studies.
Instead, Table 6 contrasts algorithmic characteristics and typical use cases: MiniRocket offers extremely fast feature extraction with competitive accuracy, making it suitable for rapid screening or edge deployment, while WaveNet captures rich temporal dependencies at the cost of higher computational demand.
In our controlled setting with a unified preprocessing pipeline, NN V3 achieves excellent accuracy and macro-AUC, whereas PINN V1 sacrifices a small degree of raw accuracy in exchange for interpretability and physics-guided robustness.
5.4 Critical analysis and practical implications
The comparative results summarized in Table 6 clearly illustrate the trade-offs among the tested approaches.
Data-driven models (NN V1–V3) progressively improve in validation accuracy and AUC as architectural depth and regularization increase, culminating in NN V3 with 97.7% validation accuracy and a perfect macro ROC-AUC.
However, these gains come with strong dependence on training data and limited interpretability, which may restrict direct adoption in structural engineering practice.
By contrast, PINN V1 achieves slightly lower accuracy (≈92%) but ensures physical consistency by embedding the governing equations of motion into the learning process.
This enhances robustness under noisy or limited data conditions, an essential feature in SHM, where damage scenarios are rarely abundant.
Although computational cost is higher, the gains in interpretability and reliability make PINNs particularly attractive for real-world bridge monitoring.
MiniRocket provides competitive accuracy (91–94%) with exceptional speed but remains purely statistical.
WaveNet achieves 90–93% accuracy yet remains sensitive to hyperparameter selection and lacks physical guarantees.
Compared with these approaches, the proposed PINN framework balances predictive performance and physical coherence, addressing a critical gap in SHM applications.
From a practical standpoint, these findings suggest a differentiated usage:
- NN V3 and similar deep architectures are best suited for data-rich environments with continuous multi-sensor monitoring.
- PINN V1 and related variants are more appropriate in scenarios with sparse or noisy data, where physical consistency is essential for engineering decision-making.
- Hybrid frameworks combining the predictive strength of NNs with the robustness of PINNs represent a promising direction for future SHM.
The use of a single accelerometer restricts damage observability to local modal content and may miss spatial patterns detectable by sensor arrays.
This limits generalizability to large-span bridges with mode coupling and environmental confounding.
Therefore, the present results should be viewed as a lower-bound demonstration; multi-sensor PINNs and sensor-fusion NN architectures represent natural next steps.
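Because the comparison reports both accuracy and macro ROC-AUC, it helps to recall that the two metrics measure different things: accuracy scores the arg-max decision, whereas one-vs-rest AUC scores how well each class is ranked. The sketch below is a plain-Python illustration with hypothetical class probabilities (not data from this study) in which the macro AUC is exactly 1.0 while one sample is still misclassified.

```python
def binary_auc(scores, labels):
    """AUC as P(score_pos > score_neg) + 0.5 * P(tie); labels are 0/1."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_ovr_auc(prob_rows, y_true):
    """Unweighted mean of one-vs-rest AUCs over all classes."""
    n_classes = len(prob_rows[0])
    aucs = []
    for c in range(n_classes):
        scores = [row[c] for row in prob_rows]
        labels = [1 if y == c else 0 for y in y_true]
        aucs.append(binary_auc(scores, labels))
    return sum(aucs) / n_classes


def accuracy(prob_rows, y_true):
    """Fraction of samples whose arg-max class equals the true class."""
    preds = [max(range(len(r)), key=r.__getitem__) for r in prob_rows]
    return sum(p == y for p, y in zip(preds, y_true)) / len(y_true)


# Hypothetical predicted probabilities for three classes (illustrative only).
probs = [
    (0.70, 0.20, 0.10),
    (0.60, 0.30, 0.10),
    (0.20, 0.70, 0.10),
    (0.45, 0.40, 0.15),  # true class 1, but arg-max picks class 0
    (0.10, 0.20, 0.70),
    (0.20, 0.10, 0.70),
]
y_true = [0, 0, 1, 1, 2, 2]

macro_auc = macro_ovr_auc(probs, y_true)
acc = accuracy(probs, y_true)
print(f"macro OvR AUC = {macro_auc:.2f}, accuracy = {acc:.3f}")
```

The pairwise AUC used here is quadratic in the number of samples; for real validation sets a rank-based library routine would be the usual choice.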
Table 6. Comparative summary of NN and PINN models versus baselines on the Z24 dataset (validation/test accuracy and macro AUC). Literature baselines are indicative ranges; see cited works.
| Model | Type | Test accuracy (%) | Macro ROC-AUC | Strengths | Weaknesses |
|---|---|---|---|---|---|
| NN V1 | Data-driven | | | Simple, fast baseline | High variance, less stable |
| NN V2 | Data-driven | | | Deeper, more stable | Still data-dependent |
| NN V3 | Data-driven | | | Best accuracy, fast convergence | Black-box, overfitting risk |
| PINN V1 | Hybrid (PINN) | ≈92 | | Physically consistent, robust | Slower training, costly HPC |
| WaveNet* | Baseline | ≈90–93 | | Good on sequences (1D conv.) | Requires tuning, less robust |
| MiniRocket* | Baseline | ≈91–94 | | Very fast, scalable | Purely statistical, no physics |
*Dabbous et al.
5.5 Limitations and perspectives
Our study has several limitations.
First, the analysis relies on signals from a single sensor of the Z24 Bridge, whereas multi-sensor networks would better capture global vibration modes.
Second, the adopted physical modeling (equivalent SDOF) remains a simplification of the actual dynamics of a multi-span bridge.
Third, PINN training is computationally demanding, requiring the use of the HPC-MARWAN cluster.
These limitations open clear perspectives (HPC-MARWAN):
- Development of HPINNs (Hierarchical PINNs) or SPINNs (Sparse PINNs) capable of handling multi-degree-of-freedom structures.
- Integration of multi-sensor data to capture richer vibration signatures.
- Optimization of w_data/w_phys weights through adaptive or Bayesian strategies.
- Broader exploitation of HPC infrastructures to accelerate training.
All results are based on a single vertical accelerometer to ensure like-for-like comparability across NN and PINN models.
While this reduces variability due to sensor heterogeneity, it also limits sensitivity to localized or mode-specific effects.
Extending the approach to multi-sensor fusion and to multi-degree-of-freedom PINNs that encode multiple modes represents an important next step to improve generalization in real-world deployments.
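As one possible reading of the adaptive weighting perspective listed above, the physics weight can be nudged during training so that the weighted physics term stays at a chosen ratio to the data term. The sketch below is a hypothetical heuristic, not a method from this study: the function name, the exponential-moving-average update, and the constant loss values in the demo are all assumptions, and gradient-based or Bayesian schemes would be alternatives.

```python
def balanced_w_phys(w_phys, data_loss, phys_loss,
                    target_ratio=1.0, momentum=0.9, eps=1e-12):
    """One smoothed update of the physics weight (w_data is kept at 1).

    The target weight makes w_phys * phys_loss equal target_ratio * data_loss;
    the exponential moving average avoids abrupt jumps between iterations.
    """
    target = target_ratio * data_loss / (phys_loss + eps)
    return momentum * w_phys + (1.0 - momentum) * target


# Demo with constant, made-up loss values; in real training both terms change
# at every iteration and the weight tracks their ratio.
w_phys = 1.0
history = []
for _ in range(200):
    w_phys = balanced_w_phys(w_phys, data_loss=0.5, phys_loss=50.0)
    history.append(w_phys)

print(f"final w_phys = {w_phys:.6f}, weighted physics term = {w_phys * 50.0:.6f}")
```

With these inputs the weight settles where the weighted physics term matches the data term; choosing `target_ratio` remains a case-specific decision.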
6 CONCLUSION AND FUTURE WORK
This study presented a comparative analysis of three successive generations of data-driven neural networks (NN V1–V3) and a physics-informed neural network (PINN V1) applied to the well-established Z24 Bridge benchmark dataset.
NN V3 achieved the highest predictive performance, reaching approximately 97.7% validation accuracy with a macro ROC-AUC of 1.00, outperforming earlier versions (NN V1 and NN V2).
PINN V1, although slightly less accurate (≈92%), demonstrated superior training stability and ensured consistency with the underlying structural dynamics.
These findings confirm that physics-informed approaches complement purely data-driven methods by embedding engineering knowledge directly into the learning process (Raissi et al., 2019; Dabbous et al.; Zhang and Sun).
From a practical standpoint, the two paradigms offer distinct yet complementary advantages for structural health monitoring.
Purely data-driven networks such as NN V3 are effective in scenarios with abundant training data and can achieve state-of-the-art classification performance comparable to baselines such as WaveNet (Wang and Liu, 2022; Dabbous et al.) and MiniRocket (Dempster et al., 2021).
However, their black-box nature and dependence on large datasets limit interpretability and robustness.
In contrast, PINNs provide physically consistent outputs that are more interpretable for engineers, particularly in safety-critical contexts where prediction reliability is essential.
This balance between accuracy and interpretability underscores the value of hybrid methods in real-world SHM applications (Karniadakis et al., 2021; Sun et al.).
Looking ahead, several perspectives emerge.
First, hybrid extensions such as Hierarchical PINNs (HPINN) and Sensor-informed PINNs (SPINN) could further improve generalization by combining multiple physical constraints with heterogeneous sensor data.
Second, future research should validate these models using multi-sensor datasets and long-term monitoring campaigns to better capture environmental and operational variability.
In data-rich monitoring regimes with stable instrumentation, purely data-driven models such as NN V3 remain state-of-the-art for macro-level classification.
In data-scarce or safety-critical settings, physics-informed formulations such as PINN V1 trade a small amount of raw accuracy for improved robustness and interpretability by constraining learning to physically credible dynamics.
We therefore advocate hybrid pipelines that select or combine these paradigms based on data availability and operational risk.
Additionally, we will leverage the HPC-MARWAN cluster to scale both model families and accelerate training for large-scale infrastructure applications.
CONFLICT OF INTEREST STATEMENT
The authors declare that there is no conflict of interest regarding the publication of this paper.
REFERENCES
Civera, M. and Surace, C., 2021. "A comparative analysis of signal decomposition techniques for structural health monitoring on an experimental benchmark". Sensors 21(5), 1825. URL: https://doi.org/10.3390/s21051825