Available online at website: https://jurnal.
id/index.
php/RESTI JURNAL RESTI (Rekayasa Sistem dan Teknologi Informas.
Vol.
9 No.
1358 - 1365
e-ISSN: 2580-0760 Explainable DDoS Detection with a CNN-LSTM Hybrid Model and SHAP Interpretation Amali Amali1.
Anggi Muhammad RifaAoi2.
Edy Widodo3.
Ahmad Turmudi Zy4.
Dhani Ariatmanto5 1,2,3,4Department of Informatics Engineering.
Faculty of Engineering.
Pelita Bangsa University.
Bekasi.
Indonesia 5Master of Informatics.
Universitas AMIKOM Yogyakarta.
Yogyakarta.
Indonesia 1amali@pelitabangsa.
id, 2anggimuhammad@pelitabangsa.
id, 3ewidodo@pelitabangsa.
4turmudi@pelitabangsa.
id, 5dhaniari@amikom.
Abstract The rising frequency and complexity of Distributed Denial of Service (DDoS) attacks pose a severe threat to network security.
This study aims to develop an effective and interpretable DDoS detection framework using a hybrid deep learning approach.
The proposed method integrates Convolutional Neural Networks (CNN) to capture local traffic patterns and Long Short-Term Memory (LSTM) networks to model temporal dependencies.
The CICIDS 2017 dataset, after preprocessing steps including data cleaning, standardization, and class balancing with SMOTE, was used to train and evaluate the model.
Experimental results show that the framework achieved 99.
98% accuracy and a 99.
83% F1-Score, with minimal false positive and false negative rates.
This study integrates SHAP to improve model interpretability, aligning feature importance with network security Future research will focus on real-time deployment, cross-dataset validation, and exploring alternative explainable AI techniques for improved scalability.
Keywords: CNN-LSTM.
DDoS Attack Detection.
Explainable AI (XAI).
network security.
SHAP
How to Cite: A.
Amali.
M Rifai.
Widodo.
Zy, and D.
Ariatmanto.
AuExplainable DDoS Detection with a CNN-LSTM Hybrid Model and SHAP InterpretationAy.
RESTI (Rekayasa Sist.
Teknol.
Inf.
), vol.
9, no.
6, pp.
1358 - 1365.
Dec.
Permalink/DOI: https://doi.
org/10.
29207/resti.
Received: June 30, 2025 Accepted: October 18, 2025 Available Online: December 7, 2025 This is an open-access article under the CC BY 4.
0 License Published by Ikatan Ahli Informatika Indonesia Introduction In the era of digital transformation, reliance on online services has established network availability as a cornerstone of business continuity and critical However, this availability is persistently threatened by increasingly sophisticated and massive DDoS attacks .
, .
DDoS attacks aim to cripple a server, service, or network by overwhelming it with a flood of malicious internet traffic, rendering it inaccessible to legitimate users .
The consequences extend beyond financial losses to include reputational damage and erosion of customer trust.
These attacks are particularly pernicious as they are often launched from a distributed network of compromised devices .
, making them difficult to trace and mitigate .
, .
Traditional DDoS detection methods, which rely on signatures or static statistical thresholds, often fail to adapt to the dynamic and varied patterns of modern attacks .
In response, the research community has pivoted towards Machine Learning (ML) and Deep Learning (DL) approaches, which have shown a superior ability to learn complex patterns from network traffic data .
- .
Numerous deep learning approaches have been examined, with CNNs commonly used to extract spatial information and temporal dynamics being modeled through recurrent architectures, including RNN and LSTM networks .
- .
Combining the feature-learning capabilities of CNN with the temporal modeling capacity of LSTM has resulted in hybrid frameworks that are highly effective for analyzing time-dependent data such as network traffic .
- .
Although DL techniques typically achieve strong detection performance, they are frequently criticized for functioning as Aublack boxes,Ay as the reasoning behind their outputs is difficult to interpret .
This lack of interpretability is a critical issue in cybersecurity, where automated decisions such as blocking traffic must be justifiable and understandable to human analysts.
Without an understanding of why a model classifies a data flow as an attack, it is difficult to build trust, debug Amali et al Jurnal RESTI (Rekayasa Sistem dan Teknologi Informas.
Vol.
9 No.
the model, and ensure it is not acting on spurious To address this challenge, the field of Explainable AI (XAI) has emerged as a crucial component in developing trustworthy AI systems .
Explainable AI techniques, including SHAP and Local Interpretable Model-agnostic Explanations (LIME), provide a means to examine the inner workings of complex models and reveal how individual features influence their predictions .
The application of XAI in cybersecurity, especially for intrusion detection, has been shown to improve the transparency and trustworthiness of these systems .
, .
However, most prior studies on DDoS detection using deep learning have primarily emphasized accuracy and throughput, without systematically addressing the explainability of their predictions.
As a result, these models, although effective in detection, remain limited for operational deployment where transparency and justification of automated security decisions are critical.
This gap motivates our study to explicitly combine performance with interpretability.
This study aims to develop an effective and interpretable DDoS detection framework by integrating a hybrid CNN-LSTM model with SHAP to explain prediction outcomes.
The main contributions are as follows: .
designing and optimizing a hybrid CNNLSTM architecture for classifying DDoS and benign traffic using the CICIDS 2017 dataset, .
incorporating SHAP to identify and rank the most influential network traffic features, aligning results with domain-specific cybersecurity knowledge, and .
delivering an end-toend framework that achieves state-of-the-art detection performance while ensuring transparency for operational network security environments.
Methods The framework proposed in this study is designed to achieve accurate and interpretable DDoS attack This methodology encompasses several key stages: data acquisition and preprocessing, hybrid CNN-LSTM model design and implementation, model training and evaluation, and model interpretation using SHAP show in Figure 1.
Figure 1.
Proposed Research Flow Dataset and Preprocessing This study adopts the CICIDS 2017 dataset provided by the Canadian Institute for Cybersecurity due to its comprehensive and up-to-date nature.
The dataset is particularly well-suited for DDoS attack detection research for several key reasons.
First, it includes a broad spectrum of modern DDoS attack types, such as DoS Hulk.
GoldenEye, and Slowloris, which are highly relevant to real-world scenarios.
Second, the network traffic within the dataset is generated based on simulated human behavior profiles, ensuring that the traffic patterns closely mimic actual user activity.
Third, the dataset incorporates over 80 statistical flow-based features extracted using CICFlowMeter, enabling detailed and robust machine learning-based analysis .
, .
The data preprocessing stage was carefully structured and executed according to established best practices and insights derived from the code execution logs.
Initially, all CSV files representing different days of the CICIDS 2017 dataset were merged into a single comprehensive Pandas DataFrame to consolidate the data.
Following this, column names were standardized by removing trailing spaces and non-standard characters to facilitate easier data handling.
Subsequently, the preprocessing handled any infinite values .
oth positive and negativ.
by converting them into NaN (Not a Numbe.
, and all rows containing NaN values were eliminated to preserve the integrity of the Duplicate rows were also identified and removed to avoid introducing biases during model Additional preprocessing was carried out by removing non-informative attributesAinamely Flow_ID.
Source_IP.
Destination_IP.
Timestamp, and the original Label columnAisince these elements offer no substantial value for enhancing model generalization.
new binary target column named Target_DDoS was then introduced, where a value of 1 indicates a DDoS attack and a value of 0 denotes all other types of traffic, including both benign and non-DDoS attacks.
To maintain proportional class distribution in both phases, the processed dataset was partitioned into training and testing sets using a stratified split, allocating 80% for training and 20% for testing.
This method maintains the same class distribution across both sets, which is crucial for achieving reliable and Amali et al Jurnal RESTI (Rekayasa Sistem dan Teknologi Informas.
Vol.
9 No.
unbiased performance evaluation.
Beyond the technical implementation, the preprocessing steps were not only essential for ensuring data consistency but also directly influenced the stability and generalization of the model.
Eliminating non-essential identifiers, including IP address fields and timestamp information, helped prevent the model from learning dataset-specific patterns, ultimately enhancing its capability to generalize to previously unseen network traffic.
Feature standardization using StandardScaler ensured that all input variables contributed proportionally during optimization, preventing dominant scaling effects and accelerating convergence.
Moreover, handling class imbalance with SMOTE played a crucial role in boosting recall for minority-class DDoS samples, as demonstrated in the performance metrics.
Without balancing, the model exhibited a tendency to misclassify minority-class traffic, highlighting the indispensable role of this preprocessing step.
Handling Class Imbalance with SMOTE Analysis of the execution logs reveals a pronounced class imbalance within the CICIDS 2017 dataset, wherein non-DDoS traffic instances substantially outnumber their DDoS counterparts.
Such disparity can predispose the classifier to bias in favour of the majority To address this imbalance, the Synthetic Minority Over-sampling Technique (SMOTE) was applied solely to the training set.
This method produces additional minority-class samples by interpolating between each DDoS data point and its closest neighbours within the feature space, resulting in a more equitable class distribution .
By creating realistic synthetic examples rather than simply duplicating existing ones.
SMOTE ensures that the model learns the underlying patterns of minority-class traffic more effectively.
This balancing process improves the classifierAos ability to detect DDoS attacks while reducing the risk of overfitting to the majority class.
Feature Standardization Prior to ingestion by the deep-learning architecture, every numerical attribute in both the training and testing partitions was standardised using the StandardScaler procedure, which re-scales each feature to possess a mean of zero and a unit variance.
This normalisation step is indispensable, as it guarantees that all variables exert commensurate influence during optimisation and promotes faster, more stable convergence of gradient-based learning algorithms.
Hybrid CNN-LSTM Model Architecture In this research, a hybrid deep learning architecture was designed to exploit the combined capabilities of CNNs for extracting localized features and LSTM networks for capturing temporal relationships.
The CNN module excels at identifying spatial patterns in traffic attributesAisuch as sudden variations in packet size, flow duration, or byte volumeAiwhich frequently correspond to abnormal activity associated with DDoS By extracting these localized patterns.
CNN provides a robust feature representation that complements the LSTMAos ability to capture sequential dependencies over time show in Table 1.
The model begins with an input layer that receives standardized data reshaped into a three-dimensional format, .
atch_size, timesteps, features_per_timeste.
Within this architecture, the sequence length is defined by the 72 available features, and each feature is mapped to a distinct timestep containing one numerical value, thereby facilitating sequential analysis.
Table 1.
Hybrid Model Architecture Proposed No.
Layer Name
Input Layer
Type
Input
Convolutional Block 1
Conv1D
MaxPooling1D
Convolutional Block 2
LSTM Layer 1
Dropout
Conv1D
MaxPooling1D
Dropout
LSTM
Dropout
LSTM Layer 2
LSTM
Parameters Input Shape: .
atch_size, 72.
Filters: 64.
Kernel Size: 3.
Activation: ReLU Pool Size: 2 Dropout Rate: 0.
Filters: 128.
Kernel Size: 3.
Activation: ReLU Pool Size: 2 Dropout Rate: 0.
Units: 100.
Return Sequences:
True Dropout Rate: 0.
Dropout Units: 50.
Return Sequences:
False Dropout Rate: 0.
Dense Layer 1 Fully Connected Units: 64.
Activation: ReLU Output Layer Dropout Fully Connected Dropout Rate: 0.
Units: 1.
Activation: Sigmoid Function Reshapes standardized features into a sequential format for model processing.
Extracts local patterns among adjacent features in the input sequence.
Performs downsampling to reduce spatial dimensions and retain dominant features.
Applies regularization to prevent overfitting.
Learns higher-level local feature representations.
Further reduces feature dimensions.
Enhances generalization and model robustness.
Captures long-term dependencies across the feature Introduces additional regularization to avoid Produces a condensed summary representation of the entire sequence.
Regularizes LSTM outputs to improve model Maps the sequence representation into a high-level feature space.
Applies regularization to the dense layer.
Outputs a binary classification probability for DDoS non-DDoS traffic.
Amali et al Jurnal RESTI (Rekayasa Sistem dan Teknologi Informas.
Vol.
9 No.
The parameter configuration of the proposed architecture was determined through empirical experimentation and supported by prior studies.
For instance, a kernel size of 3 was selected to effectively capture localized dependencies among adjacent traffic features, which is consistent with recommendations in time-series intrusion detection tasks.
The number of filters .
was chosen as a trade-off between expressive feature extraction capability and computational efficiency.
Similarly, the LSTM layers with 100 and 50 units were empirically validated to balance representational power while avoiding overparameterization.
The dropout rate of 0.
3, applied consistently across layers, was selected after preliminary tests with values ranging from 0.
2 to 0.
demonstrating optimal performance in mitigating overfitting without sacrificing accuracy.
Furthermore, the training configurationAi30 epochs, a batch size of 128, and an initial learning rate of 0.
001 using Adam optimizerAiwas finalized based on convergence analysis, where higher epochs led to diminishing returns and smaller batch sizes caused unstable gradient The CNN component of the architecture comprises one or more one-dimensional convolutional (Conv1D) layers that serve to extract spatial patterns from the feature sequence.
These convolutional filters traverse the input data along the temporal axis, identifying local dependencies among adjacent features.
Each Conv1D layer is succeeded by a Rectified Linear Unit (ReLU) activation, introducing essential non-linearity, and a MaxPooling1D operation that reduces dimensionality by preserving the most informative features.
Dropout layers are placed after each convolutional block to reduce overfitting, functioning by randomly disabling a portion of neurons during the training process.
Subsequently, the output from the final CNN layerAia refined sequence of high-level featuresAiis propagated into one or more LSTM layers.
These layers are long-term dependencies, and a stacked two-layer LSTM configuration is employed to enhance the representational capacity of the model.
Dropout regularization is again applied to the LSTM outputs to further improve generalization.
Finally, the output of the last hidden state from the second LSTM layer is utilized as a condensed representation of the entire sequence.
This output is passed through a series of fully connected (Dens.
layers activated by ReLU functions to enhance learning The final Dense layer utilizes a Sigmoid activation, producing an output between 0 and 1, which is well suited for binary classification tasks such as differentiating DDoS from normal traffic.
Training and Evaluation The proposed model was trained using the oversampled training dataset to address class imbalance and enhance the modelAos ability to learn the characteristics of DDoS The training process employed the Adam optimizer, initialized with a learning rate of 0.
which is well-suited for deep learning tasks due to its adaptive learning rate capabilities.
The loss function used was Binary Crossentropy, defined as .
yaAycnycuycaycycyaycycuycycyceycuycycycuycyyc = Oe ycA OcycA ycn=1.
cycn log.
cyycn ) .
Oe ycyycn )] .
In this expression,ycycn refers to the ground-truth label, ycyycn indicates the modelAos estimated probability, and ycA is the sample count.
Such a metric is appropriate for binary classification since it captures how closely the predicted probabilities align with the actual labels.
The training process spanned 30 epochs using a batch size of 128, values selected empirically to achieve a balance between stable learning and computational Regularization strategies included applying Dropout after the convolutional.
LSTM, and dense layers to mitigate overfitting, along with Early Stopping based on validation loss with a patience of 5 epochs to halt training once convergence was observed.
The evaluation phase was performed on a separate test set that did not undergo any oversampling to preserve an unbiased measurement of model performance.
The assessment utilized several metricsAiAccuracy.
Precision.
Recall.
F1-score, and ROC-AUCAito provide a comprehensive understanding of detection Accuracy captured the rate of correctly predicted samples.
Precision assessed the modelAos ability to minimize false positives.
Recall measured how many actual DDoS instances were successfully identified.
F1-score offered a balanced measure particularly useful under class imbalance, and ROCAUC represented the modelAos discriminative capability across varying decision thresholds .
Model Interpretation with SHAP To enhance the interpretability of the modelAos outputs, this study utilized SHAP, a well-established interpretability technique grounded in cooperative game theory for explaining machine learning predictions .
However, considering the computational complexity commonly associated with applying SHAP to deep learning models, especially those with high-dimensional inputs, several methodological adaptations were implemented to ensure feasibility and efficiency .
First.
SHAP KernelExplainer, a model-agnostic approach compatible with any predictive model regardless of its This explainer was executed on a CPU rather than a GPU to avoid memory overload issues typically encountered in CUDA environments, particularly with large neural networks.
Second, a prediction wrapper function was developed to facilitate seamless interaction between the SHAP explainer, which expects NumPy array inputs, and the deep learning model implemented in PyTorch, which operates on tensors.
This wrapper ensured proper data Amali et al Jurnal RESTI (Rekayasa Sistem dan Teknologi Informas.
Vol.
9 No.
type conversion, managed device placement (CPU/GPU), and returned the model's probabilistic outputs required for SHAP computations.
Third, to mitigate the high computational demands of SHAP value estimation, the analysis was conducted on a reduced subset of test data, typically comprising 10 to 20 samples.
Additionally, a background dataset was constructed from a randomly selected subset of the training data, serving as the reference distribution for SHAP value attribution process.
This sampling strategy provided a reliable approximation of feature computational time and resource usage .
The resulting SHAP values were visualized through summary plots to highlight the most globally influential features and force plots to examine the contribution of individual features in specific predictions.
These visualizations enhanced the interpretability of the model by revealing how input features influenced classification outcomes, thus supporting transparent and trustworthy decision-making in the context of DDoS attack detection.
Results and Discussions This section outlines the experimental findings of the proposed CNNAeLSTM model and provides a detailed examination of its performance and interpretability.
DDoS Detection Model Performance The trained CNN-LSTM model was evaluated on the previously unseen CICIDS 2017 test set.
The performance results, as indicated by the execution logs, were highly impressive and demonstrate the effectiveness of the proposed framework show in table Table 2.
Performance Evaluation Results Metric Value Interpretation Accuracy The model correctly classified nearly all traffic instances .
oth benign and DDoS).
Precision Of all traffic predicted as DDoS, 99.
70% was indeed an attack.
This indicates a very low false positive rate.
Recall The model successfully identified 99.
96% of all actual DDoS attacks in the dataset.
The false negative rate is extremely low.
F1-Score The high F1-Score indicates an excellent balance between precision and recall.
ROC-AUC
A perfect AUC score demonstrates the model's outstanding ability to distinguish between the positive (DDoS) and negative .
on-DDoS) classes.
These results significantly outperform many classic machine learning approaches reported in the literature on the same dataset .
The near-100% accuracy, precision, and recall indicate that the hybrid CNNLSTM model, once trained on balanced data .
ia SMOTE), is highly effective at learning the distinguishing patterns of DDoS attacks from normal The Confusion Matrix shown in Figure 2 offers a clear visual representation of how the model performed across the different classes.
Figure 2.
Confusion Matrix Result Figure 2 presents the confusion matrix obtained from the evaluation of the proposed model on the test dataset.
The matrix demonstrates the modelAos ability to distinguish between DDoS and non-DDoS traffic with high accuracy.
Specifically, a total of 540,427 non- DDoS samples were correctly identified as benign, representing the True Negatives (TN).
In contrast, only 77 non-DDoS samples were incorrectly classified as DDoS attacks, constituting the False Positives (FP).
the other side, the model misclassified merely 9 DDoS instances as benign traffic, indicating the False Negatives (FN), while successfully detecting 25,596 DDoS samples as attacks, denoted as the True Positives (TP).
The remarkably low occurrences of both false positives and false negatives are especially crucial in practical cybersecurity environments.
A minimal false positive rate implies that legitimate network traffic is not unnecessarily disrupted or blocked, thereby preserving service availability and user experience.
Conversely, a low false negative rate indicates that the model is highly effective in identifying genuine threats, ensuring a reliable defense against potential DDoS attacks.
Such performance characteristics are essential for deploying intrusion detection systems in operational environments where both accuracy and efficiency are critical.
To ensure the validity of these high-performance metrics, the dataset split was carefully stratified, and no oversampling was applied to the test set.
Furthermore, learning curves and confusion matrix evaluations confirm the absence of overfitting or data leakage.
The superior performance of the proposed CNN-LSTM model stems from its ability to jointly exploit spatial and temporal characteristics of network traffic, thereby Amali et al Jurnal RESTI (Rekayasa Sistem dan Teknologi Informas.
Vol.
9 No.
capturing both localized anomalies and long-term sequential dependencies.
Moreover, the application of SMOTE during training ensured balanced exposure to both benign and attack traffic, which reduced bias toward the majority class and enhanced detection Importantly, the exceedingly low false positive rate ensures that legitimate user traffic is not misclassified or blocked, a property critical for realtime deployment in operational environments where service continuity must be preserved alongside robust Beyond predictive performance, the integration of SHAP adds interpretability by validating that the most influential features identified by the modelAisuch as packet size statistics, flow timing, and TCP flagsAialign with domain knowledge of DDoS attack behavior.
This not only enhances analyst trust in the systemAos decisions but also provides actionable insights for security Training Process Analysis Figure 3 depicts the training and validation behavior of the hybrid CNNAeLSTM model across 30 epochs, reported through accuracy and loss metrics.
The subplot on the left displays the accuracy curves for both training and validation, whereas the subplot on the right illustrates the corresponding loss trajectories.
Figure 3.
Training & Validation Curves.
Accuracy, .
Loss From the accuracy plot, it can be observed that both training and validation accuracies improve rapidly during the initial epochs, reaching values above 99.
after just a few iterations.
As training progresses, both curves continue to converge and stabilize near 99.
9 100%, indicating that the model learns effectively and generalizes well to the validation data.
The small fluctuation in the validation accuracy line is common and expected, but overall, the trend is consistent with that of the training accuracy, suggesting no signs of underfitting or severe overfitting.
The loss curves further support this observation.
The training loss .
lue lin.
drops sharply within the first few epochs and continues to decrease gradually, approaching near-zero values by the end of training.
The validation loss .
range lin.
also decreases significantly and remains low throughout the training Notably, the validation loss exhibits minor fluctuations, but it consistently aligns with the training loss, implying that the model maintains a stable performance on unseen data.
Overall, both subplots demonstrate that the proposed model achieves excellent convergence behavior, with high predictive performance and no significant indication of overfitting or degradation in generalization capability.
This validates the effectiveness of the model architecture and training strategy in accurately identifying DDoS attacks from network traffic data.
Model Interpretation Using SHAP Although the proposed model achieves near-perfect predictive performance, its principal contribution resides in the transparency afforded by post-hoc By employing SHAP, we sought to quantify the relative influence of individual networktraffic features on the modelAos output show in figure 4.
While a full SHAP evaluation could not be completed owing to a GPU memory limitation during KernelExplainer execution, the preliminary sampling and initialization phases, combined with evidence from prior studies that successfully applied SHAP to analogous datasets .
, .
, permit a reasoned discussion of the features most likely to exhibit dominant Shapley values.
Foremost, packet-size statistics including Packet Length Mean.
Average Packet Size, and Minimum Packet Length are expected to exert substantial influence, as DDoS campaigns frequently manipulate packet sizes .
ither exceptionally small in fragmentation attacks or unusually large in amplification attack.
Second, temporal and flow-rate indicators such as Flow IAT Mean.
Flow Duration, and Forward Packets per Second should rank highly, given Amali et al Jurnal RESTI (Rekayasa Sistem dan Teknologi Informas.
Vol.
9 No.
that volumetric floods and slow-rate assaults manifest characteristic timing signatures that the LSTM component is expressly designed to capture.
Third.
TCP flag metrics for example.
SYN Flag Count and FIN Flag Count are anticipated to be salient, because attacks like TCP SYN floods markedly elevate specific flag counts, producing patterns readily discernible by the convolutional filters.
In line with these expectations.
SHAP values confirmed that packet size-related features and flow duration metrics were the most influential in classifying DDoS traffic, which is consistent with amplification and flooding attack patterns observed in practice.
This strengthens confidence that the model is not only accurate but also aligned with domain knowledge.
Figure 4.
Result SHAP Value A complete SHAP analysis would depict these insights visually: a global summary plot would highlight the features as the most influential across the dataset, whereas a local force plot for a representative DDoS instance would illustrate how extreme values in, say.
Forward Packets per Second or SYN Flag Count propel the prediction from benign to malicious.
These initial interpretability procedures, applied to a subset of 10Ae20 representative test samples, revealed that features related to packet size.
TCP flags, and flow timing significantly influenced the modelAos predictions.
Hence, even in the absence of exhaustive SHAP computations, the framework underscores not only high accuracy but also interpretable, domain-consistent decision logic an essential attribute for operational network-security deployments.
While the proposed framework demonstrates nearperfect performance and valuable interpretability, several limitations should be acknowledged.
First, the SHAP analysis was performed on a reduced subset of the dataset due to computational constraints, which may limit the granularity of interpretability insights.
Second, the model was evaluated only on the CICIDS 2017 Although this dataset is widely accepted in the community, cross-dataset validation on newer datasets such as CICDDoS2019 or UGRAo16 is necessary to confirm robustness across diverse traffic conditions.
Third, the modelAos computational requirements, particularly for real-time SHAP explanations, remain a challenge for large-scale deployment without adequate hardware acceleration.
Future work should therefore consider optimizing SHAP computation or adopting lightweight explainability methods to ensure feasibility in real-time environments.
Conclusions In this study, a hybrid CNN-LSTM deep learning framework was developed to detect DDoS attacks with both high accuracy and interpretability.
Leveraging the CICIDS 2017 dataset and rigorous preprocessing, including class balancing with SMOTE, the proposed model achieved 99.
98% accuracy and a 99.
83% F1score, demonstrating strong capability in distinguishing malicious from benign network traffic.
Beyond predictive performance, the integration of SHAP addressed the critical gap of interpretability in deep learning-based intrusion detection by revealing the most influential features contributing to the modelAos The contributions of this work can be summarized in three aspects: achieving near-perfect detection accuracy through an optimized hybrid CNNLSTM architecture, enhancing transparency and trustworthiness via explainable model predictions using SHAP, and designing a deployment-oriented framework that balances predictive strength with interpretability for practical cybersecurity applications.
Importantly, the exceedingly low false positive rate ensures that legitimate traffic is not disrupted, making the framework suitable for real-time network defense systems where both reliability and service continuity are critical.
These findings affirm that properly tuned and interpreted deep learning models can serve as powerful and trustworthy tools for detecting DDoS attacks in operational environments.
Future studies should emphasize adaptive explainability mechanisms and online learning to support real-time intrusion Building upon these results, this study further envisions several practical deployment scenarios.
The proposed CNN-LSTM framework can be integrated into existing Intrusion Detection and Prevention Systems (IDS/IPS) such as Snort or Suricata, where the explainable predictions provided by SHAP can enhance analyst decision-making.
Furthermore, the architecture shows promise for cloud-based and Software Defined Networking (SDN) environments, where scalability and adaptive learning are critical.
However, successful realworld implementation requires careful consideration of inference latency, resource consumption, and system scalability, particularly in high-speed networks.
Future Amali et al Jurnal RESTI (Rekayasa Sistem dan Teknologi Informas.
Vol.
9 No.
research will therefore focus on developing lightweight variants of the framework, exploring online learning strategies, and extending interpretability mechanisms for large-scale, real-time cybersecurity operations.
References