Indonesian Journal of Electrical Engineering and Informatics (IJEEI)
Vol. No. September 2025, pp.
ISSN: 2089-3272. DOI: 10.52549/ijeei.

CyberShieldDL: A Hybrid Deep Learning Architecture for Robust Intrusion Detection and Cyber Threat Classification

S. Venkatramulu1*, John Babu Guttikonda2, Desidi Narsimha Reddy3, Madhavi Reddy4, Sirisha5

1 Associate Professor, Department of CSE (Networks), Kakatiya Institute of Technology and Science, Warangal, Telangana, India
2 Associate Professor, Department of Computer Science & Engineering (AI & ML), Anurag Engineering College, Kodad, Telangana, India
3 Data Consultant (Data Governance, Data Analytics: Enterprise Performance Management, AI & ML), Soniks Consulting LLC, USA
4 Assistant Professor, Department of H&S (Mathematics), CVR College of Engineering, Ibrahimpatnam, Telangana, India
5 PhD Scholar, Department of Data Science, KL University, Vijayawada, Andhra Pradesh, India
Emails: svr.cse@kitsw., johnbabug@gmail.com, dn.narsimha@gmail., madhavireddy75@gmail.com, msirisha87@gmail.

Article Info
Article history: Received Jul 5, 2025; Revised Aug 17, 2025; Accepted Sep 14, 2025
Keywords: Intrusion Detection System, Deep Learning, CNN-BiLSTM, Attention Mechanism, Network Security

ABSTRACT
In modern network environments, securing systems against newly emerging attacks is essential, and an Intrusion Detection System (IDS) is a constructive way to do so. Traditional IDS methods, such as signature-based detection or standalone machine learning models, often fail to detect attacks that are not in their list of predefined patterns, because they are neither adaptable nor designed for this type of attack. Current IDS that employ deep learning architectures have improved detection capabilities; however, most prior systems are limited by partial feature learning, capturing either the spatial or the temporal structure of traffic, but not both.
Meanwhile, the lack of context-aware mechanisms, such as attention, limits their ability to attend to the most informative network components, leading to suboptimal detection performance and generalization. To counter this issue, we introduce CyberShieldDL, a deep learning-based IDS framework built on a novel hybrid architecture, IntruNet-Hybrid, which combines Convolutional Neural Networks (CNN) for spatial pattern extraction, Bidirectional Long Short-Term Memory (Bi-LSTM) networks for sequential feature extraction, and an attention mechanism that dynamically learns the salient features for intrusion detection. To support the framework, an optimized preprocessing and feature selection pipeline prepares the model input effectively and cost-efficiently. Extensive experiments on the CICIDS2017 dataset demonstrate that CyberShieldDL consistently outperforms the state of the art, achieving an overall accuracy of 98.35% and high precision, recall, and F1-score in various attack scenarios. Cross-dataset validations on NSL-KDD and UNSW-NB15 also verify the system's generalizability. The design provides a scalable and flexible solution for real-world network security, offering the adaptability necessary to enhance classification accuracy and robustness against evolving attacks. Its modular construction makes it easy to extend for real-time deployment and future adversarial robustness.

Corresponding Author: Dr. Venkatramulu, Associate Professor, Department of CSE (Networks), Kakatiya Institute of Technology and Science, Warangal, Telangana, India. Email: svr.cse@kitsw.
Journal homepage: http://section.com/index.php/IJEEI/index

1. INTRODUCTION
Modern society increasingly relies on personalized networks, and the sharp rise in cybersecurity threats has driven the adoption of Intrusion Detection Systems (IDS) for protecting network infrastructures.
Existing IDS techniques that depend on signature matching struggle to identify new and emerging threats, while those based on machine learning tend to exhibit poor adaptability. In recent years, with the proliferation of new AI technologies, intelligent IDS equipped with deep learning have been used to learn sophisticated patterns in network traffic data. However, most existing solutions rely on a single deep learning architecture, such as Convolutional Neural Networks (CNN) or Long Short-Term Memory (LSTM) networks, and are therefore either weak at modelling temporal dependencies or ignore spatial feature interactions. Additionally, some research works overlook attention mechanisms, which can help a model focus on the attack patterns of greatest interest. Recent solutions to these problems have achieved reasonable effectiveness on benchmark datasets; however, robustness across datasets and the reduction of false alarms remain challenges. These drawbacks call for a hybrid and adaptive IDS model that integrates the benefits of spatial and sequential features and adaptively focuses on the most informative flow features.

In this paper, we present CyberShieldDL, a hybrid CNN-BiLSTM-Attention model for robust intrusion detection. The primary goal of this study is to propose an innovative, flexible, and transferable IDS framework that can effectively monitor various cyberattacks on different networks. The key contributions of this work stem from the hybrid deep learning architecture, feature relevance-driven learning (utilizing an attention mechanism), and a feature selection pipeline tailored to network traffic analysis. Unlike traditional models, the proposed method models both local and sequential dependencies and assigns different importance to different parts of the input, which can significantly enhance detection performance.
This integrated approach enables the system to adapt to ever-changing attack models and to generalize across various datasets. In contrast to the state-of-the-art CNN-BiLSTM-based models available in the literature for the same type of data, CyberShieldDL offers (i) a refined feature selection pipeline, (ii) comprehensive cross-dataset validation, (iii) attention-based interpretability analysis, and (iv) statistically robust reporting of results.

The main contributions of this work are the design of the IntruNet-Hybrid model; thorough experiments on the CIC-IDS2017 dataset, complemented by cross-dataset validation (using the NSL-KDD and UNSW-NB15 datasets); an ablation study that justifies the effectiveness of each model component; and a comparative study that demonstrates superior performance over state-of-the-art methods. The work also introduces an optimized preprocessing and feature selection pipeline designed explicitly for network intrusion data, which is used to keep model training scalable.

The remainder of the paper is structured as follows: Section 2 reviews related work on deep learning-based IDS. Section 3 presents the proposed methodology, detailing the CyberShieldDL framework. Section 4 discusses the experimental results and performance evaluations. Section 5 presents a comprehensive discussion and highlights the study's limitations. Finally, Section 6 concludes the paper and outlines future research directions.

2. RELATED WORK
Recent advancements in deep learning have significantly enhanced intrusion detection systems, enabling adaptive, hybrid, and context-aware cybersecurity solutions across domains. Halbouni et al. proposed a CNN-LSTM-based intrusion detection model that leverages the spatial and temporal feature extraction capabilities of deep learning to improve intrusion detection across diverse datasets. Hnamte and Hussain
developed the DCNNBiLSTM model, combining convolutional and recurrent layers to address evolving network vulnerabilities using real-time traffic datasets. Lansky et al. provided a systematic review of deep learning-based IDS frameworks, categorizing them by architecture and highlighting their applications in modern cybersecurity contexts. Gueriani et al. presented a CNN-LSTM hybrid IDS tailored for IoT environments, utilizing recent IoT-specific datasets to enhance the detection of malicious activities. Hnamte et al. introduced a two-stage deep learning model that integrates LSTM and autoencoders for the robust detection of complex attack patterns in network traffic. Altunay and Albayrak focused on industrial IoT security by applying CNN, LSTM, and a hybrid of the two to detect intrusions in industrial networks. Olanrewaju-George and Pranggono explored federated learning combined with supervised and unsupervised deep learning models to enhance privacy-preserving intrusion detection in IoT networks. Berguiga et al. proposed HIDS-IoMT, a hybrid CNN-LSTM IDS designed for medical IoT environments using fog computing. Al-Shurbaji et al. conducted a review of deep learning-based IDS approaches targeting IoT botnet attacks. Berguiga et al. developed HIDS-RPL, a CNN-LSTM-based IDS for secure routing in IoMT networks.

IJEEI, Vol. No. September 2025: 645-667

Fares et al. proposed a hybrid intrusion detection model that combines Swin Transformers and LSTM networks, utilizing transfer learning to address data scarcity in IoT environments. Aboalela et al. presented an approach for detecting DDoS attacks in IoT using optimized feature pruning and a hybrid deep learning model that integrates temporal convolutional networks and recurrent units. Al Mazroa et al.
introduced a cyberattack detection method for cyber-physical systems (CPS) that utilizes binary metaheuristics and deep learning, with a focus on optimizing feature selection and classification. Dayarathne et al. investigated cyber risk mitigation in smart cyber-physical power systems by integrating deep learning models with hybrid security frameworks to detect anomalies in renewable energy-based grids. Hariharan et al. proposed a hybrid deep learning model that combines Seq2Seq architectures and ConvLSTM subnets, aiming to improve spatial and temporal dependency learning for network intrusion detection. Alhayan et al. presented a deep learning-based IDS for cloud computing environments, integrating CNN-BiLSTM models with spotted hyena optimization for improved cloud security. Manivannan and Senthilkumar developed the ARNN-FOX model, an adaptive recurrent neural network with a fox optimizer, enhancing intrusion detection through dynamic hyperparameter tuning. Khan et al. designed an adaptive hybrid framework combining artificial neural networks and genetic algorithms for intrusion detection in Industrial IoT. Sun and Wang proposed an image-based neural network classification approach, transforming network traffic into grid structures to leverage CNN-based intrusion detection. Duraibi and Alashjaee introduced the IMFOHDL-ID approach, combining dimensionality reduction techniques with hybrid deep learning models for IoT cyberattack detection. Huma et al. proposed a hybrid deep random neural network model to enhance cyberattack detection in Industrial IoT environments, addressing diverse threat patterns. Ozkan-Okay et al. introduced SABADT, a hybrid IDS combining anomaly and signature-based techniques for effective cyberattack detection in wireless local area networks. Ho et al. developed a convolutional neural network-based intrusion detection model that can identify both known and novel cyberattacks using advanced feature representations.
Elsaeidy et al. presented a hybrid deep learning framework combining CNN and advanced regularization methods to detect replay and DDoS attacks in smart city environments. ElSayed et al. proposed a CNN-based intrusion detection model for software-defined networks, incorporating a new regularization technique to improve feature generalization. Otoum and Nayak developed AS-IDS, a hybrid anomaly and signature-based IDS tailored for IoT networks, addressing evolving attack vectors. Rajesh Kanna and Santhi introduced a unified deep learning framework leveraging integrated spatial-temporal features for efficient intrusion detection. Jin et al. proposed a signature-based IDS for in-vehicle CAN bus networks, focusing on automotive cybersecurity. Applebaum et al. reviewed signature-based and machine learning-based Web Application Firewalls, outlining trends in web security intrusion detection. Shahriar et al. presented CANShield, a deep learning-based intrusion detection framework that targets Controller Area Networks at the signal level, thereby enhancing the security of vehicular networks. Yu et al. developed a real-time IDS framework designed to adapt to dynamic network environments using flexible and robust deep learning models. Ben Said et al. proposed CNN-BiLSTM, a hybrid IDS approach for software-defined networking that uses hybrid feature selection to enhance detection. Kasongo presented a deep learning framework utilizing recurrent neural networks, with a focus on adaptive learning for intrusion detection in dynamic network environments. Du et al. designed NIDS-CNNLSTM, which integrates CNN and LSTM layers for the efficient classification of network intrusions. Jerusha et al. proposed a semantic-driven meta-learning model for detecting rare cyberattacks, combining deep learning with semantic analysis techniques. Kocher and Kumar
provided a review of machine learning and deep learning advancements in intrusion detection systems, outlining key developments and challenges in cybersecurity. Fazil et al. introduced DeepSBD, a deep neural network with an attention mechanism for detecting social bot activities in online platforms. Sun et al. developed an anomaly detection method for in-vehicle networks using CNN-LSTM models enhanced with attention mechanisms. Yin et al. applied a Transformer-based model for long-term prediction of network security situations, addressing evolving threat landscapes. Dao et al. presented an attention-enhanced CNN-VAE model for image-based malware classification, improving malware detection in cybersecurity applications.

Hybrid deep learning architectures for intrusion detection have recently been highlighted by Udurume et al., who, when comparing CNN-BiLSTM with conventional ML methods, also showed the advantages of temporal-spatial deep modeling. Several further hybrid deep learning-based IDS techniques have been published recently. Xu et al. proposed a hierarchical hybrid model with attention that, unfortunately, was evaluated only on a small scale, limiting its generalizability. Hassan et al. achieved scalability with big data but lacked interpretability and feature optimization. Qazi et al. proposed HDLNIDS, which fuses multiple deep architectures, but did not validate it on different datasets. Likewise, Aldallal created a CNN-RNN hybrid model that is efficient but lacks explainability. Mayuranathan et al. proposed a hybrid DL method for cloud-based IDS, but it achieved only fair generalization from one domain to another. Sajid et al. combined ML and DL for multi-class IDS but lacked feature selection mechanisms. Finally, Sharma et al.
integrated ML-DL models with evolutionary algorithms, but their outputs were less interpretable. Together, these studies indicate the need for an IDS framework that combines optimized feature selection, interpretability, and multi-dataset validation, which is what the proposed CyberShieldDL aims to provide.

Table 1. Literature Review Summary of Deep Learning-Based IDS Highlighting Methods, Contributions, and Identified Research Gaps

| Authors (Ref.) | Method/Technique | Key Contribution | Research Gap |
|---|---|---|---|
| Halbouni et al. | CNN-LSTM Hybrid | Proposed a hybrid IDS combining spatial and temporal feature extraction for improved network security. | Limited focus on cross-dataset generalization in real-world deployments. |
| Hnamte and Hussain | DCNNBiLSTM | Developed an efficient deep learning IDS leveraging convolutional and recurrent architectures. | Scalability to heterogeneous IoT ... |
| Lansky et al. | Systematic Review | Reviewed deep learning architectures for IDS, performance and application trends. | Lack ... across diverse attack types. |
| Gueriani et al. | CNN-LSTM IDS | Addressed IoT security using a CNN-LSTM model trained on IoT-specific datasets. | Limited explainability of the IDS decision-making process. |
| Altunay and Albayrak | Hybrid CNN-LSTM | Applied a hybrid IDS approach to industrial IoT networks, enhancing cyberattack detection. | Energy ... deployment at the network edge are unaddressed. |
| Olanrewaju-George and Pranggono | Federated Learning with Supervised and Unsupervised Models | Introduced privacy-preserving federated IDS for IoT using combined learning paradigms. | Federated model drift and adaptive attack resilience are ... |
| Berguiga et al. | CNN-LSTM Hybrid for IoMT | Proposed HIDS-IoMT, a hybrid IDS for medical IoT leveraging fog computing for deployment. | Real-time inference latency ... |
| Al-Shurbaji et al. | Literature Review | Reviewed DL-based IDS frameworks for detecting IoT botnet attacks. | Lack of practical deployment validation in large-scale IoT ... |
| Xu et al. | Hierarchical Hybrid Attention | Improved detection using a hierarchical attention-based model. | Evaluated only on a limited dataset, no generalization or ... |
| Hassan et al. | Hybrid DL for Big Data | Scalable design for large network traffic analysis. | Lacks explainability and feature optimization. |
| Qazi et al. | HDLNIDS Hybrid DL | Robust architecture combining multiple deep networks. | No cross-dataset validation. |
| Aldallal | CNN-RNN Hybrid | Enhanced efficiency with CNN-RNN ... | No interpretability of decisions. |
| Mayuranathan et al. | Cloud-Specific Hybrid DL | Optimized for cloud intrusion ... | Limited applicability to other ... |
| Sajid et al. | ML-DL Hybrid IDS | Combined ML and DL for multi-class intrusion detection. | No feature selection, low ... |
| Sharma et al. | Evolutionary ML-DL IDS | Adaptive framework integrating evolutionary learning. | Lacks interpretability and statistical rigor. |

Table 1 summarizes selected deep learning-based IDS studies, outlining their methodologies, key contributions to cybersecurity, and the corresponding research gaps. The reviewed articles present diverse deep learning frameworks, including CNN, LSTM, attention mechanisms, and hybrid models, applied to network, IoT, industrial, and vehicular security. Key trends include federated learning, anomaly-signature integration, and semantic-driven detection. These studies highlight evolving attack patterns, emphasizing the need for adaptive, explainable, and context-aware IDS architectures.

3. MATERIALS AND METHODS
This section presents the detailed methodology of the proposed CyberShieldDL framework, outlining its architectural design and implementation workflow. The methodology integrates data preprocessing, feature selection, and a hybrid deep learning model, IntruNet-Hybrid, comprising CNN,
Bi-LSTM, and attention layers. These components collaboratively enable efficient spatial-temporal feature learning for accurate intrusion detection across diverse network environments and attack scenarios.

3.1 Overview of CyberShieldDL System
The CyberShieldDL system is a deep learning-based intrusion detection framework that aims to enable intelligent network traffic analysis, leading to real-time threat categorization and improved cybersecurity. The system follows a modular, sequential processing pipeline comprising the collection of raw network traffic datasets, data preprocessing, feature selection, deep learning-based classification using the IntruNet-Hybrid model, threat labeling, and alert issuance. The primary aim of the proposed system is to achieve robust and efficient intrusion detection by leveraging a hybrid deep learning architecture.

[Figure 1 blocks: Input Network Traffic Data; Data Preprocessing; Feature Engineering & Selection; Deep Learning Engine: IntruNet-Hybrid (1D-CNN Layer, Bi-LSTM Layer, Attention Layer, Output Layers); Intrusion Classification; Result Management; Alert Generation & Log Storage]
Figure 1. CyberShieldDL System Architecture Illustrating the End-to-End Workflow with Hybrid Deep Learning-Based Intrusion Detection

At the heart of the system is the IntruNet-Hybrid model, which is trained on features extracted from preprocessed flow data. These features are obtained once the raw traffic attributes have been standardized and encoded. Let $X = \{x_1, x_2, \ldots, x_n\}$ denote the set of input vectors, where each $x_i \in \mathbb{R}^d$ is a $d$-dimensional feature vector. First, a one-dimensional convolutional layer is applied to capture the local spatial patterns in the flow features. The CNN layer applies filters $W$ to the input to generate feature maps $F_{cnn}$:
\[ F_{cnn} = \mathrm{ReLU}(W * X + b) \]
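To make the convolution step concrete, the following is a minimal sketch in plain Python of $F_{cnn} = \mathrm{ReLU}(W * X + b)$ for a single filter. It assumes a sliding window with stride 1 and no padding, computed as a cross-correlation (the convention deep learning frameworks use); the function name `conv1d_relu`, the kernel values, and the toy feature vector are illustrative assumptions, not values from the paper.

```python
def conv1d_relu(x, w, b=0.0):
    """Slide filter w over feature vector x (stride 1, no padding), then apply ReLU."""
    out = []
    for start in range(len(x) - len(w) + 1):
        window = x[start:start + len(w)]
        z = sum(wi * xi for wi, xi in zip(w, window)) + b  # weighted sum plus bias
        out.append(max(0.0, z))                            # ReLU keeps positive responses only
    return out


# Hypothetical normalized flow features and a hand-picked difference filter.
features = [0.5, -1.0, 2.0, 0.0, 1.5]
kernel = [1.0, 0.0, -1.0]
feature_map = conv1d_relu(features, kernel, b=0.1)
print(feature_map)
```

A real layer would learn many such filters jointly; the point here is only that each output element summarizes a small neighbourhood of adjacent features.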
The feature maps $F_{cnn}$ are passed to a bidirectional LSTM layer that processes the sequence in both directions to capture long-range relationships in network behavior. If $h_t$ represents the hidden state at time $t$, then the output from the Bi-LSTM is:
\[ H_{bilstm} = [\overrightarrow{h}_t ; \overleftarrow{h}_t] \]
To concentrate on the time steps that matter most for anomaly detection, we adopt an attention mechanism. The attention layer calculates weights $\alpha_t$ over all hidden states $h_t$, and the context vector $C$ is their weighted sum:
\[ C = \sum_{t=1}^{T} \alpha_t h_t \]
The context vector is fed into fully connected layers with a softmax function to generate prediction probabilities for each class. The output layer produces the prediction $\hat{y} \in \mathbb{R}^k$, where $k$ is the number of intrusion classes.

CyberShieldDL is designed for network environments that require real-time integration. It supports both batch- and stream-based inference and is modular, facilitating deployment in software-defined networking (SDN) or at the edge. The system flow, from receiving the data to intrusion classification and finally storing the alert, is illustrated in Figure 1. Table 2 summarizes the mathematical notations and variables used throughout the methodology, clarifying data representations and model operations.

Table 2.
Summary of Mathematical Notations and Symbols Used in the Methodology

| Symbol | Description |
|---|---|
| $x_i$ | Raw input feature vector for instance $i$ |
| $x_i'$ | Normalized feature vector after preprocessing |
| $x_i^*$ | The final selected feature vector after feature selection |
| $y_i$ | Ground truth label for instance $i$ |
| $\hat{y}_i$ | Predicted probability distribution over classes for instance $i$ |
| $d$ | Original number of features in the dataset |
| $d'$ | Reduced number of features after selection |
| $M$ | Number of training instances |
| $k$ | Total number of intrusion classes |
| $W_c$, $b_c$ | Weights and bias of the CNN layer |
| $F_{cnn}$ | Output feature map from CNN |
| $h_t$ | Hidden state of the Bi-LSTM at time step $t$ |
| $\overrightarrow{h}_t$, $\overleftarrow{h}_t$ | Forward and backward hidden states in Bi-LSTM |
| $\alpha_t$ | Attention weight assigned to hidden state $h_t$ |
| $C$ | Context vector from the attention mechanism |
| $W_o$, $b_o$ | Weights and bias of the output layer |
| $\mathcal{L}$ | Categorical cross-entropy loss |
| $\eta$ | Learning rate for optimizer |
| $p$ | Dropout probability |
| $\theta$ | Set of hyperparameters |
| $\theta^*$ | Optimized hyperparameter configuration |
| $S(\mathcal{M}, F')$ | Scoring function used in feature subset evaluation |
| $\mathcal{P}(\theta)$ | Performance metric used in hyperparameter tuning |

3.2 Dataset Acquisition and Preprocessing
CyberShieldDL is trained and evaluated on benchmark intrusion detection datasets that cover multiple cyberattack types and network traffic characteristics. This work relies primarily on the CIC-IDS2017 dataset, as it encompasses a wide range of benign traffic in addition to various attacks, including DoS, DDoS, infiltration, brute-force, and botnet attacks. Furthermore, we conduct additional validation using the NSL-KDD and UNSW-NB15 datasets to investigate whether the method is domain-independent. Each dataset consists of labeled network flow records, where each sample is described by a connection vector comprising several numerical and categorical attributes. Let $D = \{(x_i, y_i)\}_{i=1}^{M}$ be the raw dataset, where $x_i \in \mathbb{R}^d$ is a feature vector and $y_i \in \{1, 2, \ldots, k\}$
is the class label representing one of the $k$ intrusion types. The dataset is preprocessed through a series of operations that make it compatible with the deep learning architecture and speed up training.

The first step is to impute missing values using statistical imputation. Categorical variables, such as protocol type or service, are transformed into numerical ones using one-hot encoding. All features are then brought to a common scale by z-score normalization, defined as:
\[ x_i^{norm} = \frac{x_i - \mu}{\sigma} \]
where $\mu$ and $\sigma$ denote the mean and standard deviation of the corresponding feature across the dataset. This prevents gradient descent from being dominated by features with large magnitudes.

To alleviate the class imbalance typical of real-world intrusion datasets, the synthetic minority over-sampling technique (SMOTE) is optionally employed. It creates artificial instances of the minority classes, thereby helping the classifier learn their patterns. The resulting dataset is divided into a training set and a test set using a stratified split that preserves the distribution of each class. For time-based datasets such as CIC-IDS2017, we ensure that temporal consistency is maintained to prevent data leakage. Following preprocessing, the preprocessed dataset $D' = \{(x_i', y_i)\}_{i=1}^{M}$ is passed to feature selection and model building. This ensures that IntruNet-Hybrid learns from balanced, normalized, and clean data.

3.3 Feature Selection and Extraction
To enhance learning performance and reduce computational overhead, CyberShieldDL uses a structured, hybrid feature selection method. Intrusion detection datasets contain many features, some of which are irrelevant, redundant, or correlated.
Thus, a compact and discriminative set of features is of great importance to the performance and generalization ability of the learned model. We denote the preprocessed dataset by $D' = \{(x_i', y_i)\}_{i=1}^{M}$, where each $x_i' \in \mathbb{R}^d$ is a normalized feature vector and $y_i \in \{1, \ldots, k\}$ is the corresponding class label. The goal of this stage is to reduce $x_i'$ to a more compact form $x_i^* \in \mathbb{R}^{d'}$, where $d' < d$, by selecting only the most relevant features.

Filter-based methods are the first step in this process. All features $f_j$ are ranked according to their Information Gain (IG), which measures the reduction in uncertainty about the target class label once the feature is known:
\[ IG(f_j) = H(Y) - H(Y \mid f_j) \]
where $H(Y)$ is the entropy of the class distribution and $H(Y \mid f_j)$ is the conditional entropy given feature $f_j$. Features with larger IG scores contribute more information for predicting $y_i$ and are ranked higher. Furthermore, a Chi-Squared test is performed to assess the statistical dependence between each feature and the target variable:
\[ \chi^2(f_j) = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i} \]
where $O_i$ and $E_i$ are the observed and expected frequencies of the feature for class $i$. Features with higher $\chi^2$ values are considered more important.

The next step is a wrapper method, Recursive Feature Elimination (RFE), which successively prunes the feature set. A base model $\mathcal{M}$ is trained iteratively on feature subsets, dropping the least significant features at each step and retaining the subset $F^* \subseteq F$ that optimizes model performance:
\[ F^* = \arg\max_{F' \subseteq F} S(\mathcal{M}, F') \]
where $S(\mathcal{M}, F')$ is the score of model $\mathcal{M}$ on feature subset $F'$. Once the best subset $F^*$ is found, each input instance is encoded as:
\[ x_i^* = \{\, x_{ij} \mid f_j \in F^* \,\} \]
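The two filter scores above can be illustrated with a minimal sketch in plain Python. It assumes discrete (already binned) feature values, computes the chi-squared statistic over the full feature-value by class contingency table (the usual Pearson form, which may differ in detail from the authors' implementation), and uses hypothetical feature names and toy data; the RFE wrapper stage is not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """IG(f) = H(Y) - H(Y | f) for one discrete feature column."""
    n = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [y for f, y in zip(feature, labels) if f == value]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional


def chi_squared(feature, labels):
    """Pearson chi-squared statistic over the feature-value x class table."""
    n = len(labels)
    feature_counts, class_counts = Counter(feature), Counter(labels)
    observed = Counter(zip(feature, labels))
    stat = 0.0
    for f_value, f_count in feature_counts.items():
        for y_value, y_count in class_counts.items():
            expected = f_count * y_count / n
            stat += (observed[(f_value, y_value)] - expected) ** 2 / expected
    return stat


# Hypothetical toy data: four flows and two candidate features.
labels = ["attack", "attack", "benign", "benign"]
columns = {
    "syn_flag": [1, 1, 0, 0],   # perfectly separates the two classes
    "even_port": [0, 1, 0, 1],  # carries no class information
}
ig_scores = {name: information_gain(col, labels) for name, col in columns.items()}
chi_scores = {name: chi_squared(col, labels) for name, col in columns.items()}
ranking = sorted(columns, key=lambda name: ig_scores[name], reverse=True)
print(ranking, ig_scores, chi_scores)
```

Both scores agree on this toy example: the feature that separates the classes ranks first, and the uninformative one scores zero. On real flow data, continuous features would need to be discretized before these formulas apply.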
This extracted feature vector $x_i^*$ is then fed into the IntruNet-Hybrid model. Unlike dimensionality reduction techniques such as PCA, which replace the original features with linear combinations, this approach keeps the original features and therefore maintains semantic interpretability: the chosen features are not only concise but also interpretable in a cybersecurity context. This hybrid selection process balances accuracy, generalization, and computational efficiency, making it appropriate for real-time intrusion detection.

3.4 IntruNet-Hybrid Model Architecture
The proposed IntruNet-Hybrid model shown in Figure 2 is the backbone of the CyberShieldDL framework and is implemented as a hybrid deep learning architecture that combines a 1D CNN, Bi-LSTM networks, and an attention mechanism. This hybrid structure takes advantage of the merits of all three components, so that it can perform effective intrusion detection by extracting spatial patterns, sequential dependencies, and context-aware feature weighting from network traffic data.

[Figure 2 blocks: Input Feature Vector; 1D-CNN Layer (1D Convolutional Layer, MaxPooling Layer); Bidirectional LSTM Layer; Attention Mechanism; Fully Connected Layer; Softmax Output Layer]
Figure 2. IntruNet-Hybrid Model with CNN, Bi-LSTM, and Attention for Intrusion Classification

The input layer receives $x_i^* \in \mathbb{R}^{d'}$ (the selected feature vector of each traffic instance). The first part of the model is a 1D-CNN layer, which extracts local spatial patterns over the features. Let $W_c$ be the convolution filter and $b_c$ be the bias. The CNN transformation results in a feature map $F_{cnn}$ as follows:
\[ F_{cnn} = \mathrm{ReLU}(W_c * x_i^* + b_c) \]
Here, the ReLU activation adds non-linearity and aids in learning complex representations. Several filters are employed to learn different feature patterns, followed by max-pooling (if any)
to downsample the feature maps while preserving the most useful information. The output of the CNN layer is fed into a Bi-LSTM layer, which extracts both forward and backward temporal dependencies in the processed feature sequence. For a sequence of CNN-processed vectors $\{f_1, f_2, \ldots, f_T\}$, the Bi-LSTM produces forward hidden states $\overrightarrow{h}_t$ and backward hidden states $\overleftarrow{h}_t$. These are concatenated to obtain the representation at time $t$:
\[ h_t = [\overrightarrow{h}_t ; \overleftarrow{h}_t] \]
An attention mechanism is further introduced to make the model more interpretable and to sharpen its focus. This mechanism assigns a relevance score $\alpha_t$ to each hidden state $h_t$ according to its role in the classification. The context vector $C$ is then calculated as a sum of hidden states weighted by attention:
\[ C = \sum_{t=1}^{T} \alpha_t h_t \]
This attention-based aggregation enables the model to focus on the most informative temporal patterns, which is crucial for subtle or slowly changing attack signatures.

The context vector $C$ is then propagated through fully connected layers to a softmax output layer, which yields a probability distribution over the $k$ classes:
\[ \hat{y} = \mathrm{softmax}(W_o C + b_o) \]
where $W_o$ and $b_o$ are the output layer parameters. The predicted class is either an attack type or normal traffic. The complete IntruNet-Hybrid pipeline fuses spatial, temporal, and context-aware information, which supports robust detection of known and unknown cyber threats. This architecture is depicted in Figure 2, which shows the flow from the input feature vector through the convolutional, recurrent, and attention layers to the final intrusion classification. With this hybrid scheme, the model is designed to achieve high accuracy, to be robust to noise, and to generalize to different network conditions.
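The attention pooling and softmax output steps can be sketched in plain Python as follows. The paper does not state how the relevance scores are computed, so this sketch assumes a simple dot-product scoring vector `v` (a hypothetical learned parameter) followed by a softmax over time steps; the hidden states, `W_o`, and `b_o` below are toy values chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, v):
    """Return attention weights alpha_t and context C = sum_t alpha_t * h_t."""
    scores = [sum(vi * hi for vi, hi in zip(v, h)) for h in hidden_states]  # assumed scoring
    alphas = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(a * h[d] for a, h in zip(alphas, hidden_states)) for d in range(dim)]
    return alphas, context


def classify(context, W_o, b_o):
    """y_hat = softmax(W_o C + b_o)."""
    logits = [sum(w * c for w, c in zip(row, context)) + b for row, b in zip(W_o, b_o)]
    return softmax(logits)


# Toy Bi-LSTM outputs for T = 3 time steps with 2-dimensional hidden states.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [0.5, 0.5]
alphas, context = attention_pool(hidden_states, v)

# Toy output layer for k = 2 classes.
W_o = [[1.0, 0.0], [0.0, 1.0]]
b_o = [0.0, 0.0]
probs = classify(context, W_o, b_o)
print(alphas, context, probs)
```

The third time step receives the largest weight because its score is highest, which is the behaviour the context vector is meant to capture: time steps judged more relevant contribute more to $C$.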
5 Model Training and Hyperparameter Optimization

The IntruNet-Hybrid architecture in CyberShieldDL is trained with supervised learning: every input feature vector $x_i'$ is paired with a ground-truth label $y_i \in \{1, 2, \ldots, k\}$. Training learns the weights by minimizing the divergence between the predicted probability distribution $\hat{y}_i$ and the true class $y_i$. To do this, the model employs the categorical cross-entropy loss, which is well suited to multi-class classification:

$$L = -\sum_{i=1}^{M} \sum_{j=1}^{k} y_{ij} \log(\hat{y}_{ij})$$

where $M$ is the number of training examples, $k$ is the number of classes, $y_{ij}$ is a binary indicator (0 if class $j$ is not the correct class for example $i$, otherwise 1), and $\hat{y}_{ij}$ is the predicted probability of class $j$.

The model parameters are optimized with the Adam optimizer, which adapts an individual learning rate for each parameter from exponentially decaying averages of past gradients and past squared gradients. Adam is chosen for its efficiency and stability when training deep models on large and imbalanced datasets. The initial learning rate is set empirically to $\eta = 0.001$, and an exponential decay schedule can optionally be applied to limit overfitting and ease minor convergence problems.

Several regularization techniques are used to avoid overfitting and improve generalization. Dropout layers are added after the CNN and Bi-LSTM layers to randomly drop unit activations during training. Let $p$ be the dropout probability; the activation $h$ is then transformed during training as:

$$h_{drop} = h \odot r, \quad r \sim \mathrm{Bernoulli}(1 - p)$$

In addition, batch normalization is applied to normalize the intermediate activations, which can speed up convergence and stabilize training. The model is trained for $E$ epochs with a mini-batch size $B$, and the training set is shuffled after each epoch to decrease bias and increase robustness.
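To make the loss and the dropout mask concrete, here is a minimal plain-Python sketch of both formulas. The function names, the `eps` clipping constant, and the sample probabilities are illustrative assumptions; the dropout function follows the formula as written in the text (a Bernoulli mask without rescaling), whereas the Keras `Dropout` layer additionally rescales the kept activations by $1/(1-p)$ during training.

```python
import math
import random

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """L = -sum_i sum_j y_ij * log(y_hat_ij), summed over the batch.

    y_true: one-hot rows; y_pred: rows of predicted class probabilities.
    eps clips probabilities away from zero (an assumed safeguard).
    """
    loss = 0.0
    for truth_row, pred_row in zip(y_true, y_pred):
        for y_ij, p_ij in zip(truth_row, pred_row):
            if y_ij:
                loss -= y_ij * math.log(max(p_ij, eps))
    return loss

def dropout(h, p, rng):
    """h_drop = h * r element-wise, with r ~ Bernoulli(1 - p)."""
    return [x if rng.random() >= p else 0.0 for x in h]

# Two examples, three classes (hypothetical probabilities).
y_true = [[1, 0, 0], [0, 1, 0]]
y_pred = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]
loss = categorical_cross_entropy(y_true, y_pred)  # -(ln 0.7 + ln 0.8)

rng = random.Random(0)
kept_all = dropout([1.0, 2.0], p=0.0, rng=rng)     # nothing is dropped
dropped_all = dropout([1.0, 2.0], p=1.0, rng=rng)  # everything is dropped
```

Only the probability assigned to the correct class contributes to the loss, which is why confident correct predictions drive it toward zero.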
Training is monitored on a separate validation set, with early stopping that halts training when the validation loss has not improved for a given number of epochs $P$. To improve model performance, we tune hyperparameters including the number of CNN filters, LSTM units, attention dimensions, learning rate, and dropout probability using a hyperparameter search (either grid search or Bayesian optimization). Given the hyperparameter search space $\Theta$, the best configuration $\vartheta^{*}$ is determined by maximizing a validation performance metric $\mathcal{M}(\vartheta)$, such as the F1-score:

$$\vartheta^{*} = \arg\max_{\vartheta \in \Theta} \mathcal{M}(\vartheta)$$

This systematic training and optimization process helps the IntruNet-Hybrid model remain accurate and generalizable across datasets and networks.

6 Algorithmic Implementation

This section describes the implementation of the CyberShieldDL framework and presents the algorithmic steps for the building blocks of the IntruNet-Hybrid model. It details the process flow, which includes data entry, feature extraction, construction of the hybrid model, and final intrusion detection. The algorithm captures the processing flow and makes transparent which system components contribute to intrusion detection.

Algorithm 1: Training Procedure for IntruNet-Hybrid Model
Input: Preprocessed dataset $D = \{(x_i', y_i)\}_{i=1}^{M}$, learning rate $\eta$, epochs $E$, batch size $B$, dropout rate $p$
Output: Trained model parameters $\theta$
Initialize model parameters $\theta$
For epoch = 1 to $E$:
    Shuffle training dataset
    Divide data into mini-batches of size $B$
    For each mini-batch $\{(x_b, y_b)\}$:
        Forward propagate $x_b$ through CNN → Bi-LSTM → Attention → Dense layers
        Compute prediction $\hat{y}_b$ using softmax
        Calculate loss $L$ using the categorical cross-entropy
        Apply dropout with probability $p$
        Backpropagate gradients
        Update parameters $\theta \leftarrow \theta - \eta \cdot \nabla_{\theta} L$
    End For
    Evaluate validation performance
    If early stopping criteria met, break
End For
Return $\theta$

Algorithm 1 presents the supervised training of the IntruNet-Hybrid model in the CyberShieldDL framework. It starts by initializing the model parameters (the weights and biases of the CNN, the Bi-LSTM, and the attention and dense layers). The input is the preprocessed and feature-selected dataset $D$. Training iterates over mini-batches, and the training set is reshuffled in each epoch to obtain the stochastic gradient effect and improve convergence.

For every mini-batch, the input feature vectors are fed through the IntruNet-Hybrid pipeline. First, 1D-CNN layers are applied to derive spatial patterns. A bidirectional LSTM then processes the output features to learn forward and backward temporal relationships. Finally, the output sequences are passed to an attention layer, which computes weighted context vectors that focus on the most informative time steps for classification. The last dense layers output a softmax probability distribution over the pre-specified intrusion classes.

The categorical cross-entropy loss is calculated between the labels predicted by the model and the true labels. Dropout is used to limit overfitting by randomly deactivating neurons during training. Gradients of the loss function are computed with respect to the model parameters and then used to update the parameters with the Adam optimizer.

The model performance is assessed on an independent validation set after each epoch. If the validation loss does not decrease for a certain number of epochs, early stopping is triggered to end training and save computation. The output of the algorithm is $\theta$, which is deployed in the CyberShieldDL system for live inference.
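The control flow of Algorithm 1 (per-epoch shuffling, mini-batching, and patience-based early stopping) can be sketched independently of any deep learning library. In the sketch below the `update` and `validation_loss` callables are hypothetical stand-ins for the real forward/backward pass and validation step, and the simulated loss values are invented purely to exercise the stopping rule.

```python
import random

def minibatches(samples, batch_size, rng):
    """Shuffle a copy of the data and yield consecutive mini-batches."""
    shuffled = list(samples)
    rng.shuffle(shuffled)
    for start in range(0, len(shuffled), batch_size):
        yield shuffled[start:start + batch_size]

def train_with_early_stopping(samples, update, validation_loss,
                              max_epochs, batch_size, patience, seed=0):
    """Run the epoch loop; stop once the validation loss has failed to
    improve for `patience` consecutive epochs.

    Returns (best_epoch, best_loss, last_epoch_run).
    """
    rng = random.Random(seed)
    best_loss, best_epoch, stale_epochs = float("inf"), 0, 0
    last_epoch = 0
    for epoch in range(1, max_epochs + 1):
        last_epoch = epoch
        for batch in minibatches(samples, batch_size, rng):
            update(batch)  # stand-in for forward pass, loss, backprop, step
        loss = validation_loss(epoch)
        if loss < best_loss:
            best_loss, best_epoch, stale_epochs = loss, epoch, 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                break
    return best_epoch, best_loss, last_epoch

# Simulated validation losses replace a real model (hypothetical values).
simulated = [0.90, 0.70, 0.60, 0.65, 0.61, 0.62, 0.50]
seen_batches = []
result = train_with_early_stopping(
    samples=list(range(10)),
    update=seen_batches.append,
    validation_loss=lambda epoch: simulated[epoch - 1],
    max_epochs=len(simulated), batch_size=4, patience=3)
```

With a patience of 3, training stops after epoch 6 because epochs 4 to 6 never beat the loss reached at epoch 3, so the later improvement at epoch 7 is never seen. This is the usual trade-off of early stopping: a larger patience costs more computation but is less likely to stop before a late improvement.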
7 Evaluation Metrics and Validation Strategy

We conduct comprehensive evaluations of CyberShieldDL and its core IntruNet-Hybrid model using standard classification metrics. In addition to assessing the overall predictions of the classifier, these measures indicate how the model behaves under different kinds of intrusions with varying degrees of class skew.

The main metric is accuracy, which is the fraction of correct predictions over the total number of predictions. Let TP, TN, FP, and FN be the true positives, true negatives, false positives, and false negatives, respectively. Accuracy is defined as:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

However, accuracy alone can be misleading, especially on imbalanced data. Hence, precision, recall, and F1-score are also calculated to assess the quality of positive predictions and the system's capacity to identify real attacks correctly. These are defined as:

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN}$$

$$F1\text{-}\mathrm{Score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$

To assess how well the model distinguishes among multiple traffic types, we employ macro-averaging, in which the metrics are calculated independently for each class before being averaged. A one-vs-rest approach is then used to compute ROC-AUC for multi-class classification. The ROC-AUC summarizes the trade-off between the true positive rate and the false positive rate across threshold choices.

Finally, a confusion matrix is generated to display the distribution of correct and incorrect predictions across all classes. It shows which attack types are most susceptible to misclassification and therefore where improvements in feature representation or model architecture should be focused.
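As a small worked example of these definitions, the sketch below computes accuracy and the macro-averaged precision, recall, and F1-score in plain Python. The function name `macro_metrics`, the toy labels, and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration; in practice a library such as scikit-learn would normally be used.

```python
def macro_metrics(y_true, y_pred, labels):
    """Return (accuracy, (macro_precision, macro_recall, macro_f1), per_class).

    Each class is scored one-vs-rest, then the per-class scores are averaged
    with equal weight (macro-averaging).
    """
    pairs = list(zip(y_true, y_pred))
    per_class = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0  # assumed convention
        recall = tp / (tp + fn) if tp + fn else 0.0     # assumed convention
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class[label] = (precision, recall, f1)
    macro = tuple(sum(scores[i] for scores in per_class.values()) / len(labels)
                  for i in range(3))
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return accuracy, macro, per_class

# Toy predictions over three traffic classes (hypothetical data).
y_true = ["benign", "benign", "dos", "dos", "bot", "bot"]
y_pred = ["benign", "dos", "dos", "dos", "bot", "benign"]
accuracy, (macro_p, macro_r, macro_f1), per_class = macro_metrics(
    y_true, y_pred, labels=["benign", "dos", "bot"])
```

Here accuracy is 4/6, while the macro-averaged precision (13/18) and recall (2/3) weight the three classes equally regardless of how many samples each has, which is the property that makes macro-averaging informative on imbalanced intrusion data.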
Stratified 5-fold cross-validation was used to confirm the generalization of the IntruNet-Hybrid model. The dataset is divided into five stratified folds, each holding 20% of the samples and preserving the class distribution. In each iteration, four folds are used for training and one for testing, and the procedure is repeated five times. The final performance is averaged across all folds to control for variation in the data split. This multi-metric, cross-validated evaluation helps confirm that the proposed system performs reliably across various types of intrusion and network states.

For statistical rigor, all results are reported as mean and standard deviation across the five cross-validation folds. This captures both the central tendency and the spread of the performance metrics over different train-test splits, allowing a more solid assessment of CyberShieldDL and of its robustness across experimental setups.

EXPERIMENTAL RESULTS

In this section, we discuss the experimental results of CyberShieldDL on well-known network intrusion datasets. The experiments examine the detection ability, generalization capacity, and relative effectiveness of the model compared with other state-of-the-art methods. The effect of the IntruNet-Hybrid design and of the optimized feature selection pipeline is verified with several evaluation metrics and ablation studies.

1 Experimental Setup

The experiments were conducted on a workstation equipped with an Intel Core i9 CPU, 64 GB of RAM, and an NVIDIA RTX 3090 GPU, running Ubuntu 22.04 LTS. The CyberShieldDL system was implemented in Python 3 with TensorFlow 2.13 and Keras as the deep learning frameworks, and scikit-learn and pandas were used for data manipulation, preprocessing, and evaluation. To evaluate the effectiveness of our approach, the CIC-IDS2017 dataset
was chosen as the primary benchmark due to its comprehensive coverage of modern attack types and authentic network traffic. In addition, to test the model's generalization, further experiments were conducted on the NSL-KDD and UNSW-NB15 datasets. We process all datasets as described in Section 4.2, encoding categorical features and normalizing the numerical ones by Z-score standardization. The dataset was divided into training (. %), validation (. %), and test (. %) sets using a stratified splitting approach that maintains class distributions. To address the class imbalance problem, SMOTE is applied to the training set.

Grid search was used for hyperparameter tuning of the IntruNet-Hybrid model. The final settings were: learning rate 0.001, batch size 128, and dropout rate 0. Early stopping was set to a patience of 10 epochs on the validation loss. Performance comparisons were made with the baseline models (standalone CNN, Bi-LSTM only, and Random Forest classifiers) under the same experimental conditions.

2 Exploratory Data Analysis

This section presents an exploratory data analysis (EDA) of the CIC-IDS2017 dataset to understand its inherent structure, feature relations, and class distributions. Several visualizations are used to examine the class imbalance, protocol usage, feature correlations, and the distribution of attack and benign traffic, which helps explain what the models can learn from these features.

Figure 3. Exploratory Data Analysis of CIC-IDS2017 Dataset Showing (a) Class Distribution, (b) Protocol Type Usage, (c) Feature Correlations, and (d) Flow Duration Across Traffic Classes

Figure 3 presents the main results of the exploratory analysis of the CIC-IDS2017 dataset. Subfigure (a) illustrates that benign and attack traffic are class-imbalanced. Subfigure (b)
presents the protocol category distribution, with TCP flows outnumbering the others. Strong and weak correlations between selected features are illustrated in subfigure (c). Flow duration deviations among the traffic classes are compared in subfigure (d), which shows distinct patterns for different attacks.

Figure 4. Exploratory Visualizations of CIC-IDS2017 Dataset Showing (a) Packet Distribution Across Forward and Backward Flows by Traffic Type, and (b) Flow Bytes per Second Distribution Across Different Traffic Classes

Figure 4 illustrates additional exploratory insights from the CIC-IDS2017 dataset. Subfigure (a) shows the scatter distribution of total forward and backward packets, highlighting distinct clustering patterns across traffic types. Subfigure (b) presents the flow bytes per second distributions as a violin plot, revealing variability in data transfer rates between benign and attack classes and indicating potentially discriminative patterns.

3 Results On CIC-IDS2017 Dataset

The performance of the proposed CyberShieldDL system was first evaluated on the CIC-IDS2017 dataset. The IntruNet-Hybrid model was trained and tested on the preprocessed and feature-optimized dataset, with results benchmarked against baseline models including standalone CNN, Bi-LSTM, and Random Forest classifiers.

The IntruNet-Hybrid model demonstrated strong performance across all evaluation metrics. The hybrid design effectively captured spatial-temporal patterns in the network flows, resulting in accurate classification of both frequent and rare attack types.

Table 3. Classification Results of CyberShieldDL On CIC-IDS2017 Dataset (Values Reported as Mean ± Standard Deviation Across Five Cross-Validation Folds)
Class: BENIGN | DoS | PortScan | Bot | BruteForce | Infiltration | Web Attack | Macro Avg. | Overall Accuracy
Precision (%): 0 ± 0. | 8 ± 0. | 3 ± 0. | 2 ± 0. | 6 ± 0. | 8 ± 0. | 1 ± 0. | 4 ± 0.
Recall (%): 3 ± 0. | 2 ± 0. | 6 ± 0. | 4 ± 0. | 0 ± 0. | 0 ± 0. | 4 ± 0. | 0 ± 0.
F1-Score (%): 1 ± 0. | 5 ± 0. | 4 ± 0. | 8 ± 0. | 3 ± 0. | 4 ± 0. | 7 ± 0. | 2 ± 0.
Support (Samples): 25,000 | 8,500 | 5,200 | 3,600 | 4,700 | 2,100 | 50,000
Overall Accuracy: 35 ± 0.

Table 3 gives the class-wise detection results of CyberShieldDL on the CIC-IDS2017 dataset as mean ± standard deviation over five cross-validation folds. The data show consistently high precision, recall, and F1-scores across benign and attack classes, with an overall accuracy of 98.35 ± 0.42%, indicating balanced and robust performance of the intrusion detection system.

Figure 5. Class-Wise Performance Metrics of CyberShieldDL On CIC-IDS2017 Dataset with Standard Deviations

Figure 5 illustrates the class-wise performance of CyberShieldDL on the CIC-IDS2017 dataset as grouped bars of mean precision, recall, and F1-score, with error bars showing the standard deviation across the cross-validation folds (n = 5). The visualization indicates that the model is stable and robust across intrusion categories.

The results confirm that the model obtained perfect or near-perfect detection of benign traffic, with precision, recall, and F1-scores consistently at or above 99%, which means very few false alarms. For DoS and PortScan attacks, performance is also high, with F1-scores greater than 97%, confirming the ability of the hybrid CNN-BiLSTM-Attention design to handle common volumetric and probing attacks.

For the more difficult classes, such as Bot and BruteForce, the classification scores remain above 94%, showing that the model can find stealthier patterns even though these classes are the most variable in terms of flow behaviour.
Infiltration attacks are rare by nature and produce very subtle traffic signatures, so they are a challenging case. Here CyberShieldDL has slightly lower but still competitive F1-scores (standard deviation ± 0.6%), which shows that it works reasonably well on minority classes, especially once feature selection and SMOTE balancing are applied. Web Attacks retain detection performance above 95%, indicating well-balanced identification of application-layer attacks.

In general, the error bars are small across all classes, indicating that the results do not depend strongly on the training-testing splits and are consistent across folds. These results further substantiate that CyberShieldDL achieves high accuracy, balanced performance, and statistical reliability across the targeted attack classes.

The comparison with the baseline models is based on overall accuracy, precision, recall, and F1-score (macro-averaged), as presented in Table 4.

Table 4. Performance Comparison of CyberShieldDL with Baseline Models on CIC-IDS2017 Dataset
Model | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%)
Random Forest
CNN
Bi-LSTM
CyberShieldDL (Proposed)

The results indicate that CyberShieldDL outperforms all baseline models across the evaluated metrics. The standalone CNN model effectively extracted spatial features but lacked the temporal learning required for sequential attack patterns, resulting in moderate performance. The Bi-LSTM model captured temporal dependencies but underutilized spatial patterns, limiting its classification accuracy. The Random Forest classifier demonstrated competitive results on basic flow features but struggled with complex temporal dynamics, particularly in distinguishing between similar attack classes. The IntruNet-Hybrid model, with its combined CNN, Bi-LSTM, and attention mechanisms, successfully leveraged both spatial and sequential patterns, leading to superior detection capabilities.
This hybrid architecture addressed the limitations of individual deep learning models and demonstrated its robustness in detecting both frequent and rare types of attacks.

Figure 6. Performance Comparison of CyberShieldDL With Baseline Models on CIC-IDS2017 Dataset Showing (a) Accuracy, (b) Precision, (c) Recall, and (d) F1-Score Across Different Models

As shown in Figure 6, the proposed CyberShieldDL system outperforms the baseline models. Additionally, Figure 4 presents the ROC AUC statistics of the baseline models on the CIC-IDS2017 dataset. Subfigure (a) shows that CyberShieldDL has the best accuracy compared to CNN, Bi-LSTM, and Random Forest. Subfigure (b) shows the precision scores, where CyberShieldDL achieves a higher rate of correct positive predictions for the attack classes. Subfigure (c) presents recall, demonstrating that CyberShieldDL consistently discovers a larger portion of real attacks than the baselines and thus exhibits better sensitivity. Subfigure (d) shows the F1-scores, which reflect the balanced performance of CyberShieldDL through a good trade-off between precision and recall.

On all metrics, CyberShieldDL outperforms the traditional ML method and the single-architecture deep learning models, indicating that its hybrid CNN-BiLSTM-Attention framework can learn complex spatiotemporal patterns of network traffic. The experimental results support the effectiveness of CyberShieldDL for real-time intrusion detection in heterogeneous cybersecurity systems.

Figure 7. Training and Validation Accuracy Dynamics of the IntruNet-Hybrid Model Across Epochs on the CIC-IDS2017 Dataset

Figure 7 shows the training and validation accuracy of the IntruNet-Hybrid model over the epochs on the CIC-IDS2017 dataset. The training accuracy continues to increase while the validation accuracy begins to saturate, which suggests proper training without overfitting.
These outcomes indicate that the model converges properly and generalizes well across traffic categories during training.

Figure 8. Training and Validation Loss Dynamics of the IntruNet-Hybrid Model Across Epochs on the CIC-IDS2017 Dataset

The training and validation loss trends of the IntruNet-Hybrid model on the CIC-IDS2017 dataset are shown in Figure 8. Both losses decrease steadily, demonstrating that the network learns effectively and converges to a stable solution. The closeness of the training loss to the validation loss suggests that the model generalizes well and avoids overfitting, supporting its effectiveness in recognizing different types of intrusion patterns.

4 Cross-Dataset Validation

To assess the generalization ability of the proposed CyberShieldDL system, cross-dataset validation was carried out on two additional benchmark datasets: NSL-KDD and UNSW-NB15. These datasets exhibit different network traffic distributions and attack environments, and therefore complement the primary CIC-IDS2017 dataset in a multi-dataset evaluation.

The IntruNet-Hybrid model, trained on the CIC-IDS2017 training data, was tested directly on the unseen NSL-KDD and UNSW-NB15 test sets after preprocessing and feature preparation. This process replicates realistic deployment conditions, where labeled data in a new context may not be available to retrain the model.

Table 5. Cross-Dataset Validation Results of CyberShieldDL On NSL-KDD And UNSW-NB15 Datasets (Values Reported as Mean ± Standard Deviation Across Five Cross-Validation Folds)
Dataset: NSL-KDD | UNSW-NB15
Accuracy (%): 12 ± 0. | 45 ± 0.
Precision (%): 85 ± 0. | 90 ± 0.
Recall (%): 30 ± 0. | 50 ± 0.
F1-Score (%): 57 ± 0. | 70 ± 0.
Table 5 summarizes the cross-dataset validation performance of CyberShieldDL on the NSL-KDD and UNSW-NB15 datasets, reported as mean ± standard deviation across five cross-validation folds. These results confirm strong generalization ability, as evidenced by the high overall accuracy (more than 93%) with balanced precision, recall, and F1-scores. They indicate that the IntruNet-Hybrid architecture adapts to varying network environments and attack profiles.

Figure 9. Cross-Dataset Validation Performance of CyberShieldDL On NSL-KDD And UNSW-NB15 Datasets with Standard Deviations

Figure 9 shows the cross-dataset validation performance of CyberShieldDL on the NSL-KDD and UNSW-NB15 datasets as mean accuracy, precision, recall, and F1-score, with error bars representing the standard deviation over the five folds. The plot visualizes the generalization capability of the model on heterogeneous benchmark datasets.

CyberShieldDL achieves 94.12 ± 0.36% overall accuracy on the NSL-KDD dataset, with precision, recall, and F1-scores all higher than 92%. These results show that even on older and simpler datasets, the hybrid CNN-BiLSTM-Attention architecture extracts discriminative features and reproduces similar performance across the folds.

The model remains stable on the more complex UNSW-NB15 dataset, which contains recent and advanced attack types, with an accuracy of (. 45 ± 0.38%) and balanced performance on the other metrics, including an F1-score of (. 70 ± 0.42%). The error bars are relatively tight for both datasets, indicating that the model is robust and that its performance does not depend heavily on how the data is partitioned.
The cross-dataset comparison shows that CyberShieldDL is resilient and generalizes to different traffic distributions, supporting its suitability for real-world intrusion detection deployments where generalization across environments is a necessity.

5 Ablation Study

To analyze the contribution of each component in the IntruNet-Hybrid structure, we performed an ablation study on the CIC-IDS2017 dataset. The study systematically measured model performance as components were added or removed (CNN layers, Bi-LSTM layers, and attention).

First, we evaluated the CNN-only model to examine how spatial feature extraction by the CNN contributes to performance. This setup achieved decent accuracy and learned some local patterns. However, without temporal context it was unable to learn sequential attack behaviors.

Then, a Bi-LSTM-only model was tested, which learns temporal dependencies but not spatial patterns. Although better than the CNN alone, its lack of convolution layers made it less sensitive to local feature interactions.

The CNN-BiLSTM hybrid model integrates spatial and temporal learning and improves detection performance by fusing complementary features. Adding the attention mechanism on top of this hybrid further improved performance, enabling the model to focus on specific time steps and features and thereby classify subtle or rare attacks more effectively.

Table 6. Ablation Study Results of IntruNet-Hybrid Model On CIC-IDS2017 Dataset (Values Reported as Mean ± Standard Deviation Across Five Cross-Validation Folds)
Model Configuration | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%)
CNN Only | 45 ± 0. | 20 ± 0. | 85 ± 0. | 02 ± 0.
Bi-LSTM Only | 10 ± 0. | 45 ± 0. | 90 ± 0. | 17 ± 0.
CNN + Bi-LSTM (No Attention) | 25 ± 0. | 50 ± 0. | 10 ± 0. | 30 ± 0.
CNN + Bi-LSTM + Attention | 35 ± 0. | 40 ± 0. | 00 ± 0. | 20 ± 0.
Table 6 summarizes the ablation study results of the IntruNet-Hybrid model on CIC-IDS2017, reported as mean ± standard deviation over five cross-validation folds. The results show incremental improvement from the CNN-only and Bi-LSTM-only baselines to the combined CNN-Bi-LSTM architecture, and the inclusion of attention yields the highest accuracy, precision, recall, and F1-scores, confirming its importance.

The subplots of Figure 10 illustrate the ablation study results of IntruNet-Hybrid on the CIC-IDS2017 dataset in terms of accuracy, precision, recall, and F1-score, respectively. The standard deviation over five cross-validation folds is used to draw error bars, which indicate the stability of each configuration.

The results show that the CNN-only and Bi-LSTM-only configurations perform well below the other configurations (below 94% accuracy, with higher variance across folds). The combined CNN + Bi-LSTM configuration (without attention) improves accuracy while keeping precision, recall, and F1-scores balanced (Table 6), which indicates that spatial and temporal feature extraction are complementary. As shown in Table 6, the complete CNN + Bi-LSTM + Attention model produces the highest results, with 98.35 ± 0.42% accuracy and the best precision, recall, and F1-scores. The attention mechanism further boosts performance by focusing on the most salient features and time steps, resulting in more robust and interpretable intrusion detection. Combining local context with a global structure likely contributes to this robustness, as suggested by the relatively small error bars for all metrics across the folds (Figure 10, bars ± 1 standard deviation).

Figure 10. Ablation Study Results of IntruNet-Hybrid On CIC-IDS2017 Dataset with Standard Deviations

6 Comparative Analysis with Existing Methods

This section compares the proposed CyberShieldDL framework with existing deep learning based intrusion detection approaches. The architectural designs of the models and their detection performances are compared. Accuracy, precision, recall, and F1-score of CyberShieldDL are benchmarked on the CIC-IDS2017 dataset to illustrate its ability to detect various attack patterns and to support cybersecurity resilience.

Table 7. Performance Comparison of IntruNet-Hybrid with Existing Intrusion Detection Methods on CIC-IDS2017 Dataset
Model | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%)
Du
Ben Said
Salahaldeen Duraibi
Hariharan
IntruNet-Hybrid (Proposed)

IntruNet-Hybrid is compared with other deep learning IDSs in Table 7. The results show that IntruNet-Hybrid achieves higher accuracy, precision, and recall than the other methods. This demonstrates the ability of the hybrid model to identify multiple attack patterns and improve overall IDS performance on CIC-IDS2017.

Figure 11. Comparative Performance Analysis of IntruNet-Hybrid and Existing Methods on CIC-IDS2017 Dataset

Figure 11 compares the proposed IntruNet-Hybrid model with state-of-the-art IDS methods on the CIC-IDS2017 dataset. Subfigure (a) shows that IntruNet-Hybrid achieves the highest accuracy, outperforming other techniques such as those of S. Hariharan and Salahaldeen Duraibi. In subfigure (b), the precision of IntruNet-Hybrid remains higher, which means fewer false positives than the competing algorithms. In subfigure (c), the recall of IntruNet-Hybrid exceeds that of the compared works, demonstrating its capability to detect more actual intrusions. Subfigure (d)
illustrates the F1-scores, which account for both precision and recall, and the proposed model consistently achieves high results.

While existing models, including those by J. Du and S. Hariharan, can produce competitive results, they do not offer architectural flexibility or thoroughly investigate cross-attack generalization. The approach of Ben Said achieves lower performance, probably because it does not cope well with complex traffic patterns. In contrast, the hybrid CNN-BiLSTM-attention architecture of IntruNet-Hybrid effectively learns spatial and temporal features of network traffic and performs reliably in detecting various attack types. This comparative study highlights the practical effectiveness of the proposed system in improving intrusion detection performance.

Table 8. Qualitative Comparison of CyberShieldDL with Recent Hybrid Deep Learning Intrusion Detection Approaches
Columns: Study | Method | Dataset | Key Features | Limitations | Comparison With CyberShieldDL
Study: Udurume et al.; Xu et al.; Hassan et al.; Qazi et al.; Aldallal; Mayuranathan et al.; Sajid et al.; Sharma
Method: CNN-BiLSTM vs ML; Hierarchical hybrid attention; Hybrid DL for big data; HDLNIDS hybrid DL CNN-RNN Hybrid DL IDS for cloud ML DL hybrid IDS; Evolutionary ML DL IDS
Dataset: CICIDS2017; Custom; UNSW-NB15; CICIDS2017; UNSW-NB15; UNSW-NB15; CICIDS2017; CICIDS2017
Key Features: Temporal-spatial modeling; Multi-model with attention; Scalability focus; Robust hybrid; Efficient; Cloud-specific; Combines ML and DL; Adaptive
Limitations: Single dataset, no; Limited; No interpretability, limited evaluation; cross-dataset; No interpretability; No interpretability; Limited; No feature selection
Comparison With CyberShieldDL: CyberShieldDL adds feature cross-dataset validation, interpretability; CyberShieldDL tested on 3 datasets, interpretable; CyberShieldDL explainability multi-stage feature selection; CyberShieldDL validated across; CyberShieldDL attention insights; CyberShieldDL; CyberShieldDL optimized multi-stage feature; CyberShieldDL interpretability and statistical

Table 8 gives a brief qualitative comparison of CyberShieldDL with recent hybrid deep learning-based intrusion detection approaches. The table compares the methods, datasets, strengths, and limitations of previous studies with respect to CyberShieldDL. Although the works based on CNN-BiLSTM-Attention and other hybrid architectures demonstrate the value of such designs, they are mainly limited to single-dataset evaluations, lack interpretability, or do not perform feature optimization. To bridge these gaps, CyberShieldDL implements multi-stage feature selection, cross-dataset validation, and attention-based interpretability analysis.

7 Attention Interpretability

The attention mechanism in the IntruNet-Hybrid model enables the network to assign varying levels of importance to temporal states and feature dimensions, thereby enhancing the interpretability of the decision process. We support this by comparing attention weight distributions over sample attack classes in the CIC-IDS2017 dataset and visualizing the average attention scores for selected features (Figure 12). Attributes such as flow duration, packet size distribution, and protocol type consistently received higher attention weights, which emphasizes their importance in differentiating between benign and malicious traffic.
Temporal segments associated with bursty traffic were also highlighted, showing that the model can detect attack behaviors that evolve over time windows.

Figure 12. Attention Weight Distribution Across Network Features for Representative Attack Classes

The results verify that the attention module not only helps achieve higher detection accuracy but also provides interpretable insight into which network inputs drive a classification. Such interpretability is an essential prerequisite for deployment, since security analysts must understand why the system flags particular traffic instances. It would also help network engineers identify the discriminative traffic characteristics of each attack type when choosing mitigation methods.

DISCUSSION

IDS are essential tools for protecting modern network infrastructure from emerging cyber threats. Current IDS methods are mainly based on traditional machine learning or single deep learning architectures (e.g., CNNs or LSTMs). Nevertheless, the literature review suggests that these models struggle to generalize across datasets, have limited capacity to learn meaningful temporal or spatial patterns individually, and lack effective attention mechanisms to emphasize important attack features. These limitations of the state of the art underscore the need for a hybrid, context-aware IDS architecture.

To address these difficulties, this work proposes CyberShieldDL, an end-to-end deep learning-driven IDS framework featuring the IntruNet-Hybrid model. A significant difference between our method and state-of-the-art systems is that we combine CNNs for spatial feature extraction, Bi-LSTMs for temporal feature sequencing, and an attention mechanism for dynamically weighting the critical flow relations that characterize intrusions.
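The dynamic weighting performed by an attention stage can be illustrated with a small attention-pooling sketch. It assumes a simple dot-product scoring form: each time step's hidden state is scored against a query vector, the scores are normalized with a softmax, and the context vector is the weighted sum of the hidden states. The exact attention variant used in IntruNet-Hybrid is not restated here, so `attention_pool`, the query vector, and the toy hidden states below are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step by softmax(h_t . query); return (weights, context)."""
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context


# Hypothetical recurrent-layer outputs for three time steps (hidden size 2).
hidden = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
query = [1.0, 1.0]  # stands in for a learned parameter vector (assumption)

weights, context = attention_pool(hidden, query)
print([round(w, 3) for w in weights])
print([round(c, 3) for c in context])
```

In this toy input the second time step has the largest score, so it receives the largest weight and dominates the context vector. In a trained model the query would be learned, which is what lets the weighting adapt to the traffic being classified.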
Fusing these three components significantly enhances the model's ability to capture complex attack behaviors in diverse traffic scenarios.

The experimental results demonstrate that CyberShieldDL outperforms classical models and state-of-the-art counterparts on multiple metrics on the CIC-IDS2017 dataset. The high performance of the proposed system in cross-dataset testing further suggests its generalization ability. The ablation study validated the contribution of each architectural component to overall system performance. These results show that the proposed method can overcome the limitations of the isolated, single-architecture learning techniques in the literature.

While Udurume et al. and Xu et al. also employ CNN, Bi-LSTM, and attention mechanisms for intrusion detection, their approaches are primarily limited to single-dataset evaluations and lack integration of multi-stage feature selection or interpretability analysis. In contrast, CyberShieldDL addresses these gaps by unifying efficiency, explainability, and cross-dataset generalizability within a single framework.

Building on the hybrid architecture and the complete optimization pipeline, this work presents a scalable, adaptive, and explainable IDS framework that can effectively support real-world network deployments. The benefits include enhanced threat detection accuracy, reduced false positives, and the ability to address rapidly changing attack surfaces, making CyberShieldDL a powerful tool for enterprise cybersecurity. Some specific weaknesses of this study, which may limit its applicability in real-time deployment and to certain types of attacks, are discussed in Section 5.1.

5.1 Limitations of the Study

This study has several limitations, despite its encouraging findings. First, the evaluation considered only publicly available datasets, which do not necessarily reflect real-world network scenarios with evolving attack patterns.
Second, the computational cost of the model, especially during training, may hinder its deployment on edge devices with limited processing resources unless the model is optimized. Third, the current method has not been sufficiently verified against adversarial samples or advanced evasion methods, which might significantly affect its robustness in the presence of adversaries. In future work, we will explore how to integrate real-time streaming data, lightweight model evolution, and adversarial robustness mechanisms to make CyberShieldDL more practical for deployment. The proposed system has not yet been evaluated under real-time network streaming or adversarial intrusion scenarios; these remain essential directions for assessing deployment feasibility and resilience against adversarial attacks.

CONCLUSION AND FUTURE SCOPE

This paper presented CyberShieldDL, a deep learning-based intrusion detection framework designed to enhance cybersecurity through intelligent spatial-temporal feature learning. The proposed IntruNet-Hybrid model integrates convolutional, recurrent, and attention mechanisms to capture complex network traffic patterns, addressing limitations of existing standalone models. Through comprehensive experimentation on the CIC-IDS2017 dataset and cross-dataset validation on NSL-KDD and UNSW-NB15, CyberShieldDL demonstrated strong detection performance, effective generalization, and robustness against diverse types of attacks. The results confirmed that combining spatial and sequential feature learning with dynamic attention improves classification accuracy, reduces false positives, and enhances the detection of both frequent and rare attacks.

Despite these contributions, the study has key limitations. The evaluation used static benchmark datasets, which limits the evidence for the model's real-world generalizability, and the computational demands of the model could challenge its deployment on resource-constrained edge devices.
Additionally, adversarial resilience remains an open challenge.

Future work will address these limitations by extending the framework to real-time intrusion detection environments using streaming network data. Optimization techniques, such as model pruning, quantization, and edge-cloud partitioning, will be explored to enable efficient deployment in edge computing scenarios. Furthermore, adversarial learning and adaptive training mechanisms will be integrated to improve the system's resilience against sophisticated evasion tactics. Additional validation on diverse, real-world enterprise datasets will further strengthen the system's practical applicability. Overall, the proposed CyberShieldDL framework provides a promising foundation for scalable, adaptive, and intelligent intrusion detection systems, thereby advancing the state of cybersecurity defense mechanisms in heterogeneous and dynamic network environments.

References