Nguyen, VQ, Ngo, LT, Nguyen, LM, Nguyen, VH and Shone, N (2024) Deep clustering hierarchical latent representation for anomaly-based cyber-attack detection. Knowledge-Based Systems, 301. pp. 1-20. ISSN 0950-7051
Deep Clustering Hierarchical Latent Representation for anomaly based cyber attack detection.pdf - Accepted Version. Restricted to Repository staff only until 10 August 2025. Available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
In the field of anomaly detection, well-known techniques and state-of-the-art models often face challenges when interpreting the latent space, which hinders their behavioral classification accuracy. Firstly, the sub-optimal distribution of data points within the latent space makes normal behavioral regions verbose and indistinguishable from abnormal regions. Secondly, within the latent space, it can be difficult to identify meaningful, separable, and indicative features. Finally, the processing time at the inference stage is still relatively slow. This paper aims to improve the accuracy of network anomaly detection mechanisms by proposing two novel deep hierarchical representation learning models: Deep Nested Clustering Auto-Encoder (DNCAE) and Deep Clustering Hierarchical Auto-Encoder (DCHAE). Both models adopt a nested branch structure, utilizing dual deep auto-encoders to establish hierarchical latent spaces; in each, clustering algorithms are used to spatially optimize and refine the data points. This approach results in improved separation between normal and abnormal data points, and easier identification of notable and/or indicative features. To ascertain the effectiveness of the approach and the quality of resulting features, both models were used in conjunction with ten different one-class anomaly detectors. Each of these ten anomaly detectors was evaluated on popular network intrusion datasets, notably: NSL-KDD, UNSW-NB15, CIC-IDS-2017, CSE-CIC-IDS-2018, and CTU13. Experimental results have confirmed that both of the proposed models produced higher levels of accuracy than existing baselines and current state-of-the-art models. Additionally, the processing time at the inference stage shows a significant reduction.
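
The abstract's core idea, compacting normal traffic in a latent space by clustering and then passing it to a one-class detector, can be illustrated with a toy sketch. This is not the authors' DNCAE or DCHAE: there is no auto-encoder here, the "latent" points are synthetic 2-D samples, and plain k-means with a nearest-centroid distance score stands in for the clustering step and the one-class detector. All function names, the 0.99 quantile, and the data are assumptions made for illustration only.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            groups[nearest].append(p)
        # Mean of each group; an empty group keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return centroids


def anomaly_score(z, centroids):
    """Distance from a latent point to its nearest cluster centroid."""
    return min(math.dist(z, c) for c in centroids)


def fit_threshold(normal_latents, centroids, quantile=0.99):
    """Pick the score below which roughly `quantile` of normal points fall."""
    scores = sorted(anomaly_score(z, centroids) for z in normal_latents)
    return scores[min(len(scores) - 1, int(quantile * len(scores)))]


# Hypothetical stand-in for encoder output: two tight blobs of "normal" latents.
rng = random.Random(42)
normal = [
    (rng.gauss(cx, 0.3), rng.gauss(cy, 0.3))
    for cx, cy in [(0.0, 0.0), (5.0, 5.0)]
    for _ in range(100)
]

# Train on normal data only (one-class setting).
centroids = kmeans(normal, k=2)
threshold = fit_threshold(normal, centroids)


def is_anomaly(z):
    """Flag a latent point that lies far from every normal cluster."""
    return anomaly_score(z, centroids) > threshold


print(is_anomaly((10.0, -10.0)))  # a point far from both normal clusters
```

In this sketch, inference costs one distance computation per centroid, which hints at why a compact, well-clustered latent space can also make detection cheap. The paper's actual models and detectors should be consulted for the real method.
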
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Latent representation; Deep learning; Cyber-attack detection; Anomaly detection; Deep clustering; 08 Information and Computing Sciences; 15 Commerce, Management, Tourism and Services; 17 Psychology and Cognitive Sciences; Artificial Intelligence & Image Processing |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science; Q Science > QA Mathematics > QA76 Computer software |
Divisions: | Computer Science and Mathematics |
Publisher: | Elsevier BV |
SWORD Depositor: | A Symplectic |
Date Deposited: | 18 Oct 2024 13:26 |
Last Modified: | 18 Oct 2024 13:30 |
DOI or ID number: | 10.1016/j.knosys.2024.112366 |
URI: | https://researchonline.ljmu.ac.uk/id/eprint/24542 |