TY - JOUR
T1 - Anomaly Detection for Advanced Persistent Threats With Graph Node Embedding
AU - Peng, Zhe Heng
AU - Hu, Chang Zhen
AU - Shan, Chun
N1 - Publisher Copyright: © 2025, Institute of Information Science. All rights reserved.
PY - 2025/5
Y1 - 2025/5
N2 - In recent years, Advanced Persistent Threat (APT) attacks have increasingly become a menace to national cybersecurity. Due to their complex tactics and persistent nature, traditional anomaly detection methods struggle to detect APT attacks effectively. The provenance graph is now widely adopted for APT attack analysis because it offers stronger semantic expressiveness and better support for provenance and causality analysis. However, many current anomaly detection methods, grounded in provenance graphs and network attack knowledge bases, are complex to design. Moreover, these methods mainly harness features from the entire provenance graph and overlook the rich semantics within its structure, which diminishes their efficacy in spotting anomalous nodes. This research introduces an anomaly detection method for provenance graphs that uses heterogeneous graph node embedding and clustering analysis. Drawing on the W3C PROV-DM model, we construct a distinct heterogeneous graph structure. We design a new meta-path strategy for better semantic understanding. By employing a heterogeneous graph learning algorithm, we obtain node embeddings. We use K-means clustering to group benign nodes into multiple clusters, and then use the benign node clusters to accurately differentiate between benign and anomalous nodes. Experiments on the Unicorn SC-2 dataset and the DARPA TC dataset indicate that our approach has better anomaly detection capability than two current anomaly detection systems.
AB - In recent years, Advanced Persistent Threat (APT) attacks have increasingly become a menace to national cybersecurity. Due to their complex tactics and persistent nature, traditional anomaly detection methods struggle to detect APT attacks effectively. The provenance graph is now widely adopted for APT attack analysis because it offers stronger semantic expressiveness and better support for provenance and causality analysis. However, many current anomaly detection methods, grounded in provenance graphs and network attack knowledge bases, are complex to design. Moreover, these methods mainly harness features from the entire provenance graph and overlook the rich semantics within its structure, which diminishes their efficacy in spotting anomalous nodes. This research introduces an anomaly detection method for provenance graphs that uses heterogeneous graph node embedding and clustering analysis. Drawing on the W3C PROV-DM model, we construct a distinct heterogeneous graph structure. We design a new meta-path strategy for better semantic understanding. By employing a heterogeneous graph learning algorithm, we obtain node embeddings. We use K-means clustering to group benign nodes into multiple clusters, and then use the benign node clusters to accurately differentiate between benign and anomalous nodes. Experiments on the Unicorn SC-2 dataset and the DARPA TC dataset indicate that our approach has better anomaly detection capability than two current anomaly detection systems.
KW - anomaly detection
KW - cluster analysis
KW - graph node embedding
KW - heterogeneous graph neural network
KW - provenance graph
UR - http://www.scopus.com/pages/publications/105010355804
U2 - 10.6688/JISE.202505_41(3).0012
DO - 10.6688/JISE.202505_41(3).0012
M3 - Article
AN - SCOPUS:105010355804
SN - 1016-2364
VL - 41
SP - 713
EP - 728
JO - Journal of Information Science and Engineering
JF - Journal of Information Science and Engineering
IS - 3
ER -