A novel host-based intrusion detection approach leveraging audit logs

Jiaqing Jiang; Hongyang Chu; Donghai Tian

doi:10.1016/j.future.2025.107995

Jiaqing Jiang, Hongyang Chu, Donghai Tian^*

^*Corresponding author for this work

School of Computer Science

Beijing Institute of Technology

Research output: Contribution to journal › Article › Peer-reviewed

Abstract

Host-based intrusion detection systems (HIDS) struggle to detect advanced cyber attacks (e.g., APT, LoTL) due to their stealthy nature and reliance on either structural or semantic features alone. We hypothesize that integrating semantic audit log analysis with structural provenance graph learning improves detection accuracy and adaptability. To validate this, we propose MalSnif, a novel framework that (1) parses audit logs to construct provenance graphs enriched with process/event relationships, (2) simplifies graphs by pruning peripheral nodes while retaining critical attack trajectories, and (3) employs NLP techniques (word2vec, GRU, BiLSTM) to extract semantic features, combined with a graph convolutional network (GCN) for detection. Implemented using PyTorch and ETW, MalSnif addresses data imbalance via strategic downsampling during training. Evaluations show that our approach can effectively detect different kinds of cyber attacks and outperforms recent methods. In addition, our methods for simplifying process event sequences and provenance graphs also yield effective and explainable results.
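The graph-construction, pruning, and downsampling steps named in the abstract can be illustrated with a minimal sketch. This is not the MalSnif implementation. The event tuples, the function names, the rule that treats a leaf node reached only by `FileRead` as "peripheral", and the plain random downsampling are all assumptions made for illustration. The abstract does not specify the actual pruning criteria or the downsampling strategy.

```python
import random
from collections import defaultdict

# Hypothetical audit events: (subject process, operation, object).
EVENTS = [
    ("explorer.exe", "ProcessStart", "winword.exe"),
    ("winword.exe", "ProcessStart", "powershell.exe"),
    ("winword.exe", "FileRead", "report.docx"),
    ("powershell.exe", "NetworkConnect", "203.0.113.7:443"),
    ("powershell.exe", "FileWrite", "payload.bin"),
    ("explorer.exe", "FileRead", "desktop.ini"),
]


def build_provenance_graph(events):
    """Return an adjacency map: node -> set of (operation, target) edges."""
    graph = {}
    for subject, operation, obj in events:
        graph.setdefault(subject, set()).add((operation, obj))
        graph.setdefault(obj, set())  # objects are nodes even without out-edges
    return graph


def prune_peripheral(graph, low_interest_ops=frozenset({"FileRead"})):
    """Drop leaf nodes reached only through low-interest operations.

    Assumed heuristic: a node is peripheral if it has no outgoing edges
    and every incoming edge carries an operation in low_interest_ops.
    """
    incoming = defaultdict(set)
    for edges in graph.values():
        for operation, target in edges:
            incoming[target].add(operation)
    drop = {
        node
        for node, edges in graph.items()
        if not edges and incoming[node] and incoming[node] <= low_interest_ops
    }
    return {
        node: {(op, dst) for op, dst in edges if dst not in drop}
        for node, edges in graph.items()
        if node not in drop
    }


def downsample(samples, labels, seed=0):
    """Randomly shrink every class to the size of the smallest class."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)
    target_size = min(len(items) for items in by_label.values())
    balanced = []
    for label, items in by_label.items():
        balanced.extend((s, label) for s in rng.sample(items, target_size))
    return balanced


graph = build_provenance_graph(EVENTS)
pruned = prune_peripheral(graph)
balanced = downsample(list(range(10)), [0] * 8 + [1] * 2)
```

In this toy input, the two files that are only read (`report.docx`, `desktop.ini`) are pruned, while the process chain, the network connection, and the written file remain. A real system would need pruning criteria that are validated against known attack trajectories.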

Original language	English
Article number	107995
Journal	Future Generation Computer Systems
Volume	174
DOI	http://doi.org/10.1016/j.future.2025.107995
Publication status	Published - Jan 2026

Cite this

@article{01fed06d10d94dcc82cab9585ab4c72f,

title = "A novel host-based intrusion detection approach leveraging audit logs",

abstract = "Host-based intrusion detection systems (HIDS) struggle to detect advanced cyber attacks (e.g., APT, LoTL) due to their stealthy nature and reliance on either structural or semantic features alone. We hypothesize that integrating semantic audit log analysis with structural provenance graph learning improves detection accuracy and adaptability. To validate this, we propose MalSnif, a novel framework that (1) parses audit logs to construct provenance graphs enriched with process/event relationships, (2) simplifies graphs by pruning peripheral nodes while retaining critical attack trajectories, and (3) employs NLP techniques (word2vec, GRU, BiLSTM) to extract semantic features, combined with a graph convolutional network (GCN) for detection. Implemented using PyTorch and ETW, MalSnif addresses data imbalance via strategic downsampling during training. Evaluations show that our approach can effectively detect different kinds of cyber attacks and outperforms recent methods. In addition, our methods for simplifying process event sequences and provenance graphs also yield effective and explainable results.",

keywords = "Audit log analysis, Graph neural network, Provenance graph, Semantic-structural fusion",

author = "Jiaqing Jiang and Hongyang Chu and Donghai Tian",

note = "Publisher Copyright: {\textcopyright} 2025 Elsevier B.V.",

year = "2026",

month = jan,

doi = "10.1016/j.future.2025.107995",

language = "English",

volume = "174",

journal = "Future Generation Computer Systems",

issn = "0167-739X",

publisher = "Elsevier B.V.",

}

TY - JOUR

T1 - A novel host-based intrusion detection approach leveraging audit logs

AU - Jiang, Jiaqing

AU - Chu, Hongyang

AU - Tian, Donghai

PY - 2026/1

Y1 - 2026/1

N2 - Host-based intrusion detection systems (HIDS) struggle to detect advanced cyber attacks (e.g., APT, LoTL) due to their stealthy nature and reliance on either structural or semantic features alone. We hypothesize that integrating semantic audit log analysis with structural provenance graph learning improves detection accuracy and adaptability. To validate this, we propose MalSnif, a novel framework that (1) parses audit logs to construct provenance graphs enriched with process/event relationships, (2) simplifies graphs by pruning peripheral nodes while retaining critical attack trajectories, and (3) employs NLP techniques (word2vec, GRU, BiLSTM) to extract semantic features, combined with a graph convolutional network (GCN) for detection. Implemented using PyTorch and ETW, MalSnif addresses data imbalance via strategic downsampling during training. Evaluations show that our approach can effectively detect different kinds of cyber attacks and outperforms recent methods. In addition, our methods for simplifying process event sequences and provenance graphs also yield effective and explainable results.

AB - Host-based intrusion detection systems (HIDS) struggle to detect advanced cyber attacks (e.g., APT, LoTL) due to their stealthy nature and reliance on either structural or semantic features alone. We hypothesize that integrating semantic audit log analysis with structural provenance graph learning improves detection accuracy and adaptability. To validate this, we propose MalSnif, a novel framework that (1) parses audit logs to construct provenance graphs enriched with process/event relationships, (2) simplifies graphs by pruning peripheral nodes while retaining critical attack trajectories, and (3) employs NLP techniques (word2vec, GRU, BiLSTM) to extract semantic features, combined with a graph convolutional network (GCN) for detection. Implemented using PyTorch and ETW, MalSnif addresses data imbalance via strategic downsampling during training. Evaluations show that our approach can effectively detect different kinds of cyber attacks and outperforms recent methods. In addition, our methods for simplifying process event sequences and provenance graphs also yield effective and explainable results.

KW - Audit log analysis

KW - Graph neural network

KW - Provenance graph

KW - Semantic-structural fusion

UR - http://www.scopus.com/pages/publications/105011250539

U2 - 10.1016/j.future.2025.107995

DO - 10.1016/j.future.2025.107995

M3 - Article

AN - SCOPUS:105011250539

SN - 0167-739X

VL - 174

JO - Future Generation Computer Systems

JF - Future Generation Computer Systems

M1 - 107995

ER -
