Curiosity-driven reinforcement learning with graph transformers for decision-making in connected and autonomous vehicles

Qi Liu; Yujie Tang; Xueyuan Li; Kaifeng Wang; Fan Yang; Zirui Li

doi:10.1016/j.trc.2025.105183

Corresponding author: Xueyuan Li

Research output: Contribution to journal › Article › Peer-reviewed

Abstract

Cooperative decision-making technology for connected and autonomous vehicles (CAVs) in mixed autonomy traffic is critical for the advancement of modern intelligent transportation systems. Recently, graph reinforcement learning (GRL) approaches have shown remarkable success in addressing decision-making challenges by leveraging graph-based technologies. However, existing GRL-based research faces substantial challenges in generating accurate feature embeddings to enhance driving policies, thoroughly exploring the driving environment, and efficiently training models. To address these challenges, this paper proposes a graph transformer reinforcement learning method with a distributional curiosity mechanism to improve feature generation efficiency and environment exploration, ultimately boosting the decision-making performance of CAVs. First, an improved transformed graph convolutional network (ITransGCN) is proposed, integrating a graph convolutional network (GCN), a rotary position encoding method (ROPE), and a temporal prior attention mechanism to strengthen sequential modeling capabilities, thereby generating informative spatial–temporal feature embeddings. Then, a curiosity mechanism based on distributional random network distillation (DRND) is proposed to enhance the exploratory capabilities of CAVs in driving environments. Additionally, a temporal integrated deep reinforcement learning (TI-DRL) model is developed, incorporating an auxiliary loss that integrates spatial–temporal information to improve the model's ability to capture spatial–temporal dependencies. Finally, a cooperation-aware reward function is constructed to further evaluate the performance of CAVs. Comprehensive experiments are conducted across three representative traffic scenarios to validate the proposed method. The results demonstrate that the proposed method outperforms the baselines in driving safety, efficiency, and model stability, highlighting the effectiveness of the core components and the generalization capability of the proposed method.
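The abstract names the curiosity mechanism but does not describe how it works. As a rough illustration of the underlying idea only, the sketch below implements plain random network distillation (RND), not the paper's distributional variant (DRND). A fixed random target network is imitated by a trainable predictor, and the prediction error serves as an exploration bonus that shrinks for states seen often. The class name RNDBonus, the network sizes, the learning rate, and the example states are all assumptions made for this sketch.

```python
import math
import random


class RNDBonus:
    """Plain RND curiosity bonus (illustrative; not the paper's DRND)."""

    def __init__(self, state_dim, feat_dim=8, lr=0.2, seed=0):
        rng = random.Random(seed)
        # Fixed, randomly initialised target network (never trained).
        self.target = [[rng.gauss(0, 1) for _ in range(state_dim)]
                       for _ in range(feat_dim)]
        # Trainable linear predictor, initialised independently of the target.
        self.pred = [[rng.gauss(0, 0.1) for _ in range(state_dim)]
                     for _ in range(feat_dim)]
        self.lr = lr

    def _target_out(self, s):
        return [math.tanh(sum(w * x for w, x in zip(row, s)))
                for row in self.target]

    def _pred_out(self, s):
        return [sum(w * x for w, x in zip(row, s)) for row in self.pred]

    def bonus(self, s):
        """Intrinsic reward: mean squared prediction error on state s."""
        t, p = self._target_out(s), self._pred_out(s)
        return sum((pi - ti) ** 2 for pi, ti in zip(p, t)) / len(t)

    def update(self, s):
        """One gradient step pulling the predictor toward the target on s."""
        t, p = self._target_out(s), self._pred_out(s)
        n = len(t)
        for i, row in enumerate(self.pred):
            grad_scale = 2.0 * (p[i] - t[i]) / n
            for j, x in enumerate(s):
                row[j] -= self.lr * grad_scale * x


# Hypothetical 4-dimensional states, chosen only for the demonstration.
rnd = RNDBonus(state_dim=4)
visited = [1.0, 0.0, 0.5, -0.5]
novel = [0.0, 1.0, 0.0, 0.0]

before = rnd.bonus(visited)
for _ in range(200):
    rnd.update(visited)
after = rnd.bonus(visited)
novel_bonus = rnd.bonus(novel)

print(f"visited state: bonus {before:.4f} -> {after:.2e}")
print(f"novel state:   bonus {novel_bonus:.4f}")
```

In a reinforcement learning loop, a bonus of this kind would typically be scaled and added to the environment reward, so that the agent is drawn toward states the predictor has not yet learned. How the paper combines its DRND bonus with the cooperation-aware reward is not stated in the abstract.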

Original language	English
Article number	105183
Journal	Transportation Research Part C: Emerging Technologies
Volume	177
DOI	http://doi.org/10.1016/j.trc.2025.105183
Publication status	Published - Aug 2025
Externally published	Yes

Cite this

@article{3d51833b306a40a08d023e08df015552,

title = "Curiosity-driven reinforcement learning with graph transformers for decision-making in connected and autonomous vehicles",

abstract = "Cooperative decision-making technology for connected and autonomous vehicles (CAVs) in mixed autonomy traffic is critical for the advancement of modern intelligent transportation systems. Recently, graph reinforcement learning (GRL) approaches have shown remarkable success in addressing decision-making challenges by leveraging graph-based technologies. However, existing GRL-based research faces substantial challenges in generating accurate feature embeddings to enhance driving policies, thoroughly exploring the driving environment, and efficiently training models. To address these challenges, this paper proposes a graph transformer reinforcement learning method with a distributional curiosity mechanism to improve the feature generation efficiency and environment exploration, ultimately boosting the decision-making performance of CAVs. First, an improved transformed graph convolutional network (ITransGCN) is proposed, integrating graph convolutional network (GCN), rotary position encoding method (ROPE), and temporal prior attention mechanism to strengthen sequential modeling capabilities, thereby generating informative spatial–temporal feature embeddings. Then, a curiosity mechanism based on distributional random network distillation (DRND) is proposed to enhance the exploratory capabilities of CAVs in driving environments. Additionally, a temporal integrated deep reinforcement learning (TI-DRL) model is developed, incorporating an auxiliary loss that integrates spatial–temporal information to improve the model's ability to capture the spatial–temporal dependencies. Finally, a cooperation-aware reward function is constructed to further evaluate the performance of CAVs. Comprehensive experiments are conducted across three representative traffic scenarios to validate the proposed method. 
The results demonstrate that our proposed method outperforms the baselines in driving safety, efficiency, and model stability, highlighting the effectiveness of the core components and the generalization capability of the proposed method.",

keywords = "Connected and autonomous vehicles, Curiosity mechanism, Decision-making, Graph reinforcement learning, Transformer",

author = "Qi Liu and Yujie Tang and Xueyuan Li and Kaifeng Wang and Fan Yang and Zirui Li",

note = "Publisher Copyright: {\textcopyright} 2025 Elsevier Ltd",

year = "2025",

month = aug,

doi = "10.1016/j.trc.2025.105183",

language = "English",

volume = "177",

journal = "Transportation Research Part C: Emerging Technologies",

issn = "0968-090X",

publisher = "Elsevier Ltd.",

}

TY - JOUR

T1 - Curiosity-driven reinforcement learning with graph transformers for decision-making in connected and autonomous vehicles

AU - Liu, Qi

AU - Tang, Yujie

AU - Li, Xueyuan

AU - Wang, Kaifeng

AU - Yang, Fan

AU - Li, Zirui

PY - 2025/8

Y1 - 2025/8

N2 - Cooperative decision-making technology for connected and autonomous vehicles (CAVs) in mixed autonomy traffic is critical for the advancement of modern intelligent transportation systems. Recently, graph reinforcement learning (GRL) approaches have shown remarkable success in addressing decision-making challenges by leveraging graph-based technologies. However, existing GRL-based research faces substantial challenges in generating accurate feature embeddings to enhance driving policies, thoroughly exploring the driving environment, and efficiently training models. To address these challenges, this paper proposes a graph transformer reinforcement learning method with a distributional curiosity mechanism to improve the feature generation efficiency and environment exploration, ultimately boosting the decision-making performance of CAVs. First, an improved transformed graph convolutional network (ITransGCN) is proposed, integrating graph convolutional network (GCN), rotary position encoding method (ROPE), and temporal prior attention mechanism to strengthen sequential modeling capabilities, thereby generating informative spatial–temporal feature embeddings. Then, a curiosity mechanism based on distributional random network distillation (DRND) is proposed to enhance the exploratory capabilities of CAVs in driving environments. Additionally, a temporal integrated deep reinforcement learning (TI-DRL) model is developed, incorporating an auxiliary loss that integrates spatial–temporal information to improve the model's ability to capture the spatial–temporal dependencies. Finally, a cooperation-aware reward function is constructed to further evaluate the performance of CAVs. Comprehensive experiments are conducted across three representative traffic scenarios to validate the proposed method. 
The results demonstrate that our proposed method outperforms the baselines in driving safety, efficiency, and model stability, highlighting the effectiveness of the core components and the generalization capability of the proposed method.

KW - Connected and autonomous vehicles

KW - Curiosity mechanism

KW - Decision-making

KW - Graph reinforcement learning

KW - Transformer

UR - http://www.scopus.com/pages/publications/105007087994

U2 - 10.1016/j.trc.2025.105183

DO - 10.1016/j.trc.2025.105183

M3 - Article

AN - SCOPUS:105007087994

SN - 0968-090X

VL - 177

JO - Transportation Research Part C: Emerging Technologies

JF - Transportation Research Part C: Emerging Technologies

M1 - 105183

ER -
