UMD-Net: A Unified Multi-Task Assistive Driving Network Based on Multimodal Fusion

Wenzhuo Liu, Yicheng Qiao, Zhiwei Li, Wenshuo Wang*, Wei Zhang, Jiayin Zhu, Yanhuan Jiang*, Li Wang, Hong Wang, Huaping Liu, Kunfeng Wang

*Corresponding authors for this work

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

In recent years, researchers have focused on recognition tasks related to the driver state, the traffic environment, and other factors to enhance the safety of autonomous driving assistance systems. However, current research treats these tasks independently, neglecting the interconnections among the driver, the traffic environment, and the vehicle. In this paper, we propose a Unified Multi-task Assistive Driving Network Based on Multimodal Fusion (UMD-Net), the first unified model capable of recognizing four tasks simultaneously from multimodal data: driver behavior recognition, driver emotion recognition, traffic context recognition, and vehicle behavior recognition. To better exploit the synergy among these tasks, we design a position-sensitive multi-directional attention feature extraction subnetwork and a recursive dynamic feature fusion module. The former captures key features of multi-view images through attention mechanisms applied along different directions, improving the model's generalization across tasks. The latter dynamically adjusts fusion weights according to the multimodal features, strengthening the representation of important features in multi-task learning. Evaluated on the public AIDE dataset, our model achieves the best performance on all four tasks, including 95.31% accuracy on traffic context recognition, demonstrating the superiority of our approach.
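The abstract does not specify how the recursive dynamic fusion is implemented, but a minimal PyTorch sketch of the general idea (a gating network predicts per-modality weights from the current features, and the weighted fusion is refined over a few recursive steps) might look as follows. The module name, gating design, and recursion depth are illustrative assumptions, not the authors' actual architecture.

```python
import torch
import torch.nn as nn

class DynamicFeatureFusion(nn.Module):
    """Toy sketch of recursive dynamic multimodal fusion (an assumption,
    not UMD-Net's published design): a small gating network predicts
    per-modality weights from the current fused state plus all modality
    features, and the fused representation is the weighted sum, refined
    over a fixed number of recursive steps."""

    def __init__(self, dim: int, num_modalities: int, steps: int = 2):
        super().__init__()
        self.steps = steps
        # Gate input: current fused feature concatenated with every modality.
        self.gate = nn.Sequential(
            nn.Linear(dim * (num_modalities + 1), num_modalities),
            nn.Softmax(dim=-1),
        )

    def forward(self, feats: list[torch.Tensor]) -> torch.Tensor:
        # feats: list of (batch, dim) modality features.
        stacked = torch.stack(feats, dim=1)   # (batch, M, dim)
        fused = stacked.mean(dim=1)           # simple initial fusion
        for _ in range(self.steps):
            gate_in = torch.cat([fused] + feats, dim=-1)
            w = self.gate(gate_in)            # (batch, M) modality weights
            fused = (w.unsqueeze(-1) * stacked).sum(dim=1)
        return fused

# Example: fuse three hypothetical 256-d modality features
# (e.g., in-cabin view, scene view, vehicle state) for a batch of 4.
feats = [torch.randn(4, 256) for _ in range(3)]
fusion = DynamicFeatureFusion(dim=256, num_modalities=3)
print(fusion(feats).shape)  # torch.Size([4, 256])
```

Because the gate re-reads the fused state at each step, the fusion weights can shift toward whichever modality is most informative for the current input, which is the behavior the abstract attributes to the recursive dynamic feature fusion module.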
