A novel multimodal computer-aided diagnostic model for pulmonary embolism based on hybrid transformer-CNN and tabular transformer

Wei Zhang, Yu Gu*, Hao Ma, Lidong Yang, Baohua Zhang, Jing Wang, Meng Chen, Xiaoqi Lu, Jianjun Li, Xin Liu, Dahua Yu, Ying Zhao, Siyuan Tang, Qun He

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Pulmonary embolism (PE) is a life-threatening condition for which early diagnosis and prompt treatment are essential to reducing morbidity and mortality. Combining CT images with electronic health records (EHR) can improve computer-aided diagnosis, but several challenges remain. The primary objective of this study is to leverage both 3D CT images and EHR data to improve PE diagnosis. First, for 3D CT images, we propose a network that combines Swin Transformers with 3D CNNs, enhanced by a Multi-Scale Feature Fusion (MSFF) module that addresses fusion challenges between the different encoders. Second, we introduce a Polarized Self-Attention (PSA) module to enhance the attention mechanism within the 3D CNN. Third, for EHR data, we design a Tabular Transformer for effective feature extraction. Finally, we design and evaluate three multimodal attention fusion modules to integrate CT and EHR features, and select the most effective one for the final fusion. Experimental results on the RadFusion dataset demonstrate that our model significantly outperforms existing state-of-the-art methods, achieving an AUROC of 0.971, an F1 score of 0.926, and an accuracy of 0.920. These results underscore the effectiveness and innovation of our multimodal approach in advancing PE diagnosis.
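For readers unfamiliar with the three metrics reported in the abstract, the following is a minimal sketch of how accuracy, F1 score, and AUROC are conventionally computed for a binary classifier. It is not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made purely for illustration and have no connection to the RadFusion results.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of cases whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive case outscores a random negative
    case, counting ties as half (the pairwise form of the ROC area)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (illustration only, not from the study).
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
# Assumed decision threshold of 0.5 to turn scores into labels.
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

print(accuracy(toy_labels, toy_preds))
print(f1_score(toy_labels, toy_preds))
print(auroc(toy_labels, toy_scores))
```

Accuracy and F1 depend on the chosen threshold, whereas AUROC is computed from the raw scores and summarizes ranking quality across all thresholds, which is one reason the three are often reported together.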

Original language: English
Journal: Physical and Engineering Sciences in Medicine
Publication status: Accepted/In press - 2025
Externally published
