Anatomy-guided slice-description interaction for multimodal brain disease diagnosis based on 3D image and radiological report

Xin Gao; Meihui Zhang; Junjie Li; Shanbo Zhao; Zhizheng Zhuo; Liying Qu; Jinyuan Weng; Li Chai; Yunyun Duan; Chuyang Ye; Yaou Liu

doi:10.1016/j.compmedimag.2025.102556

Corresponding author: Meihui Zhang

Research output: Contribution to journal › Article › Peer-reviewed

Abstract

Accurate brain disease diagnosis based on radiological images is desired in clinical practice as it can facilitate early intervention and reduce the risk of damage. However, existing unimodal image-based models struggle to process high-dimensional 3D brain imaging data effectively. Multimodal disease diagnosis approaches based on medical images and corresponding radiological reports achieved promising progress with the development of vision-language models. However, most multimodal methods handle 2D images and cannot be directly applied to brain disease diagnosis that uses 3D images. Therefore, in this work we develop a multimodal brain disease diagnosis model that takes 3D brain images and their radiological reports as input. Motivated by the fact that radiologists scroll through image slices and write important descriptions into the report accordingly, we propose a slice-description cross-modality interaction mechanism to realize fine-grained multimodal data interaction. Moreover, since previous medical research has demonstrated potential correlation between anatomical location of anomalies and diagnosis results, we further explore the use of brain anatomical prior knowledge to improve the multimodal interaction. Based on the report description, the prior knowledge filters the image information by suppressing irrelevant regions and enhancing relevant slices. Our method was validated with two brain disease diagnosis tasks. The results indicate that our model outperforms competing unimodal and multimodal methods for brain disease diagnosis. In particular, it has yielded an average accuracy improvement of 15.87% and 7.39% compared with the image-based and multimodal competing methods, respectively.
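
As a rough illustration of the idea summarized in the abstract (report descriptions attending over image slices, with an anatomical prior suppressing irrelevant slices and enhancing relevant ones), the following is a minimal NumPy sketch. It is not the authors' implementation: the function name `slice_description_interaction`, the use of scaled dot-product attention, and the form of `anatomy_prior` (a non-negative description-by-slice weight matrix) are all assumptions made for illustration only.

```python
import numpy as np


def slice_description_interaction(slice_feats, desc_feats, anatomy_prior, eps=1e-8):
    """Hypothetical sketch of prior-weighted cross-attention.

    slice_feats:   (S, D) array, one embedding per image slice.
    desc_feats:    (T, D) array, one embedding per report description.
    anatomy_prior: (T, S) non-negative array; 0 suppresses a slice for a
                   description, larger values enhance it (assumed form).
    Returns the fused features (T, D) and the attention weights (T, S).
    """
    dim = slice_feats.shape[1]
    # Similarity between every description and every slice.
    scores = desc_feats @ slice_feats.T / np.sqrt(dim)
    # Subtract the row maximum for numerical stability before exponentiating.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    # Apply the prior, then renormalize each row.
    weights = weights * anatomy_prior
    attn = weights / (weights.sum(axis=1, keepdims=True) + eps)
    # Each description gathers information from its relevant slices only.
    fused = attn @ slice_feats
    return fused, attn


# Toy usage with random embeddings: 6 slices, 2 descriptions, 4-dim features.
rng = np.random.default_rng(0)
slices = rng.normal(size=(6, 4))
descriptions = rng.normal(size=(2, 4))
prior = np.array([[1, 1, 0, 0, 0, 0],
                  [0, 0, 0, 0, 2, 1]], dtype=float)
fused, attn = slice_description_interaction(slices, descriptions, prior)
print(fused.shape, attn.shape)
```

In this sketch a zero in the prior removes a slice from a description's attention entirely, while values above 1 shift weight toward that slice. The published model may combine the two modalities quite differently.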

Original language	English
Article number	102556
Journal	Computerized Medical Imaging and Graphics
Volume	123
DOI	http://doi.org/10.1016/j.compmedimag.2025.102556
Publication status	Published - Jul 2025
Externally published	Yes

Cite this

@article{73c6244fa8224905a3d71a5b5f0843bf,

title = "Anatomy-guided slice-description interaction for multimodal brain disease diagnosis based on 3D image and radiological report",

abstract = "Accurate brain disease diagnosis based on radiological images is desired in clinical practice as it can facilitate early intervention and reduce the risk of damage. However, existing unimodal image-based models struggle to process high-dimensional 3D brain imaging data effectively. Multimodal disease diagnosis approaches based on medical images and corresponding radiological reports achieved promising progress with the development of vision-language models. However, most multimodal methods handle 2D images and cannot be directly applied to brain disease diagnosis that uses 3D images. Therefore, in this work we develop a multimodal brain disease diagnosis model that takes 3D brain images and their radiological reports as input. Motivated by the fact that radiologists scroll through image slices and write important descriptions into the report accordingly, we propose a slice-description cross-modality interaction mechanism to realize fine-grained multimodal data interaction. Moreover, since previous medical research has demonstrated potential correlation between anatomical location of anomalies and diagnosis results, we further explore the use of brain anatomical prior knowledge to improve the multimodal interaction. Based on the report description, the prior knowledge filters the image information by suppressing irrelevant regions and enhancing relevant slices. Our method was validated with two brain disease diagnosis tasks. The results indicate that our model outperforms competing unimodal and multimodal methods for brain disease diagnosis. In particular, it has yielded an average accuracy improvement of 15.87\% and 7.39\% compared with the image-based and multimodal competing methods, respectively.",

keywords = "3D brain image, Anatomical prior, Brain disease, Multimodal diagnosis, Radiological report",

author = "Xin Gao and Meihui Zhang and Junjie Li and Shanbo Zhao and Zhizheng Zhuo and Liying Qu and Jinyuan Weng and Li Chai and Yunyun Duan and Chuyang Ye and Yaou Liu",

note = "Publisher Copyright: {\textcopyright} 2025 Elsevier Ltd",

year = "2025",

month = jul,

doi = "10.1016/j.compmedimag.2025.102556",

language = "English",

volume = "123",

journal = "Computerized Medical Imaging and Graphics",

issn = "0895-6111",

publisher = "Elsevier Ltd.",

}

TY - JOUR

T1 - Anatomy-guided slice-description interaction for multimodal brain disease diagnosis based on 3D image and radiological report

AU - Gao, Xin

AU - Zhang, Meihui

AU - Li, Junjie

AU - Zhao, Shanbo

AU - Zhuo, Zhizheng

AU - Qu, Liying

AU - Weng, Jinyuan

AU - Chai, Li

AU - Duan, Yunyun

AU - Ye, Chuyang

AU - Liu, Yaou

PY - 2025/7

Y1 - 2025/7

N2 - Accurate brain disease diagnosis based on radiological images is desired in clinical practice as it can facilitate early intervention and reduce the risk of damage. However, existing unimodal image-based models struggle to process high-dimensional 3D brain imaging data effectively. Multimodal disease diagnosis approaches based on medical images and corresponding radiological reports achieved promising progress with the development of vision-language models. However, most multimodal methods handle 2D images and cannot be directly applied to brain disease diagnosis that uses 3D images. Therefore, in this work we develop a multimodal brain disease diagnosis model that takes 3D brain images and their radiological reports as input. Motivated by the fact that radiologists scroll through image slices and write important descriptions into the report accordingly, we propose a slice-description cross-modality interaction mechanism to realize fine-grained multimodal data interaction. Moreover, since previous medical research has demonstrated potential correlation between anatomical location of anomalies and diagnosis results, we further explore the use of brain anatomical prior knowledge to improve the multimodal interaction. Based on the report description, the prior knowledge filters the image information by suppressing irrelevant regions and enhancing relevant slices. Our method was validated with two brain disease diagnosis tasks. The results indicate that our model outperforms competing unimodal and multimodal methods for brain disease diagnosis. In particular, it has yielded an average accuracy improvement of 15.87% and 7.39% compared with the image-based and multimodal competing methods, respectively.

AB - Accurate brain disease diagnosis based on radiological images is desired in clinical practice as it can facilitate early intervention and reduce the risk of damage. However, existing unimodal image-based models struggle to process high-dimensional 3D brain imaging data effectively. Multimodal disease diagnosis approaches based on medical images and corresponding radiological reports achieved promising progress with the development of vision-language models. However, most multimodal methods handle 2D images and cannot be directly applied to brain disease diagnosis that uses 3D images. Therefore, in this work we develop a multimodal brain disease diagnosis model that takes 3D brain images and their radiological reports as input. Motivated by the fact that radiologists scroll through image slices and write important descriptions into the report accordingly, we propose a slice-description cross-modality interaction mechanism to realize fine-grained multimodal data interaction. Moreover, since previous medical research has demonstrated potential correlation between anatomical location of anomalies and diagnosis results, we further explore the use of brain anatomical prior knowledge to improve the multimodal interaction. Based on the report description, the prior knowledge filters the image information by suppressing irrelevant regions and enhancing relevant slices. Our method was validated with two brain disease diagnosis tasks. The results indicate that our model outperforms competing unimodal and multimodal methods for brain disease diagnosis. In particular, it has yielded an average accuracy improvement of 15.87% and 7.39% compared with the image-based and multimodal competing methods, respectively.

KW - 3D brain image

KW - Anatomical prior

KW - Brain disease

KW - Multimodal diagnosis

KW - Radiological report

UR - http://www.scopus.com/pages/publications/105003740793

U2 - 10.1016/j.compmedimag.2025.102556

DO - 10.1016/j.compmedimag.2025.102556

M3 - Article

AN - SCOPUS:105003740793

SN - 0895-6111

VL - 123

JO - Computerized Medical Imaging and Graphics

JF - Computerized Medical Imaging and Graphics

M1 - 102556

ER -
