MedKit: Multi-level feature distillation with knowledge injection for radiology report generation

Zhaoli Su; Hong Song; Yucong Lin; You Wu; Xutao Weng; Zhongxuan Mao; Bowen Liu; Hongxia Yin; Jian Yang

doi:10.1016/j.eswa.2025.129003

Zhaoli Su, Hong Song^*, Yucong Lin, You Wu, Xutao Weng, Zhongxuan Mao, Bowen Liu, Hongxia Yin, Jian Yang

^*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Radiology report generation automates the creation of clinically accurate and coherent paragraphs from medical images, reducing the heavy burden of report writing for radiologists. However, current research in this field still faces limitations and urgently requires breakthroughs in feature extraction of image knowledge and model fusion. In this paper, we propose a radiology report generation framework, MedKit, that integrates high information density knowledge fusion with multi-level task feature distillation. We leverage knowledge embedding fusion through a knowledge graph to reduce semantic hallucinations. Additionally, by employing feature extraction techniques within a multi-level task feature distillation architecture, comprehensive image feature information is provided for the primary task. For adapting 2D and 3D images, we propose different visual encoders respectively, which address the issue of inconsistent shapes in medical images. Finally, utilizing a multimodal large model framework enables the generated radiology report to closely approximate medical experts’ fluent expression. Our proposed model significantly outperformed the state-of-the-art model in the MIMIC-CXR dataset with a 20.1 % increase in the BLEU-4 score, from 0.134 to 0.161. We also achieved the best result on the private Liver-CT dataset. Our code is available at http://github.com/sujaly/MedKit.

Original language	English
Article number	129003
Journal	Expert Systems with Applications
Volume	296
DOI	http://doi.org/10.1016/j.eswa.2025.129003
Publication status	Published - 15 Jan 2026
Externally published	Yes

Cite this

@article{29305dc0eab841cea520528c4492b14f,

title = "{MedKit}: Multi-level feature distillation with knowledge injection for radiology report generation",

abstract = "Radiology report generation automates the creation of clinically accurate and coherent paragraphs from medical images, reducing the heavy burden of report writing for radiologists. However, current research in this field still faces limitations and urgently requires breakthroughs in feature extraction of image knowledge and model fusion. In this paper, we propose a radiology report generation framework, MedKit, that integrates high information density knowledge fusion with multi-level task feature distillation. We leverage knowledge embedding fusion through a knowledge graph to reduce semantic hallucinations. Additionally, by employing feature extraction techniques within a multi-level task feature distillation architecture, comprehensive image feature information is provided for the primary task. For adapting 2D and 3D images, we propose different visual encoders respectively, which address the issue of inconsistent shapes in medical images. Finally, utilizing a multimodal large model framework enables the generated radiology report to closely approximate medical experts{\textquoteright} fluent expression. Our proposed model significantly outperformed the state-of-the-art model in the MIMIC-CXR dataset with a 20.1 \% increase in the BLEU-4 score, from 0.134 to 0.161. We also achieved the best result on the private Liver-CT dataset. Our code is available at http://github.com/sujaly/MedKit.",

keywords = "High information density knowledge injection, Knowledge distillation, Large language model, Radiology report generation",

author = "Zhaoli Su and Hong Song and Yucong Lin and You Wu and Xutao Weng and Zhongxuan Mao and Bowen Liu and Hongxia Yin and Jian Yang",

note = "Publisher Copyright: {\textcopyright} 2025",

year = "2026",

month = jan,

day = "15",

doi = "10.1016/j.eswa.2025.129003",

language = "English",

volume = "296",

journal = "Expert Systems with Applications",

issn = "0957-4174",

publisher = "Elsevier Ltd.",

}

TY - JOUR

T1 - MedKit

T2 - Multi-level feature distillation with knowledge injection for radiology report generation

AU - Su, Zhaoli

AU - Song, Hong

AU - Lin, Yucong

AU - Wu, You

AU - Weng, Xutao

AU - Mao, Zhongxuan

AU - Liu, Bowen

AU - Yin, Hongxia

AU - Yang, Jian

PY - 2026/1/15

Y1 - 2026/1/15

AB - Radiology report generation automates the creation of clinically accurate and coherent paragraphs from medical images, reducing the heavy burden of report writing for radiologists. However, current research in this field still faces limitations and urgently requires breakthroughs in feature extraction of image knowledge and model fusion. In this paper, we propose a radiology report generation framework, MedKit, that integrates high information density knowledge fusion with multi-level task feature distillation. We leverage knowledge embedding fusion through a knowledge graph to reduce semantic hallucinations. Additionally, by employing feature extraction techniques within a multi-level task feature distillation architecture, comprehensive image feature information is provided for the primary task. For adapting 2D and 3D images, we propose different visual encoders respectively, which address the issue of inconsistent shapes in medical images. Finally, utilizing a multimodal large model framework enables the generated radiology report to closely approximate medical experts’ fluent expression. Our proposed model significantly outperformed the state-of-the-art model in the MIMIC-CXR dataset with a 20.1 % increase in the BLEU-4 score, from 0.134 to 0.161. We also achieved the best result on the private Liver-CT dataset. Our code is available at http://github.com/sujaly/MedKit.

KW - High information density knowledge injection

KW - Knowledge distillation

KW - Large language model

KW - Radiology report generation

UR - http://www.scopus.com/pages/publications/105011509850

U2 - 10.1016/j.eswa.2025.129003

DO - 10.1016/j.eswa.2025.129003

M3 - Article

AN - SCOPUS:105011509850

SN - 0957-4174

VL - 296

JO - Expert Systems with Applications

JF - Expert Systems with Applications

M1 - 129003

ER -
