TY - JOUR
T1 - MedKit
T2 - Multi-level feature distillation with knowledge injection for radiology report generation
AU - Su, Zhaoli
AU - Song, Hong
AU - Lin, Yucong
AU - Wu, You
AU - Weng, Xutao
AU - Mao, Zhongxuan
AU - Liu, Bowen
AU - Yin, Hongxia
AU - Yang, Jian
N1 - Publisher Copyright: © 2025
PY - 2026/1/15
Y1 - 2026/1/15
AB - Radiology report generation automates the creation of clinically accurate and coherent paragraphs from medical images, reducing the heavy burden of report writing for radiologists. However, current research in this field still faces limitations and requires advances in feature extraction of image knowledge and in model fusion. In this paper, we propose a radiology report generation framework, MedKit, that integrates high information density knowledge fusion with multi-level task feature distillation. We fuse knowledge embeddings through a knowledge graph to reduce semantic hallucinations. Additionally, feature extraction within a multi-level task feature distillation architecture provides comprehensive image feature information for the primary task. To handle both 2D and 3D images, we propose a separate visual encoder for each, which addresses the issue of inconsistent shapes in medical images. Finally, a multimodal large model framework enables the generated radiology reports to closely approximate the fluent expression of medical experts. Our proposed model significantly outperformed the state-of-the-art model on the MIMIC-CXR dataset, with a 20.1% relative increase in the BLEU-4 score, from 0.134 to 0.161. We also achieved the best result on the private Liver-CT dataset. Our code is available at http://github.com/sujaly/MedKit.
KW - High information density knowledge injection
KW - Knowledge distillation
KW - Large language model
KW - Radiology report generation
UR - http://www.scopus.com/pages/publications/105011509850
U2 - 10.1016/j.eswa.2025.129003
DO - 10.1016/j.eswa.2025.129003
M3 - Article
AN - SCOPUS:105011509850
SN - 0957-4174
VL - 296
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 129003
ER -