A Latent Diffusion Model for Heart Sound Synthesis

Yang Tan, Haojie Zhang, Yi Chang, Qingrong Wu, Kun Qian*, Bin Hu*, Bjorn W. Schuller, Yoshiharu Yamamoto

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

Abstract

Heart sound analysis is already widely used to develop diagnostic systems for heart conditions. However, existing heart sound databases are too small to train a large number of deep learning models, and they are imbalanced: normal heart sounds consistently outnumber abnormal ones. We are therefore interested in algorithms that generate heart sounds, since they can augment existing databases. We enter the field of large-scale modelling in medical synthesis by proposing a latent diffusion model for heart sound generation that can produce highly realistic heart sounds. We further guide the synthesis process with text prompts and labels, opening up the research field of prompted heart sound synthesis. In our experiments, the proposed method achieves good results compared with existing mathematical models. Moreover, when the generated audio is used as an unlabelled dataset in semi-supervised learning, it improves the classification model. This indicates that heart sounds generated by the diffusion model are of considerable value.

Original language: English
Title of host publication: 2025 4th International Symposium on Computer Applications and Information Technology, ISCAIT 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1314-1319
Number of pages: 6
ISBN (electronic): 9798331542856
Publication status: Published - 2025
Externally published
Event: 4th International Symposium on Computer Applications and Information Technology, ISCAIT 2025 - Xi'an, China
Duration: 21 Mar 2025 to 23 Mar 2025

Publication series

Name: 2025 4th International Symposium on Computer Applications and Information Technology, ISCAIT 2025

Conference

Conference: 4th International Symposium on Computer Applications and Information Technology, ISCAIT 2025
Country/Territory: China
City: Xi'an
Period: 21/03/25 to 23/03/25
