MIHNet: Multi-input hierarchical infrared image super-resolution method via collaborative CNN and Transformer

Yang Bai, Meijing Gao*, Huanyu Sun, Sibo Chen, Yunjia Xie, Yonghao Yan, Xiangrui Fan

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Due to the low spatial resolution of infrared imaging systems, the acquired images typically suffer from low contrast, insufficient detail, and blurred edges. To address this issue, this paper proposes a multi-input hierarchical infrared image super-resolution reconstruction method based on a collaborative CNN and Transformer, termed MIHNet. The network adopts a multi-input encoder–decoder structure as its framework. First, a Local–Global Feature Perception Module (LGFPM) is designed, consisting of a Local Texture Attention Unit (LTAU) and a Global Transformer Attention Unit (GTAU), to simultaneously enhance the reconstruction of local detail and global structure in infrared images. Second, a Feature Refinement Module (FRM) is constructed to enhance the encoded feature representation. Then, a Multi-level Feature Fusion (MFF) module is designed to adaptively fuse features from the encoding stage. Finally, a mixed loss function composed of pixel loss, structure loss, and texture loss is constructed to guide network optimization. Experiments on three public datasets demonstrate that the proposed method outperforms thirteen comparison algorithms in both subjective and objective evaluations. Furthermore, the method has been verified on the downstream task of infrared and visible image fusion, which further demonstrates that MIHNet achieves good super-resolution (SR) reconstruction.
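The abstract's mixed loss combines pixel, structure, and texture terms. The paper's exact formulation is not given here, so the following is a minimal NumPy sketch under common assumptions: L1 for the pixel term, a simplified single-window (1 − SSIM) for the structure term, an L1 distance between finite-difference gradients for the texture term, and illustrative placeholder weights.

```python
import numpy as np

def pixel_loss(sr, hr):
    # Mean absolute error between the super-resolved and ground-truth images.
    return np.mean(np.abs(sr - hr))

def structure_loss(sr, hr, c1=0.01**2, c2=0.03**2):
    # 1 - SSIM, computed globally over the whole image
    # (a simplified variant; the paper may use windowed SSIM).
    mu_x, mu_y = sr.mean(), hr.mean()
    var_x, var_y = sr.var(), hr.var()
    cov = ((sr - mu_x) * (hr - mu_y)).mean()
    ssim = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x**2 + mu_y**2 + c1) * (var_x + var_y + c2))
    return 1.0 - ssim

def texture_loss(sr, hr):
    # L1 distance between horizontal and vertical finite-difference gradients,
    # a common proxy for edge/texture fidelity.
    gx = np.abs(np.diff(sr, axis=1) - np.diff(hr, axis=1)).mean()
    gy = np.abs(np.diff(sr, axis=0) - np.diff(hr, axis=0)).mean()
    return gx + gy

def mixed_loss(sr, hr, w_pix=1.0, w_struct=0.1, w_tex=0.1):
    # Weighted sum of the three terms; the weights here are placeholders,
    # not the values used in the paper.
    return (w_pix * pixel_loss(sr, hr)
            + w_struct * structure_loss(sr, hr)
            + w_tex * texture_loss(sr, hr))

# A perfect reconstruction yields zero loss.
img = np.random.rand(64, 64)
print(mixed_loss(img, img))  # → 0.0
```

In practice such a loss would be written with differentiable tensor operations (e.g. in PyTorch) so gradients can flow back through the network; the NumPy version above only illustrates how the three terms combine.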

Original language: English
Article number: 106004
Journal: Infrared Physics and Technology
Volume: 150
DOI
Publication status: Published - Nov 2025
