Street scenes object detection based on infrared images and improved YOLOv5 network

Ailing Tan, Xiaohang Li, Yong Zhao*, Meijing Gao

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

To address the difficulties of object detection in infrared street scene images, such as low resolution and large disparities in target feature scale, we propose an MD-YOLOv5 network that improves the detection accuracy of bicycles, cars, and pedestrians. Building on coordinate attention, we design a multiscale coordinate attention module that extracts multiscale spatial features and channel features simultaneously through pooling at different scales. A dense-C3 structure based on dense connections is introduced into the YOLOv5 backbone network to strengthen feature transmission. Experimental results on the internationally available FLIR dataset show that MD-YOLOv5 reaches an mAP@0.5 of 80.1% and an mAP@0.5:0.95 of 41.2%. Compared with SSD, YOLOv4, YOLOv5, YOLOv8, and YOLOv11, the accuracy of the proposed MD-YOLOv5 is higher by 16.98%, 13.3%, 2.7%, 2.5%, and 2.3%, respectively. The proposed object detection method, based on multiscale coordinate attention and the dense-C3 structure, offers a new approach to object detection in infrared images.
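
For readers unfamiliar with the reported metrics, the following is a minimal sketch of the intersection-over-union (IoU) test that underlies mAP@0.5 and mAP@0.5:0.95. It assumes the common COCO-style convention, in which a prediction counts as a match when its IoU with a ground-truth box meets a threshold, and mAP@0.5:0.95 averages over thresholds from 0.5 to 0.95 in steps of 0.05. The abstract does not describe the authors' evaluation code, so the function names and the box format (x1, y1, x2, y2) here are illustrative assumptions only.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, truth_box):
    """Hypothetical helper: thresholds at which the prediction counts as a match."""
    score = iou(pred_box, truth_box)
    return [t for t in IOU_THRESHOLDS if score >= t]
```

A prediction that overlaps its ground-truth box only loosely can pass at 0.5 while failing the stricter thresholds. This is one reason an mAP@0.5:0.95 value is typically much lower than the mAP@0.5 value for the same model.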

Original language: English
Article number: 033048
Journal: Journal of Electronic Imaging
Volume: 34
Issue number: 3
Publication status: Published - 1 May 2025
Externally published: Yes

Keywords

  • dense-C3
  • infrared images
  • multiscale coordinate attention
  • object detection
  • YOLOv5
