Regular Papers

International Journal of Control, Automation, and Systems 2024; 22(9): 2899-2908

https://doi.org/10.1007/s12555-024-0089-8

© The International Journal of Control, Automation, and Systems

CSI-Net: CNN Swin Transformer Integrated Network for Infrared Small Target Detection

Lammi Choi, Won Young Chung, and Chan Gook Park*

Seoul National University

Abstract

In infrared (IR) small target detection, accurately pinpointing blurry, low-contrast targets is highly challenging because of the intricate features of IR images. To address this, we introduce CSI-Net, a novel network architecture that integrates a CNN with a swin transformer. CSI-Net uses a hybrid encoder that combines the encoder-decoder layout of UNet with a swin transformer running in parallel with the CNN. This combination lets the network capture both local features and long-distance dependencies, improving its ability to identify small targets accurately. By exploiting the hierarchical features of the swin transformer, CSI-Net captures contextual information that is crucial for small target detection. Moreover, CSI-Net employs full-scale skip connections across the encoder-decoder and decoder-decoder paths, integrating multiscale CNN and swin transformer features to improve gradient propagation. Experimental results validate the superiority of the proposed method over traditional CNN and Transformer methods. On NUAA-SIRST, the mIoU (0.7483), detection probability (0.9734), and false alarm rate (0.101 × 10⁻⁵) demonstrate significant improvement. Similarly, on NUDT-SIRST, the mIoU (0.8887), detection probability (0.9894), and false alarm rate (0.431 × 10⁻⁵) show notable enhancement. The performance of the network scales with dataset size, and its robustness is confirmed by the area under the ROC curve (AUC). Additionally, an ablation study validates the efficacy of the hybrid encoder: comparing the network with and without the parallel swin transformer module (PSM) shows that the module improves small target detection performance. The comprehensive evaluation shows that the swin transformer-enhanced UNet architecture effectively addresses the challenges of IR small target detection.

Keywords: Hybrid encoder, infrared (IR) image, small target detection, swin transformer, UNet.
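The abstract reports three metrics: mIoU, detection probability, and false alarm rate. The sketch below shows one common way such quantities are computed for a single binary prediction mask. It is an illustration under stated assumptions, not the authors' evaluation code: the function names `connected_components` and `evaluate` are hypothetical, a target is counted as detected if any predicted pixel overlaps it (published protocols often use a centroid-distance threshold instead), and the false alarm rate is taken as false-positive pixels divided by all pixels.

```python
from collections import deque


def connected_components(mask):
    """Return the 8-connected foreground regions of a binary mask as pixel sets."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            pixels = set()
            while queue:
                y, x = queue.popleft()
                pixels.add((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def evaluate(pred, truth):
    """Return (IoU, detection probability, false alarm rate) for one image.

    Assumed definitions (the paper's exact protocol may differ):
      IoU = |pred AND truth| / |pred OR truth|          (pixel level)
      Pd  = detected targets / ground-truth targets     (target level)
      Fa  = false-positive pixels / all pixels          (pixel level)
    """
    h, w = len(truth), len(truth[0])
    pred_px = {(r, c) for r in range(h) for c in range(w) if pred[r][c]}
    truth_px = {(r, c) for r in range(h) for c in range(w) if truth[r][c]}

    union = len(pred_px | truth_px)
    iou = len(pred_px & truth_px) / union if union else 1.0

    targets = connected_components(truth)
    # Assumption: a target is detected if any predicted pixel overlaps it.
    detected = sum(1 for target in targets if target & pred_px)
    pd = detected / len(targets) if targets else 1.0

    fa = len(pred_px - truth_px) / (h * w)
    return iou, pd, fa
```

With per-image values like these, a dataset-level mIoU could be obtained by averaging IoU over images or by accumulating intersections and unions across the dataset; which convention the paper uses is not stated in the abstract.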

Published online September 1, 2024

IJCAS
September 2024

Vol. 22, No. 9, pp. 2673~2953

IJCAS

eISSN 2005-4092
pISSN 1598-6446