Special Issue: ICCAS 2024

International Journal of Control, Automation, and Systems 2025; 23(2): 418-428

Published online February 1, 2025. https://doi.org/10.1007/s12555-024-0487-y

© The International Journal of Control, Automation, and Systems

Leveraging Spatial Attention and Edge Context for Optimized Feature Selection in Visual Localization

Nanda Febri Istighfarin and HyungGi Jo*

Jeonbuk National University

Abstract

Visual localization determines an agent’s precise position and orientation within an environment using visual data. It has become a critical task in robotics, particularly in applications such as autonomous navigation, largely because an agent’s pose can be estimated with cost-effective sensors such as RGB cameras. Recent visual localization methods employ scene coordinate regression to determine the agent’s pose. However, these methods face challenges because they regress 2D-3D correspondences across the entire image, even though not all regions provide useful information. To address this issue, we introduce an attention network that selectively targets informative regions of the image. Using this network, we identify the highest-scoring features to improve the feature selection process and combine the result with edge detection. This integration ensures that the features chosen for the training buffer lie within robust regions, thereby improving 2D-3D correspondences and overall localization performance. Our approach was tested on an outdoor benchmark dataset, demonstrating superior results compared to previous methods.

Keywords: Attention network, computer vision, edge detector, scene coordinate regression, visual localization.
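
As summarized in the abstract, features enter the training buffer only when they both score highly under the spatial attention map and lie on edge structure. The sketch below illustrates that selection step; it is a minimal illustration, not the authors' implementation. The function name, the top-scoring fraction, the use of an OpenCV Canny detector as the edge extractor, and the assumption that the attention map and image share the same resolution are all our own illustrative choices.

    # Minimal sketch (not the paper's code): keep feature locations that score
    # highly under a spatial attention map AND fall on image edges.
    import cv2
    import numpy as np

    def select_buffer_features(features, attention, image_gray,
                               top_fraction=0.2, canny_lo=100, canny_hi=200):
        """features:   (H, W, C) dense feature map
           attention:  (H, W) spatial attention scores
           image_gray: (H, W) uint8 grayscale image aligned with the feature map
           Returns (ys, xs, kept_features) for the retained locations."""
        # Edge context: binary mask of high-gradient (structurally robust) regions.
        edges = cv2.Canny(image_gray, canny_lo, canny_hi) > 0
        # Keep only the top-scoring fraction of attention responses.
        threshold = np.quantile(attention, 1.0 - top_fraction)
        high_attention = attention >= threshold
        # Intersection: informative AND robust locations feed the training buffer.
        keep = high_attention & edges
        ys, xs = np.nonzero(keep)
        return ys, xs, features[ys, xs]

    # Toy usage with random data, purely to show the shapes involved.
    H, W, C = 60, 80, 128
    feats = np.random.randn(H, W, C).astype(np.float32)
    attn = np.random.rand(H, W).astype(np.float32)
    gray = (np.random.rand(H, W) * 255).astype(np.uint8)
    ys, xs, kept = select_buffer_features(feats, attn, gray)
    print(kept.shape)  # (N, 128): features that would enter the training buffer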

