ConGeo: Robust Cross-view Geo-localization across Ground View Variations
arXiv (2024)
Abstract
Cross-view geo-localization aims at localizing a ground-level query image by
matching it to its corresponding geo-referenced aerial view. In real-world
scenarios, the task requires accommodating diverse ground images captured by
users with varying orientations and reduced fields of view (FoVs). However,
existing learning pipelines are orientation-specific or FoV-specific, demanding
separate model training for different ground view variations. Such models
heavily depend on the North-aligned spatial correspondence and predefined FoVs
in the training data, compromising their robustness across different settings.
To tackle this challenge, we propose ConGeo, a single- and cross-modal
Contrastive method for Geo-localization: it enhances robustness and consistency
in feature representations to improve a model's invariance to orientation and
its resilience to FoV variations, by enforcing proximity between ground view
variations of the same location. As a generic learning objective for cross-view
geo-localization, when integrated into state-of-the-art pipelines, ConGeo
significantly boosts the performance of three base models on four
geo-localization benchmarks for diverse ground view variations and outperforms
competing methods that train separate models for each ground view variation.
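
The idea of enforcing proximity between ground view variations of the same location can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an InfoNCE-style contrastive loss, models an orientation change as a circular column shift of a 360-degree panorama, and models a reduced FoV as a crop of its width. The names `vary_ground_view` and `info_nce` and the temperature value are hypothetical, chosen only for illustration.

```python
import math

def vary_ground_view(pano, shift_px, fov_deg):
    """Rotate a 360-degree panorama (a list of pixel rows) and crop it to fov_deg."""
    width = len(pano[0])
    shift = shift_px % width
    keep = max(1, round(width * fov_deg / 360))
    # A circular column shift models a change of camera heading.
    rotated = [row[-shift:] + row[:-shift] if shift else list(row) for row in pano]
    # Keeping only the first columns models a limited field of view.
    return [row[:keep] for row in rotated]

def _normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def _cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class of row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        top = max(row)  # subtract the max for numerical stability
        log_z = top + math.log(sum(math.exp(x - top) for x in row))
        total += log_z - row[i]
    return total / len(logits)

def info_nce(feats_a, feats_b, temperature=0.1):
    """Symmetric InfoNCE: feats_a[i] and feats_b[i] describe the same location."""
    a = [_normalize(v) for v in feats_a]
    b = [_normalize(v) for v in feats_b]
    logits = [[sum(x * y for x, y in zip(u, v)) / temperature for v in b] for u in a]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (_cross_entropy_diag(logits) + _cross_entropy_diag(transposed))
```

Under these assumptions, a single-modal term would compare features of the original ground images with features of their rotated or cropped variants, and a cross-modal term would apply the same loss to ground and aerial features. In both cases the loss is low when matching rows are each other's nearest neighbours in feature space.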