Place recognition is a critical component of autonomous vehicles and robotics, enabling global localization in GPS-denied environments. Recent advances have spurred significant interest in multimodal place recognition (MPR), which leverages the complementary strengths of multiple modalities. Despite its potential, most existing MPR methods still face three key challenges: (1) dynamically adapting to diverse modality inputs within a unified framework, (2) maintaining robustness under missing or degraded modalities, and (3) generalizing across diverse sensor configurations and setups. In this paper, we propose UniMPR, a unified framework for multimodal place recognition. Using only one trained model, it seamlessly adapts to any combination of common perceptual modalities (e.g., camera, LiDAR, radar). To tackle data heterogeneity, we unify all inputs within a polar BEV feature space. The polar BEVs are then fed into a multi-branch network that extracts discriminative intra-modal and inter-modal features from any modality combination. To fully exploit the network's generalization capability and robustness, we construct a large-scale training set from multiple datasets and introduce an adaptive label assignment strategy for extensive pre-training. Experiments on seven datasets demonstrate that UniMPR achieves state-of-the-art performance under varying sensor configurations, modality combinations, and environmental conditions. Our code will be released at https://github.com/QiZS-BIT/UniMPR.
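The polar-BEV unification step can be sketched as follows. This is a minimal, hypothetical illustration of projecting a 3D point cloud into a polar bird's-eye-view grid (range rings × azimuth sectors, storing the maximum height per cell); the function name, grid resolution, and cell encoding are assumptions for illustration, not the UniMPR implementation.

```python
import numpy as np

def polar_bev(points, num_rings=64, num_sectors=128, max_range=80.0):
    """Bin 3D points (N, 3) into a polar BEV grid of shape
    (num_rings, num_sectors), keeping the max z per cell.
    Hypothetical sketch, not the paper's actual encoding."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x**2 + y**2)
    theta = np.arctan2(y, x)  # azimuth in (-pi, pi]

    # Discard points beyond the grid's maximum range.
    mask = r < max_range
    r, theta, z = r[mask], theta[mask], z[mask]

    # Discretize range into rings and azimuth into sectors.
    ring = (r / max_range * num_rings).astype(int)
    sector = ((theta + np.pi) / (2 * np.pi) * num_sectors).astype(int) % num_sectors

    # Scatter-max the heights into the grid; empty cells become 0.
    bev = np.full((num_rings, num_sectors), -np.inf)
    np.maximum.at(bev, (ring, sector), z)
    bev[np.isinf(bev)] = 0.0
    return bev
```

A grid like this makes heterogeneous sensors comparable: camera, LiDAR, and radar features can each be rasterized into the same (rings × sectors) layout before fusion.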