In this paper we present a deep-learning-based framework for direct camera pose regression and refinement using RGB information only. The framework regresses the camera pose and also offers a purely RGB-based solution for camera pose refinement. Building on recent camera pose regression methods, we investigate the effect of adversarial training on convolutional neural networks (CNNs) trained for camera re-localization, with the goal of better learning the geometric connection between a camera pose and its corresponding RGB image. Similar to Generative Adversarial Networks (GANs), in addition to a camera pose regressor that maps images to poses, we propose to train a discriminator that distinguishes between regressed and ground truth poses. This pose discriminator is conditioned on features extracted from the respective input image, so that it implicitly models the relationship between the image and its ground truth or regressed pose. Once learned, it can be used to update the predicted camera poses and improve localization accuracy.
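The abstract does not state how the learned discriminator updates a predicted pose. As an illustration only, the following minimal sketch assumes one plausible mechanism: gradient ascent on the discriminator score with respect to the pose, with the image features held fixed. All names (`discriminator_score`, `score_gradient`, `refine_pose`) are hypothetical, the closed-form "discriminator" is a toy stand-in for a trained network, and the pose is reduced to a 3-D translation for brevity.

```python
import math


def discriminator_score(features, pose, weights):
    """Toy conditional discriminator (an assumption, not the paper's network).

    Returns a score in (0, 1); higher means the pose looks more like a
    ground truth pose for the given image features. The assumed form is
    sigmoid(1 - ||pose - weights @ features||^2).
    """
    expected = [sum(w * f for w, f in zip(row, features)) for row in weights]
    sq_dist = sum((p - e) ** 2 for p, e in zip(pose, expected))
    return 1.0 / (1.0 + math.exp(sq_dist - 1.0))


def score_gradient(features, pose, weights, eps=1e-5):
    """Central finite-difference gradient of the score w.r.t. the pose."""
    grad = []
    for i in range(len(pose)):
        up = list(pose)
        down = list(pose)
        up[i] += eps
        down[i] -= eps
        diff = (discriminator_score(features, up, weights)
                - discriminator_score(features, down, weights))
        grad.append(diff / (2.0 * eps))
    return grad


def refine_pose(features, pose, weights, step=0.5, iterations=50):
    """Refine a regressed pose by gradient ascent on the discriminator score.

    The image features stay fixed; only the pose is updated.
    """
    pose = list(pose)
    for _ in range(iterations):
        grad = score_gradient(features, pose, weights)
        pose = [p + step * g for p, g in zip(pose, grad)]
    return pose


# Hypothetical image features, discriminator weights, and regressed pose.
features = [1.0, 0.5]
weights = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
regressed = [1.4, 0.7, 1.9]

refined = refine_pose(features, regressed, weights)
print("score before:", discriminator_score(features, regressed, weights))
print("score after: ", discriminator_score(features, refined, weights))
```

In a real system the discriminator would be a trained network and the gradient would come from automatic differentiation rather than finite differences. A full 6-DoF pose would also need its rotation parameters handled, for example by renormalizing a quaternion after each update.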