Self-Supervised Depth Estimation with Isometric-Self-Sample-Based Learning

Handling dynamic regions in the photometric loss has been a central difficulty in self-supervised depth estimation. Most previous methods alleviate it by excluding dynamic regions from the photometric loss using masks estimated by a separate module, which makes it difficult to fully utilize the training images. In this paper, we propose an isometric self-sample-based learning (ISSL) method that fully utilizes the training images in a simple yet effective way. The proposed method provides additional supervision during training using self-generated images that comply with the pure static-scene assumption. Specifically, the isometric self-sample generator synthesizes self-samples for each training image by applying random rigid transformations to the estimated depth. Thus, the generated self-samples and the corresponding training image always satisfy the static-scene assumption. We show that plugging our ISSL module into several existing models consistently improves performance by a large margin. It also improves depth accuracy across different types of scenes, namely outdoor scenes (KITTI and Make3D) and indoor scenes (NYUv2), validating its effectiveness.

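The abstract does not spell out how a random rigid transformation is applied to the estimated depth, so the following is only a minimal sketch of the geometric idea, not the authors' implementation. It assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, and the function names and the sampling ranges `max_angle` and `max_shift` are illustrative choices. It is kept to the standard library and works on one pixel at a time; a practical version would vectorize over the whole image.

```python
import math
import random


def random_rigid_transform(rng, max_angle=0.05, max_shift=0.1):
    """Sample a small random rotation (Rodrigues formula) and translation.

    max_angle (radians) and max_shift (depth units) are illustrative
    assumptions, not values taken from the paper.
    """
    axis = [rng.gauss(0.0, 1.0) for _ in range(3)]
    norm = math.sqrt(sum(a * a for a in axis))
    x, y, z = (a / norm for a in axis)
    angle = rng.uniform(-max_angle, max_angle)
    c, s, k = math.cos(angle), math.sin(angle), 1.0 - math.cos(angle)
    R = [
        [c + x * x * k, x * y * k - z * s, x * z * k + y * s],
        [y * x * k + z * s, c + y * y * k, y * z * k - x * s],
        [z * x * k - y * s, z * y * k + x * s, c + z * z * k],
    ]
    t = [rng.uniform(-max_shift, max_shift) for _ in range(3)]
    return R, t


def reproject_pixel(u, v, depth, fx, fy, cx, cy, R, t):
    """Back-project pixel (u, v) with its depth, move it rigidly, project again."""
    # Back-projection with a pinhole camera model (assumed here).
    p = [(u - cx) / fx * depth, (v - cy) / fy * depth, depth]
    # Rigid motion: q = R p + t.
    q = [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]
    # Perspective projection into the synthesized view; q[2] is the new depth.
    return fx * q[0] / q[2] + cx, fy * q[1] / q[2] + cy, q[2]


if __name__ == "__main__":
    rng = random.Random(0)
    R, t = random_rigid_transform(rng)
    print(reproject_pixel(10.0, 20.0, 5.0, 100.0, 100.0, 32.0, 24.0, R, t))
```

Applying `reproject_pixel` to every pixel gives the correspondence between a training image and a synthesized view. Because that correspondence comes from a single rigid motion applied to the estimated depth, the resulting image pair is static by construction, which is the property the abstract relies on.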