Improving multi-view aggregation is integral to multi-view pedestrian detection, which aims to obtain a bird's-eye-view pedestrian occupancy map from images captured by a set of calibrated cameras. Inspired by the success of attention modules in deep neural networks, we first propose a Homography Attention Module (HAM), which is shown to boost the performance of existing end-to-end multi-view detection approaches by utilizing a novel channel gate and spatial gate. Additionally, we propose Booster-SHOT, an end-to-end convolutional approach to multi-view pedestrian detection that incorporates our proposed HAM as well as elements from previous approaches, such as view-coherent augmentation or stacked homography transformations. Booster-SHOT achieves 92.9% and 94.2% MODA on Wildtrack and MultiviewX, respectively, outperforming the state of the art by 1.4% on Wildtrack and 0.5% on MultiviewX, and achieves state-of-the-art performance overall on the standard evaluation metrics used in multi-view pedestrian detection.
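For readers unfamiliar with the headline metric, the following is a minimal sketch of how MODA (Multiple Object Detection Accuracy) is commonly computed from detection counts, assuming the usual CLEAR-style definition with unit costs for misses and false positives. The function name `moda` and the counts in the usage line are hypothetical illustrations, not values or code from this work.

```python
def moda(false_negatives: int, false_positives: int, num_ground_truth: int) -> float:
    """MODA = 1 - (FN + FP) / N_gt, assuming unit costs for both error types.

    Hypothetical helper for illustration only; counts are summed over all
    evaluated frames before the ratio is taken.
    """
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    return 1.0 - (false_negatives + false_positives) / num_ground_truth


# Made-up counts chosen only to show the arithmetic: 71 errors over 1000 targets.
example_score = moda(false_negatives=50, false_positives=21, num_ground_truth=1000)
print(round(example_score, 3))  # 0.929
```

Because false positives are penalized as well as misses, MODA can fall below zero when a detector produces many spurious detections.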