Cross-Modal Analysis of Human Detection for Robotics: An Industrial Case Study

Advances in sensing and learning algorithms have led to increasingly mature solutions for human detection by robots, particularly in selected use-cases such as pedestrian detection for self-driving cars or close-range person detection in consumer settings. Despite this progress, the simple question "which sensor-algorithm combination is best suited for the person detection task at hand?" remains hard to answer. In this paper, we tackle this issue by conducting a systematic cross-modal analysis of sensor-algorithm combinations typically used in robotics. We compare the performance of state-of-the-art person detectors for 2D range data, 3D lidar, and RGB-D data, as well as selected combinations thereof, in a challenging industrial use-case. We further address two related problems: data scarcity in the industrial target domain, and the fact that recent research on human detection in 3D point clouds has mostly focused on autonomous driving scenarios. To leverage these methodological advances for robotics applications, we utilize a simple yet effective multi-sensor transfer learning strategy: we extend a strong image-based RGB-D detector to provide cross-modal supervision for lidar detectors in the form of weak 3D bounding box labels. Our results show a large variance among the different approaches in terms of detection performance, generalization, frame rates, and computational requirements. As our use-case contains difficulties representative of a wide range of service robot applications, we believe that these results point to relevant open challenges for further research and provide valuable support to practitioners in the design of their robot systems.
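To illustrate the idea of cross-modal supervision with weak 3D bounding box labels, the following is a minimal sketch of how a 2D person detection on an RGB-D frame could be lifted to a coarse 3D box. It is not the method of the paper, whose details are not given in the abstract. The pinhole camera model, the use of the median depth inside the 2D box, the fixed person dimensions, and all names (`Intrinsics`, `WeakBox3D`, `weak_box_from_rgbd`) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from statistics import median
from typing import Optional, Sequence, Tuple


@dataclass
class Intrinsics:
    """Pinhole camera intrinsics in pixels (assumed camera model)."""
    fx: float
    fy: float
    cx: float
    cy: float


@dataclass
class WeakBox3D:
    """Coarse 3D box in the camera frame, all values in metres."""
    center: Tuple[float, float, float]
    size: Tuple[float, float, float]  # width, height, depth


def weak_box_from_rgbd(
    box_2d: Tuple[int, int, int, int],
    depth: Sequence[Sequence[float]],
    K: Intrinsics,
    size: Tuple[float, float, float] = (0.6, 1.8, 0.6),  # assumed person extent
) -> Optional[WeakBox3D]:
    """Lift a 2D detection (u_min, v_min, u_max, v_max) to a weak 3D box.

    `depth` is a row-major depth image in metres; values <= 0 are invalid.
    Returns None if the 2D box contains no valid depth reading.
    """
    u_min, v_min, u_max, v_max = box_2d
    # Collect valid depth readings inside the 2D box (max bounds exclusive).
    valid = [
        depth[v][u]
        for v in range(v_min, v_max)
        for u in range(u_min, u_max)
        if depth[v][u] > 0.0
    ]
    if not valid:
        return None
    # The median is robust to background pixels that leak into the box.
    z = median(valid)
    # Back-project the centre of the 2D box with the pinhole model.
    u_c = 0.5 * (u_min + u_max)
    v_c = 0.5 * (v_min + v_max)
    x = (u_c - K.cx) * z / K.fx
    y = (v_c - K.cy) * z / K.fy
    return WeakBox3D(center=(x, y, z), size=size)
```

In such a setup, boxes of this kind would be transformed from the camera frame into the lidar frame using the extrinsic calibration and then used as training targets for a lidar detector. Note that the median depth measures the visible surface of the person, so the box centre in this sketch is biased towards the camera; this is one reason such labels are only "weak".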
