Pedestrian detection plays an important role in many applications, such as autonomous driving. We propose a method that exploits semantic segmentation results as self-attention cues to significantly improve pedestrian detection performance. Specifically, a multi-task network is designed to jointly learn semantic segmentation and pedestrian detection from image datasets with weak box-wise annotations. The semantic segmentation feature maps are concatenated with the corresponding convolutional feature maps to provide more discriminative features for pedestrian detection and pedestrian classification. By jointly learning segmentation and detection, the proposed pedestrian self-attention mechanism can effectively identify pedestrian regions and suppress backgrounds. In addition, we propose incorporating semantic attention information from multi-scale layers into a deep convolutional neural network to boost pedestrian detection. Experimental results show that the proposed method achieves the best detection performance, with a miss rate (MR) of 6.27% on the Caltech dataset, and obtains competitive performance on the CityPersons dataset while maintaining high computational efficiency.
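The feature fusion step mentioned in the abstract, concatenating segmentation feature maps with convolutional feature maps, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `fuse_with_segmentation`, the channel counts, and the spatial size are all hypothetical, and random arrays stand in for real network activations.

```python
import numpy as np


def fuse_with_segmentation(conv_feats, seg_feats):
    """Concatenate segmentation maps with convolutional features along channels.

    Both inputs are (channels, height, width) arrays and must share the same
    spatial size. The name and layout are assumptions made for illustration.
    """
    if conv_feats.shape[1:] != seg_feats.shape[1:]:
        raise ValueError("spatial sizes must match before concatenation")
    return np.concatenate([conv_feats, seg_feats], axis=0)


# Hypothetical sizes: 256 convolutional channels and 2 segmentation channels
# (pedestrian / background) on a 40 x 30 feature grid.
conv_feats = np.random.rand(256, 40, 30).astype(np.float32)
seg_feats = np.random.rand(2, 40, 30).astype(np.float32)

fused = fuse_with_segmentation(conv_feats, seg_feats)
print(fused.shape)  # (258, 40, 30)
```

In a design of this kind, the detection and classification heads would consume the fused tensor, so the segmentation channels can act as a spatial cue for where pedestrians are likely to appear.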