Cross-Modal Collaborative Representation Learning and a Large-Scale RGBT Benchmark for Crowd Counting

From arXiv; accepted by CVPR 2021. Code and benchmark for RGBT crowd counting are released at http://lingboliu.com/RGBT_Crowd_Counting.html

Crowd counting is a fundamental yet challenging task that requires rich information to generate pixel-wise crowd density maps. However, most previous methods use only the limited information in RGB images and often fail to detect potential pedestrians in unconstrained scenarios. In this work, we find that incorporating optical and thermal information can greatly help to recognize pedestrians. To promote future research in this field, we introduce a large-scale RGBT Crowd Counting (RGBT-CC) benchmark, which contains 2,030 pairs of RGB-thermal images with 138,389 annotated people. Furthermore, to facilitate multimodal crowd counting, we propose a cross-modal collaborative representation learning framework, which consists of multiple modality-specific branches, a modality-shared branch, and an Information Aggregation-Distribution Module (IADM) to fully capture the complementary information of different modalities. Specifically, our IADM incorporates two collaborative information transfers to dynamically enhance the modality-shared and modality-specific representations with a dual information propagation mechanism. Extensive experiments conducted on the RGBT-CC benchmark demonstrate the effectiveness of our framework for RGBT crowd counting. Moreover, the proposed approach is universal for multimodal crowd counting and also achieves superior performance on the ShanghaiTechRGBD dataset. Finally, our source code and benchmark are released at http://lingboliu.com/RGBT_Crowd_Counting.html.
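
To make the notion of a pixel-wise crowd density map concrete, here is a minimal sketch of how point annotations can be turned into a density map whose sum equals the number of people. It follows a convention that is common in crowd counting (one unit-mass Gaussian per annotated head) and is not necessarily the exact procedure used for RGBT-CC. The function names, the kernel size and the sigma value are illustrative assumptions, not taken from the paper.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel normalized to sum to 1."""
    half = size // 2
    kernel = [
        [math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(sum(row) for row in kernel)
    return [[v / total for v in row] for row in kernel]


def density_map(height, width, heads, size=5, sigma=1.0):
    """Stamp one unit-mass kernel per annotated head position (row, col).

    Assumption for illustration: kernel mass that would fall outside the
    image is renormalized, so each in-bounds person contributes exactly 1.
    """
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    dmap = [[0.0] * width for _ in range(height)]
    for r, c in heads:
        # Collect the kernel cells that land inside the image.
        cells = []
        for dy in range(size):
            for dx in range(size):
                y, x = r + dy - half, c + dx - half
                if 0 <= y < height and 0 <= x < width:
                    cells.append((y, x, kernel[dy][dx]))
        if not cells:
            continue  # annotation lies entirely outside the image
        mass = sum(w for _, _, w in cells)
        for y, x, w in cells:
            dmap[y][x] += w / mass
    return dmap


def count_from_density(dmap):
    """The estimated crowd count is the sum over all pixels of the map."""
    return sum(sum(row) for row in dmap)


# Hypothetical annotations: three heads, one of them in the image corner.
heads = [(2, 3), (10, 10), (0, 0)]
dmap = density_map(16, 16, heads)
print(round(count_from_density(dmap), 6))
```

A counting network is then trained to regress such a map from its input images, and its predicted count is read off by summing the predicted map in the same way.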
