Recent person re-identification research has achieved great success by learning from large numbers of labeled person images. However, the learned models often suffer significant performance drops when applied to images collected in a different environment. Unsupervised domain adaptation (UDA) has been investigated to mitigate this constraint, but most existing methods adapt images at the pixel level only and ignore clear discrepancies at the spatial level. This paper presents an innovative UDA-based person re-identification network that adapts images at both the spatial and pixel levels simultaneously. A novel disentangled cycle-consistency loss is designed that guides the learning of spatial-level and pixel-level adaptation in a collaborative manner. In addition, a novel multi-modal mechanism is incorporated that can generate images of different geometric views and augment training images effectively. Extensive experiments over a number of public datasets show that the proposed UDA network achieves superior person re-identification performance compared with the state of the art.
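For context, cycle-consistency losses of this kind typically extend the standard CycleGAN objective, in which a source-to-target mapping $G$ and a target-to-source mapping $F$ must reconstruct their inputs when composed. A standard form is shown below; the paper's disentangled decomposition of this objective into separate spatial-level and pixel-level terms is a specific design choice not reproduced here:

$$
\mathcal{L}_{\mathrm{cyc}}(G, F) = \mathbb{E}_{x \sim p_{\mathrm{src}}}\big[\lVert F(G(x)) - x \rVert_1\big] + \mathbb{E}_{y \sim p_{\mathrm{tgt}}}\big[\lVert G(F(y)) - y \rVert_1\big]
$$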