Most Siamese network-based trackers perform tracking without model update and therefore cannot adaptively learn target-specific variations. Moreover, Siamese-based trackers infer the new state of a tracked object by generating axis-aligned bounding boxes, which contain extra background noise and cannot accurately estimate the rotation and scale transformation of moving objects, potentially reducing tracking performance. In this paper, we propose a novel Rotation-Scale Invariant Network (RSINet) to address these problems. Our RSINet tracker consists of a target-distractor discrimination branch and a rotation-scale estimation branch; rotation and scale knowledge is learned explicitly through multi-task learning in an end-to-end manner. In addition, the tracking model is adaptively optimized and updated under spatio-temporal energy control, which ensures model stability and reliability as well as high tracking efficiency. Comprehensive experiments on the OTB-100, VOT2018, and LaSOT benchmarks demonstrate that the proposed RSINet tracker yields new state-of-the-art performance compared with recent trackers, while running at a real-time speed of about 45 FPS.
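The background noise that an axis-aligned box adds around a rotated target can be illustrated with a small geometric sketch. This is not part of RSINet: it assumes the target is an ideal rectangle rotated about its centre, and the function names `enclosing_axis_aligned_box` and `background_fraction` are hypothetical, chosen only for this illustration.

```python
import math


def enclosing_axis_aligned_box(width, height, angle_deg):
    """Return (box_w, box_h) of the tightest axis-aligned box around a
    width x height rectangle rotated by angle_deg about its centre.

    Assumption for illustration: the target is an ideal rectangle.
    """
    theta = math.radians(angle_deg)
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    # Project the rotated rectangle's sides onto the x and y axes.
    return width * c + height * s, width * s + height * c


def background_fraction(width, height, angle_deg):
    """Fraction of the axis-aligned box area not covered by the target."""
    box_w, box_h = enclosing_axis_aligned_box(width, height, angle_deg)
    return 1.0 - (width * height) / (box_w * box_h)


if __name__ == "__main__":
    # Hypothetical 100 x 40 pixel target at several in-plane rotations.
    for angle in (0, 15, 30, 45):
        frac = background_fraction(100, 40, angle)
        print(f"{angle:>2} deg: {frac:.2%} of the box is background")
```

With no rotation the box fits the target exactly, while for a square rotated by 45 degrees half of the enclosing axis-aligned box is background. This is the kind of contamination that an explicit rotation and scale estimate is meant to avoid.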