Analysis and interpretation of egocentric video data is becoming increasingly important as wearable cameras become more widely available and used. Exploring and fully understanding the affinities and differences between egocentric (first-person) and allocentric (third-person) vision is paramount for the design of effective methods to process, analyse and interpret egocentric data. In addition, a deeper understanding of ego-vision and its peculiarities may open new research perspectives in which first-person viewpoints can act either as a means of easily acquiring large amounts of data for general-purpose recognition systems, or as a challenging test-bed for assessing how well techniques specifically tailored to allocentric vision carry over to more demanding settings. Our work, with an eye to cognitive science findings, leverages transfer learning in Convolutional Neural Networks to demonstrate the capabilities and limitations of an implicitly learnt view-invariant representation in the specific case of action recognition.