A Survey on Temporal Sentence Grounding in Videos

Temporal sentence grounding in videos (TSGV), which aims to localize one target segment from an untrimmed video with respect to a given sentence query, has drawn increasing attention in the research community over the past few years. Unlike temporal action localization, TSGV is more flexible, since it can locate complicated activities via natural language without being restricted to predefined action categories. At the same time, TSGV is more challenging, since it requires both textual and visual understanding to semantically align the two modalities (i.e., text and video). In this survey, we give a comprehensive overview of TSGV, which i) summarizes the taxonomy of existing methods, ii) provides a detailed description of the evaluation protocols (i.e., datasets and metrics) used in TSGV, and iii) discusses in depth the potential problems of current benchmark designs and research directions for further investigation. To the best of our knowledge, this is the first systematic survey on temporal sentence grounding. More specifically, we first discuss existing TSGV approaches by grouping them into four categories: two-stage methods, end-to-end methods, reinforcement learning-based methods, and weakly supervised methods. We then present the benchmark datasets and evaluation metrics used to assess current research progress. Finally, we discuss limitations of TSGV by pointing out problems that the current evaluation protocols do not properly resolve, which may push forward more cutting-edge research in TSGV. We also share our insights on several promising directions, including three typical tasks with new and practical settings based on TSGV.
