Temporal interpolation has the potential to be a powerful tool for video compression. Existing methods for frame interpolation do not discriminate between video textures and generally invoke a single model capable of interpolating a wide range of video content. However, past work on video texture analysis and synthesis has shown that different textures exhibit vastly different motion characteristics, and that they can be divided into three classes (static, dynamic continuous, and dynamic discrete). In this work, we study the impact of video textures on video frame interpolation and propose a novel framework in which, given an interpolation algorithm, separate models are trained on different textures. Our study shows that video texture has a significant impact on the performance of frame interpolation models, and that it is beneficial to have separate models specifically adapted to these texture classes rather than a single model that tries to learn generic motion. Our results demonstrate that models fine-tuned using our framework achieve, on average, a 0.3 dB gain in PSNR on the test set used.