How to effectively explore the colors of reference exemplars and propagate them to colorize each frame is vital for exemplar-based video colorization. In this paper, we present an effective BiSTNet that explores the colors of reference exemplars and utilizes them to facilitate video colorization through bidirectional temporal feature fusion guided by a semantic image prior. We first establish semantic correspondence between each frame and the reference exemplars in deep feature space to explore color information from the exemplars. Then, to better propagate the colors of the reference exemplars into each frame and avoid inaccurately matched colors from the exemplars, we develop a simple yet effective bidirectional temporal feature fusion module that colorizes each frame more faithfully. We note that color-bleeding artifacts usually appear around the boundaries of important objects in videos. To overcome this problem, we further develop a mixed expert block that extracts semantic information for modeling the object boundaries of frames, so that the semantic image prior can better guide the colorization process. In addition, we develop a multi-scale recurrent block to progressively colorize frames in a coarse-to-fine manner. Extensive experimental results demonstrate that the proposed BiSTNet performs favorably against state-of-the-art methods on benchmark datasets. Our code will be made available at \url{https://yyang181.github.io/BiSTNet/}.
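To make the bidirectional temporal feature fusion idea concrete, the sketch below (in PyTorch) fuses each frame's deep features with states propagated from both temporal directions before a final merge. This is only a minimal illustration: the module name \texttt{BiTemporalFusion}, the running-average propagation, and the $1{\times}1$ fusion convolution are our own assumptions for exposition, not the paper's actual implementation.

\begin{verbatim}
# Minimal, hypothetical sketch of bidirectional temporal feature
# fusion in PyTorch; names and propagation rule are illustrative.
import torch
import torch.nn as nn

class BiTemporalFusion(nn.Module):
    """Fuse per-frame deep features with states propagated
    forward and backward in time (e.g., from two exemplars)."""
    def __init__(self, channels: int):
        super().__init__()
        # 1x1 conv merges [current, forward, backward] features
        # back to the original channel width.
        self.fuse = nn.Conv2d(3 * channels, channels, kernel_size=1)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (T, C, H, W) deep features for T consecutive frames.
        T = feats.shape[0]
        fwd_state, bwd_state = feats[0], feats[-1]
        fwd, bwd = [], []
        for t in range(T):                # propagate forward in time
            fwd_state = 0.5 * (fwd_state + feats[t])
            fwd.append(fwd_state)
        for t in reversed(range(T)):      # propagate backward in time
            bwd_state = 0.5 * (bwd_state + feats[t])
            bwd.append(bwd_state)
        bwd.reverse()
        out = []
        for t in range(T):                # merge the three streams
            merged = torch.cat([feats[t], fwd[t], bwd[t]], dim=0)
            out.append(self.fuse(merged.unsqueeze(0)).squeeze(0))
        return torch.stack(out)           # (T, C, H, W)
\end{verbatim}

Exposing each frame to information from both temporal ends is what lets colors from either exemplar reach every frame, rather than relying on one-directional propagation that degrades with distance from the reference.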