CVPR is the premier annual computer vision event comprising the main conference and several co-located workshops and short courses. With its high quality and low cost, it provides an exceptional value for students, academics and industry researchers. CVPR 2020 will take place at The Washington State Convention Center in Seattle, WA, from June 16 to June 20, 2020.


This paper, a joint work from Tsinghua University and Huawei Noah's Ark Lab, proposes a video super-resolution method. A large number of papers appear every year in the image/video quality enhancement area, but few are worth studying in depth. This paper is one of the stronger recent works in video super-resolution: it matches and in places surpasses EDVR on standard metrics without relying on deformable convolution, so it is worth spending some time on. Video super-resolution aims to generate a high-resolution video with better visual quality from a low-resolution input, and it has been attracting increasing attention. In this paper, the authors propose a method that exploits temporal information in a hierarchical way. The input sequence is divided into several groups, with each group corresponding to a different frame rate; these groups provide complementary information for reconstructing the missing details of the reference frame. An attention module and an inter-group fusion module are integrated as well. In addition, the authors introduce a fast spatial alignment method to handle videos with large motion.


  • A novel neural network that effectively fuses spatio-temporal information in a hierarchical manner via frame-rate-based grouping;
  • A fast spatial alignment method to handle large motions;
  • State-of-the-art performance on two popular video super-resolution benchmark datasets.
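The frame-rate grouping behind the first contribution can be illustrated with a small sketch. Assuming a 7-frame input window with the reference frame at the center, each group pairs the reference frame with its neighbors at a fixed temporal distance, so different groups behave like subsequences sampled at different frame rates. The function name and exact grouping layout here are illustrative assumptions, not the authors' code:

```python
def group_by_frame_rate(frame_indices, ref_pos):
    """Split a frame window into (past, reference, future) groups.

    Group d uses neighbors at temporal distance d from the reference,
    so it behaves like a subsequence sampled at 1/d of the frame rate.
    Illustrative sketch only -- not the authors' implementation.
    """
    groups = []
    max_dist = min(ref_pos, len(frame_indices) - 1 - ref_pos)
    for d in range(1, max_dist + 1):
        groups.append([frame_indices[ref_pos - d],
                       frame_indices[ref_pos],
                       frame_indices[ref_pos + d]])
    return groups

# A 7-frame window with the reference frame at position 3:
print(group_by_frame_rate(list(range(7)), 3))
# -> [[2, 3, 4], [1, 3, 5], [0, 3, 6]]
```

Each group then feeds the attention and inter-group fusion modules, which decide how much each "frame rate" contributes to the reconstructed reference frame.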



This paper studies the problem of learning semantic segmentation from image-level supervision only. Current popular solutions leverage object localization maps from classifiers as supervision signals, and struggle to make the localization maps capture more complete object content. Rather than previous efforts that primarily focus on intra-image information, we address the value of cross-image semantic relations for comprehensive object pattern mining. To achieve this, two neural co-attentions are incorporated into the classifier to complementarily capture cross-image semantic similarities and differences. In particular, given a pair of training images, one co-attention forces the classifier to recognize the common semantics from co-attentive objects, while the other, called contrastive co-attention, drives the classifier to identify the unshared semantics from the remaining, uncommon objects. This helps the classifier discover more object patterns and better ground semantics in image regions. In addition to boosting object pattern learning, the co-attention can leverage context from other related images to improve localization map inference, eventually benefiting semantic segmentation learning. More importantly, our algorithm provides a unified framework that handles different WSSS settings well, i.e., learning WSSS with (1) precise image-level supervision only, (2) extra simple single-label data, and (3) extra noisy web data. It sets new state-of-the-art results in all these settings, demonstrating its efficacy and generalizability. Moreover, our approach ranked 1st in the Weakly-Supervised Semantic Segmentation Track of the CVPR 2020 Learning from Imperfect Data Challenge.
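The pairwise co-attention idea can be sketched in a few lines: compute an affinity matrix between the flattened feature maps of the two images, normalize it into attention weights, and use those weights to pull related context from the partner image into the current image's spatial layout. This is a minimal NumPy sketch under assumed shapes, with a random bilinear weight standing in for the learned one; it is not the authors' implementation:

```python
import numpy as np

def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(feat_a, feat_b, weight):
    """Pairwise co-attention sketch.

    feat_a: (C, Na) flattened features of image A
    feat_b: (C, Nb) flattened features of image B
    weight: (C, C) bilinear weight (learned in practice; random here)

    Returns image B's features attended into A's spatial layout, which
    can be fused with feat_a so the classifier sees the common semantics
    shared across the image pair.
    """
    affinity = feat_a.T @ weight @ feat_b   # (Na, Nb) cross-image affinity
    attn = softmax(affinity, axis=1)        # normalize over B's positions
    context_for_a = feat_b @ attn.T         # (C, Na) context pulled from B
    return context_for_a

rng = np.random.default_rng(0)
C, Na, Nb = 8, 16, 20
ctx = co_attention(rng.standard_normal((C, Na)),
                   rng.standard_normal((C, Nb)),
                   rng.standard_normal((C, C)))
print(ctx.shape)
# -> (8, 16)
```

The contrastive co-attention described above would instead emphasize the positions with *low* affinity, steering the classifier toward the semantics the two images do not share.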