Paper recommendations: unsupervised video-based segmentation

April 23, 2019, CreateAMind



https://github.com/aimerykong/predictive-filter-flow





Multigrid Predictive Filter Flow for Unsupervised Learning on Videos

Shu Kong, Charless Fowlkes

Last update: March 24, 2019.

We introduce multigrid Predictive Filter Flow (mgPFF), a framework for unsupervised learning on videos. mgPFF takes a pair of frames as input and outputs per-pixel filters that warp one frame to the other. Compared to optical flow used for warping frames, mgPFF is more powerful in modeling sub-pixel movement and dealing with corruption (e.g., motion blur). We develop a multigrid coarse-to-fine modeling strategy that avoids the need to learn large filters to capture large displacements. This allows us to train an extremely compact model (4.6MB) that operates progressively over multiple resolutions with shared weights. We train mgPFF on unsupervised, free-form videos and show that it not only estimates long-range flow for frame reconstruction and detects video shot transitions, but is also readily amenable to video object segmentation and pose tracking, where it substantially outperforms the published state-of-the-art without bells and whistles. Moreover, owing to mgPFF's per-pixel filter prediction, we have the unique opportunity to visualize how each pixel evolves while solving these tasks, thus gaining better interpretability.

keywords: Unsupervised Learning, Multigrid Computing, Long-Range Flow, Video Segmentation, Instance Tracking, Pose Tracking, Video Shot/Transition Detection, Optical Flow, Filter Flow, Low-level Vision.

  • S. Kong, C. Fowlkes, "Multigrid Predictive Filter Flow for Unsupervised Learning on Videos", arXiv:1904.01693, 2019.
    [project page] [paper] [demo] [github] [slides] [poster]
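
The abstract's point that a coarse-to-fine strategy avoids large filters can be illustrated with some simple arithmetic. The sketch below is not taken from the paper: the kernel size (11) and the downsampling factors (16, 8, 4, 2, 1) are hypothetical values chosen for illustration, and the helper names are made up here. It assumes that a k x k filter predicted on a grid downsampled by a factor s can move a pixel by at most (k // 2) * s full-resolution pixels, and that the per-level displacements add up.

```python
def multigrid_reach(kernel_size, scales):
    """Largest displacement (in full-resolution pixels) reachable when a
    kernel_size x kernel_size filter is applied once at each scale.

    Assumption: at downsampling factor s, one filter tap step corresponds
    to s full-resolution pixels, and displacements accumulate over levels.
    """
    half = kernel_size // 2
    return sum(half * s for s in scales)


def single_level_kernel_size(reach):
    """Side length of a single full-resolution filter with the same reach."""
    return 2 * reach + 1


# Hypothetical settings, chosen only to make the arithmetic concrete.
kernel_size = 11
scales = [16, 8, 4, 2, 1]

reach = multigrid_reach(kernel_size, scales)
flat_size = single_level_kernel_size(reach)

print("multigrid reach (pixels):", reach)
print("equivalent single-level filter:", flat_size, "x", flat_size)
print("weights per pixel, single level:", flat_size ** 2)
print("weights per pixel, all multigrid levels:", len(scales) * kernel_size ** 2)
```

Under these assumed settings, five small filters cover a displacement that a single full-resolution filter could only reach with several hundred taps per side, which is the motivation the abstract gives for the multigrid design.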



Acknowledgements: This project is supported by NSF grants IIS-1813785, IIS-1618806, IIS-1253538 and a hardware donation from NVIDIA. Shu Kong personally thanks Teng Liu and Etthew Kong who initiated this research, and the academic uncle Alexei A. Efros for the encouragement and discussion. 





Image Reconstruction with Predictive Filter Flow

Shu Kong, Charless Fowlkes

Last update: Nov. 28, 2018.

We propose a simple, interpretable framework for solving a wide range of image reconstruction problems such as denoising and deconvolution. Given a corrupted input image, the model synthesizes a spatially varying linear filter which, when applied to the input image, reconstructs the desired output. The model parameters are learned using supervised or self-supervised training. We test this model on three tasks: non-uniform motion blur removal, lossy-compression artifact reduction, and single-image super-resolution. We demonstrate that our model substantially outperforms state-of-the-art methods on all these tasks and is significantly faster than optimization-based approaches to deconvolution. Unlike models that directly predict output pixel values, the predicted filter flow is controllable and interpretable, which we demonstrate by visualizing the space of predicted filters for different tasks.

keywords: inverse problem, spatially-variant blind deconvolution, low-level vision, non-uniform motion blur removal, compression artifact removal, single image super-resolution, filter flow, interpretable model, per-pixel twist, self-supervised learning, image distribution learning.

  • S. Kong, C. Fowlkes, "Image Reconstruction with Predictive Filter Flow", arXiv:1811.11482, 2018. 
    [project page] [high-res paper (44MB)] [github] [demo] [models] [slides] [poster]
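
To make "a spatially varying linear filter applied to the input image" concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `apply_filter_flow`, the array layout `(H, W, k, k)`, and the edge padding are assumptions made for illustration, and in the paper the filters come from a trained network rather than being written by hand as they are here.

```python
import numpy as np


def apply_filter_flow(image, filters):
    """Apply a different k x k linear filter at every pixel.

    image:   (H, W) array.
    filters: (H, W, k, k) array; filters[i, j] weights the k x k patch
             of the input centred at pixel (i, j).
    Borders are handled by edge replication (an assumption of this sketch).
    """
    h, w = image.shape
    k = filters.shape[2]
    pad = k // 2
    padded = np.pad(image, pad, mode="edge")
    out = np.zeros((h, w), dtype=float)
    # Accumulate one filter tap at a time over shifted views of the input.
    for dy in range(k):
        for dx in range(k):
            out += filters[:, :, dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


image = np.arange(20, dtype=float).reshape(4, 5)
k = 3

# One-hot filters at the centre tap leave the image unchanged.
identity = np.zeros((4, 5, k, k))
identity[:, :, 1, 1] = 1.0
same = apply_filter_flow(image, identity)

# One-hot filters at the right-neighbour tap copy each pixel from its
# right neighbour, i.e. a one-pixel translation, as optical flow would.
shift = np.zeros((4, 5, k, k))
shift[:, :, 1, 2] = 1.0
shifted = apply_filter_flow(image, shift)

# Splitting the weight between two taps blends neighbours, which is how
# such filters can express sub-pixel motion or blur.
half = np.zeros((4, 5, k, k))
half[:, :, 1, 1] = 0.5
half[:, :, 1, 2] = 0.5
blended = apply_filter_flow(image, half)
```

Because every output pixel is an explicit weighted sum of nearby input pixels, the weights themselves can be inspected, which is the kind of interpretability the abstract refers to.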








