[Recommended] A Survey of Fully Convolutional Semantic Segmentation

August 31, 2017 · 机器学习研究会 (Machine Learning Research Society)


Summary
 

Reposted from: 爱可可-爱生活

Semantic Segmentation

Introduction

Semantic segmentation of an image assigns each pixel in the input image a semantic class, producing a pixel-wise dense classification. Semantic segmentation / scene parsing has been a part of the computer vision community since 2007, but, much like other areas in computer vision, the major breakthrough came in 2014 when Long et al. first used fully convolutional neural networks to perform end-to-end segmentation of natural images.
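To make "pixel-wise dense classification" concrete, here is a minimal sketch in plain Python. It assumes the network has already produced one score per class for every pixel and simply picks the best-scoring class at each location. The class names, the tiny 2x3 score grid, and the helper name `scores_to_label_map` are made up for illustration; a real network would emit these scores as a tensor.

```python
# Hypothetical class list, for illustration only.
CLASSES = ["background", "person", "car"]

def scores_to_label_map(scores):
    """scores[c][y][x] is the score of class c at pixel (y, x).

    Returns label_map[y][x], the index of the highest-scoring class.
    """
    num_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    label_map = []
    for y in range(height):
        row = []
        for x in range(width):
            # Per-pixel argmax over the class axis.
            best = max(range(num_classes), key=lambda c: scores[c][y][x])
            row.append(best)
        label_map.append(row)
    return label_map

# Made-up scores for a 2x3 image, one 2x3 grid per class.
scores = [
    [[0.90, 0.20, 0.10], [0.80, 0.10, 0.30]],  # background
    [[0.05, 0.70, 0.20], [0.10, 0.80, 0.10]],  # person
    [[0.05, 0.10, 0.70], [0.10, 0.10, 0.60]],  # car
]
label_map = scores_to_label_map(scores)
```

The output has the same height and width as the input scores, with one class index per pixel, which is what distinguishes segmentation from whole-image classification.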

Figure: Example of semantic segmentation (left) generated by FCN-8s (trained using the pytorch-semseg repository), overlaid on the input image (right).

The FCN-8s architecture achieved a 20% relative improvement, reaching 62.2% mean IU on the Pascal VOC 2012 dataset. In my opinion, this architecture became the baseline for semantic segmentation on top of which several newer and better architectures were developed.
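For readers unfamiliar with the metric, mean IU (mean intersection over union) averages, over classes, the ratio of correctly labelled pixels to the union of true and predicted pixels for that class. The sketch below computes it from a confusion matrix in plain Python. The function names and the toy label lists are assumptions for illustration, and skipping classes that appear in neither truth nor prediction is one common convention rather than the only one.

```python
def confusion_matrix(true_labels, pred_labels, num_classes):
    """n[i][j] = number of pixels of true class i predicted as class j."""
    n = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, pred_labels):
        n[t][p] += 1
    return n

def mean_iu(n):
    """Average over classes of intersection / union.

    Classes absent from both truth and prediction are skipped
    (an assumed convention for this sketch).
    """
    ious = []
    for i in range(len(n)):
        intersection = n[i][i]
        true_total = sum(n[i])                 # pixels whose true class is i
        pred_total = sum(row[i] for row in n)  # pixels predicted as class i
        union = true_total + pred_total - intersection
        if union > 0:
            ious.append(intersection / union)
    return sum(ious) / len(ious)

# Flattened toy label maps (class indices), made up for illustration.
truth = [0, 0, 0, 1, 1, 2]
pred = [0, 0, 1, 1, 1, 2]
score = mean_iu(confusion_matrix(truth, pred, num_classes=3))
```

In this toy case one pixel of class 0 is mislabelled as class 1, which lowers the IU of both classes while class 2 stays perfect.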

Fully Convolutional Networks (FCNs) are used for semantic segmentation of natural images, multi-modal medical image analysis, and multispectral satellite image segmentation. Just as there are many deep classification networks (AlexNet, VGG, ResNet, etc.), there is also a large variety of deep architectures that perform semantic segmentation.

I summarize networks like FCN, SegNet, U-Net, FC-DenseNet, E-Net and Link-Net, RefineNet, PSPNet, and Mask-RCNN, along with some semi-supervised approaches like DecoupledNet and GAN-SS here, and provide reference PyTorch and Keras (in progress) implementations for a number of them. In the last part of the post, I summarize some popular datasets and visualize a few results from the trained networks.

Network Architectures

A general semantic segmentation architecture can be broadly thought of as an encoder network followed by a decoder network. The encoder is usually a pre-trained classification network like VGG/ResNet. The decoder network/mechanism is mostly where these architectures differ. The task of the decoder is to semantically project the discriminative features (lower resolution) learnt by the encoder onto the pixel space (higher resolution) to get a dense classification.

Unlike classification, where the end result of the very deep network (i.e. the class presence probability) is the only important thing, semantic segmentation requires not only discrimination at the pixel level but also a mechanism to project the discriminative features learnt at different stages of the encoder onto the pixel space. Different architectures employ different mechanisms (skip connections, pyramid pooling, etc.) as part of the decoding mechanism.
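The resolution flow through an encoder and decoder, and the role of a skip connection, can be sketched with a toy example in plain Python. This is only an illustration under loud assumptions: 2x2 max pooling stands in for a learned encoder stage, nearest-neighbour repetition stands in for a learned decoder, and the skip connection is a plain element-wise sum. All function names and the 4x4 input are made up.

```python
def max_pool_2x2(fmap):
    """Stand-in encoder stage: halve resolution, keep the max of each 2x2 window."""
    return [
        [max(fmap[y][x], fmap[y][x + 1], fmap[y + 1][x], fmap[y + 1][x + 1])
         for x in range(0, len(fmap[0]), 2)]
        for y in range(0, len(fmap), 2)
    ]

def upsample_nearest_2x(fmap):
    """Stand-in decoder stage: double resolution by repeating each value 2x2."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def add_maps(a, b):
    """Skip connection as an element-wise sum of two same-sized maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# Made-up 4x4 "fine" feature map from an early encoder stage.
fine = [
    [1, 2, 0, 0],
    [3, 4, 0, 1],
    [0, 0, 5, 6],
    [1, 0, 7, 8],
]
coarse = max_pool_2x2(fine)            # 2x2: lower resolution, more abstract
decoded = upsample_nearest_2x(coarse)  # back to 4x4, but blocky
fused = add_maps(decoded, fine)        # skip connection restores fine detail
```

The decoded map alone is constant within each 2x2 block, while the fused map varies pixel by pixel, which is the intuition behind merging features from different encoder stages.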

A number of the above architectures, along with loaders for datasets, are available in PyTorch at:

  • PyTorch: meetshah1995/pytorch-semseg

A more formal summary of semantic segmentation (including recurrent-style networks) can also be found here


Fully Convolutional Networks (FCNs)

CVPR 2015: Fully Convolutional Networks for Semantic Segmentation (arXiv)


We adapt contemporary classification networks (AlexNet, the VGG net, and GoogLeNet) into fully convolutional networks and transfer their learned representations by fine-tuning to the segmentation task. We then define a novel architecture that combines semantic information from a deep, coarse layer with appearance information from a shallow, fine layer to produce accurate and detailed segmentations. Our fully convolutional network achieves state-of-the-art segmentation of PASCAL VOC (20% relative improvement to 62.2% mean IU on 2012), NYUDv2, and SIFT Flow, while inference takes one third of a second for a typical image.

Figure: The FCN end-to-end dense prediction pipeline.

A few key features of networks of this type are:

  • Features are merged from different stages of the encoder, which vary in the coarseness of their semantic information.

  • The learned low-resolution semantic feature maps are upsampled using deconvolutions that are initialized with bilinear interpolation filters.

  • They are an excellent example of knowledge transfer from modern classifier networks like VGG16 and AlexNet to semantic segmentation.
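The bilinear initialization mentioned in the list above can be sketched as follows. This is a plain Python illustration, assuming the commonly used construction in which the 2D filter is the outer product of a 1D triangular kernel. The function names are my own, and the 1D transposed convolution is a hand-rolled stand-in for a framework deconvolution layer, not the API of any library.

```python
def bilinear_kernel_1d(kernel_size):
    """1D triangular weights that make a transposed convolution interpolate linearly."""
    factor = (kernel_size + 1) // 2
    center = factor - 1 if kernel_size % 2 == 1 else factor - 0.5
    return [1 - abs(i - center) / factor for i in range(kernel_size)]

def bilinear_kernel_2d(kernel_size):
    """2D bilinear filter as the outer product of the 1D kernel with itself."""
    k = bilinear_kernel_1d(kernel_size)
    return [[a * b for b in k] for a in k]

def transposed_conv_1d(signal, kernel, stride):
    """Each input value scatters a scaled copy of the kernel into the output."""
    out = [0.0] * ((len(signal) - 1) * stride + len(kernel))
    for i, value in enumerate(signal):
        for j, weight in enumerate(kernel):
            out[i * stride + j] += value * weight
    return out

# Kernel size 4 with stride 2 corresponds to 2x upsampling.
kernel = bilinear_kernel_1d(4)
upsampled = transposed_conv_1d([0.0, 4.0, 8.0], kernel, stride=2)
cropped = upsampled[1:-1]  # drop one border sample per side: 3 inputs -> 6 outputs
```

Away from the borders, the cropped output rises in equal steps between the input samples, which is linear interpolation. Because the weights are only an initialization, training is free to move them away from pure interpolation.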


Link:

https://meetshah1995.github.io/semantic-segmentation/deep-learning/pytorch/visdom/2017/06/01/semantic-segmentation-over-the-years.html


Original link:

https://m.weibo.cn/3193816967/4146423228075680
