Complex Relations in a Deep Structured Prediction Model for Fine Image Segmentation

Many deep learning architectures for semantic segmentation involve a Fully Convolutional Neural Network (FCN) followed by a Conditional Random Field (CRF) to carry out inference over an image. These models typically involve unary potentials based on local appearance features computed by FCNs, and binary potentials based on the displacement between pixels. We show that while current methods succeed in segmenting whole objects, they perform poorly in situations involving a large number of object parts. We therefore suggest incorporating into the inference algorithm additional higher-order potentials inspired by the way humans identify and localize parts. We incorporate two relations that were shown to be useful to human object identification - containment and attachment - into the energy term of the CRF and evaluate their performance on the Pascal VOC Parts dataset. Our experimental results show that the segmentation of fine parts is positively affected by the addition of these two relations, and that the segmentation of fine parts can be further influenced by complex structural features.
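The energy described in the abstract can be illustrated with a small sketch. The abstract does not give the exact form of the potentials, so everything below is an assumption made for illustration: the function name `crf_energy`, the Gaussian kernel over pixel displacement, the weights, and the simple containment penalty are hypothetical, and the attachment relation is not modelled at all.

```python
import math


def crf_energy(labels, unary, contained_in=None, sigma=1.0, w_pair=1.0, w_contain=1.0):
    """Energy of a part labeling on a small grid (lower is better).

    labels:       H x W nested list of integer part labels.
    unary:        H x W x K nested list of per-pixel costs, e.g. negative
                  log-probabilities produced by an FCN.
    contained_in: hypothetical mapping {inner_label: outer_label}, e.g. eye -> head.
    """
    height, width = len(labels), len(labels[0])
    pixels = [(r, c) for r in range(height) for c in range(width)]

    # Unary term: cost of the chosen label at every pixel.
    energy = sum(unary[r][c][labels[r][c]] for r, c in pixels)

    # Pairwise term: pixel pairs with different labels pay a penalty that
    # decays with their squared displacement (assumed Gaussian kernel).
    for i, (r1, c1) in enumerate(pixels):
        for r2, c2 in pixels[i + 1:]:
            if labels[r1][c1] != labels[r2][c2]:
                dist2 = (r1 - r2) ** 2 + (c1 - c2) ** 2
                energy += w_pair * math.exp(-dist2 / (2 * sigma ** 2))

    # Containment term (assumed form): a pixel of an inner part pays a penalty
    # for each 4-neighbour that is neither the same part nor its container.
    for r, c in pixels:
        outer = (contained_in or {}).get(labels[r][c])
        if outer is None:
            continue
        for dr, dc in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < height and 0 <= cc < width:
                if labels[rr][cc] not in (labels[r][c], outer):
                    energy += w_contain
    return float(energy)
```

With labels 0 = background, 1 = head, 2 = eye and `contained_in={2: 1}`, a labeling in which the eye pixel touches background gets a higher energy than one in which it is fully surrounded by head pixels. Inference would then search for the labeling that minimises this energy.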

Related content

Conditional random field

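A conditional random field (CRF, or CRFs) is a discriminative probabilistic model and a type of random field, commonly used to label or parse sequential data such as natural-language text or biological sequences. Like a Markov random field, a CRF is an undirected graphical model: the vertices represent random variables and the edges represent dependencies between them. In a CRF, the distribution of the random variables Y is a conditional probability, conditioned on the observed random variables X. In principle the graph layout can be arbitrary, but the most common layout is a linear chain, for which efficient algorithms exist for training, inference, and decoding.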
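As a hedged illustration of why the linear-chain layout is efficient to decode, here is a minimal Viterbi sketch. The function name `viterbi_decode` and the score layout (`emissions[t][k]` for label k at position t, `transitions[i][j]` for moving from label i to label j) are assumptions for this example, not the interface of any particular library.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring label sequence and its total score.

    emissions:   T x K nested list, score of label k at position t.
    transitions: K x K nested list, score of moving from label i to label j.
    Runs in O(T * K^2) time instead of enumerating all K^T sequences.
    """
    num_labels = len(transitions)
    score = list(emissions[0])  # best score of any path ending in each label
    backpointers = []           # best previous label, per position and label

    for emit in emissions[1:]:
        prev_best, new_score = [], []
        for j in range(num_labels):
            best_i = max(range(num_labels),
                         key=lambda i: score[i] + transitions[i][j])
            prev_best.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emit[j])
        backpointers.append(prev_best)
        score = new_score

    # Trace the best path back from the best final label.
    last = max(range(num_labels), key=lambda j: score[j])
    path = [last]
    for prev_best in reversed(backpointers):
        path.append(prev_best[path[-1]])
    path.reverse()
    return path, score[last]
```

For example, with emissions `[[3, 0], [0, 1], [0, 1]]`, a strong penalty on switching labels keeps the whole sequence on label 0, while zero transition scores let each position pick its own best label.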
