Image Segmentation Using Subspace Representation and Sparse Decomposition

Image foreground extraction is a classical problem in image processing and computer vision, with a wide range of applications. In this dissertation, we focus on the extraction of text and graphics in mixed-content images, and design novel approaches for various aspects of this problem. We first propose a sparse decomposition framework, which models the background by a subspace spanned by smooth basis vectors, and the foreground as a sparse and connected component. We then formulate an optimization problem for this decomposition, adding suitable regularization terms to the cost function to promote the desired characteristics of each component. We present two techniques to solve the proposed optimization problem, one based on the alternating direction method of multipliers (ADMM), and the other on robust regression. Promising results are obtained for screen content image segmentation using the proposed algorithm. We then propose a robust subspace learning algorithm for the representation of the background component, using training images that may contain both background and foreground components, as well as noise. With the learnt subspace for the background, we can further improve the segmentation results, compared to using a fixed subspace. Lastly, we investigate a different class of signal/image decomposition problems, where only one signal component is active at each signal element. In this case, besides estimating each component, we need to find their supports, which can be specified by a binary mask. We formulate this as a mixed-integer programming problem and jointly estimate the two components and their supports through an alternating optimization scheme. We show the application of this algorithm to various problems, including image segmentation, video motion segmentation, and the separation of text from textured images.
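
The decomposition idea in the abstract can be illustrated with a minimal sketch. It is not the dissertation's algorithm: it assumes a 1-D signal, a fixed subspace of the first few DCT basis vectors as the "smooth" background model, and a plain l1 penalty on the foreground, with no connectivity term. It minimizes 0.5*||x - P a - s||^2 + lam*||s||_1 by simple alternating minimization rather than ADMM or robust regression. The names `dct_basis`, `soft_threshold`, `decompose`, and `demo`, and the values of `lam` and `n_iter`, are illustrative assumptions.

```python
import numpy as np


def dct_basis(n, k):
    """Return the first k orthonormal 1-D DCT-II basis vectors (columns)."""
    t = (np.arange(n) + 0.5) * np.pi / n
    P = np.cos(np.outer(t, np.arange(k)))
    P[:, 0] *= np.sqrt(1.0 / n)
    P[:, 1:] *= np.sqrt(2.0 / n)
    return P


def soft_threshold(v, tau):
    """Elementwise proximal operator of tau * |.| (shrinks toward zero)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def decompose(x, P, lam=1.0, n_iter=100):
    """Alternately minimize 0.5*||x - P a - s||^2 + lam*||s||_1.

    a: background coefficients in the subspace spanned by P (assumed model).
    s: sparse foreground estimate, same shape as x.
    """
    s = np.zeros_like(x)
    for _ in range(n_iter):
        # Background step: least-squares fit to the signal minus foreground.
        a, *_ = np.linalg.lstsq(P, x - s, rcond=None)
        # Foreground step: keep only residuals larger than lam.
        s = soft_threshold(x - P @ a, lam)
    return a, s


def demo():
    """Synthetic test: smooth background plus four planted foreground pixels."""
    n = 64
    P = dct_basis(n, 4)
    background = P @ np.array([40.0, 5.0, -3.0, 1.0])
    foreground = np.zeros(n)
    foreground[[10, 11, 12, 40]] = [8.0, -9.0, 7.0, 10.0]
    a, s = decompose(background + foreground, P)
    return foreground, s


if __name__ == "__main__":
    foreground, s = demo()
    print("planted support:  ", np.flatnonzero(foreground))
    print("recovered support:", np.flatnonzero(s))
```

In this toy setup the nonzero entries of `s` should fall on the planted foreground positions, with magnitudes shrunk by roughly `lam` because of the l1 penalty. For images, the same structure would apply per block with a 2-D smooth basis, and the framework described in the abstract adds further regularization to favor connected foreground regions.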
