The "curse of dimensionality" is a well-known problem in pattern recognition. A widely used approach to this problem is the family of subspace methods, which project the original features onto a new space. The resulting lower-dimensional subspace is then used to approximate the original features for classification. However, most subspace methods were not originally developed for classification, and we believe that adopting them directly for pattern classification should not be considered best practice. In this paper, we present a new information-theoretic algorithm for selecting subspaces, which can always result in superior performance over conventional methods. This paper makes the following main contributions: i) it improves a common practice widely used by practitioners in the field of pattern recognition; ii) it develops an information-theoretic technique for systematically selecting subspaces that are discriminative and therefore suitable for pattern recognition and classification; iii) it presents extensive experimental results on a variety of computer vision and pattern recognition tasks to illustrate that subspaces selected by the maximum mutual information criterion will always enhance performance, regardless of the classification technique used.
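The selection idea stated above can be illustrated with a minimal sketch. It is not the algorithm of this paper, whose details are not given here: it only contrasts ranking already-projected components by variance with ranking them by an estimate of their mutual information with the class labels. The function names, the equal-width histogram estimator, the bin count, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(values, labels, bins=4):
    """Histogram estimate of I(component; class) in bits (illustrative estimator)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # guard against a constant component
    # Discretize each value into one of `bins` equal-width cells.
    cells = [min(int((v - lo) / width), bins - 1) for v in values]
    n = len(values)
    joint = Counter(zip(cells, labels))
    cell_counts = Counter(cells)
    label_counts = Counter(labels)
    mi = 0.0
    for (cell, label), count in joint.items():
        p_xy = count / n
        p_x = cell_counts[cell] / n
        p_y = label_counts[label] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi


def variance(values):
    """Population variance of one projected component."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def select_by_variance(components, k):
    """Indices of the k components with the largest variance."""
    order = sorted(range(len(components)),
                   key=lambda i: variance(components[i]), reverse=True)
    return order[:k]


def select_by_mi(components, labels, k):
    """Indices of the k components with the largest estimated mutual information."""
    scores = [mutual_information(c, labels) for c in components]
    order = sorted(range(len(components)), key=lambda i: scores[i], reverse=True)
    return order[:k], scores


# Hypothetical toy data: each inner list is one projected component
# evaluated on eight samples, four per class.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
components = [
    [-10, 10, -10, 10, -10, 10, -10, 10],       # high variance, same in both classes
    [-1.0, -1.2, -0.8, -1.0, 1.0, 1.2, 0.8, 1.0],  # low variance, separates the classes
    [0, 0, 0, 5, 5, 5, 5, 0],                   # medium variance, partly informative
]

top_by_variance = select_by_variance(components, 1)
top_by_mi, mi_scores = select_by_mi(components, labels, 1)
print("variance picks:", top_by_variance)
print("mutual information picks:", top_by_mi)
```

In this toy setting the two criteria disagree: the highest-variance component carries no class information, while the lowest-variance one separates the classes. A practical implementation would need a more careful mutual information estimator than the fixed-bin histogram assumed here.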