In machine learning, dimensionality reduction is an important task. It mitigates the undesired properties of high-dimensional spaces to facilitate classification, compression, and visualization of high-dimensional data. During the last decade, researchers have proposed a large number of new (nonlinear) techniques for dimensionality reduction. Most of these techniques are based on the intuition that data lies on or near a complex low-dimensional manifold embedded in the high-dimensional space, and they aim to identify and extract that manifold. Isomap is one of the most widely used low-dimensional embedding methods; it combines geodesic distances on a weighted graph with classical scaling (metric multidimensional scaling). Isomap chooses the nearest neighbors based on distance alone, which causes bridges and topological instability. In this paper we focus on topological stability, which was not considered in Isomap. Our approach rests on the observation that, at any point on the manifold, the point and its nearest neighbors form a vector subspace, and a vector orthogonal to that subspace is orthogonal to every vector that spans it. The approach uses the point itself and its two nearest neighbors to find a basis of the subspace and a vector orthogonal to it, which belongs to the orthogonal complement. It then adds new points to the two nearest neighbors based on the distance and on the angle between each new point and the orthogonal vector. Experiments on several datasets confirm the superior performance of the new approach in choosing the nearest neighbors.
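The neighbor-selection idea described above can be illustrated with a minimal sketch. It is one possible reading of the abstract, not the authors' implementation: the function name `select_neighbors`, the candidate count `k`, and the angular threshold `max_off_plane_deg` are all hypothetical. The sketch measures the angle between each candidate offset and the subspace spanned by the two nearest neighbors, which is the complement of the angle to the orthogonal direction.

```python
import math


def sub(a, b):
    return [x - y for x, y in zip(a, b)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def select_neighbors(points, i, k=6, max_off_plane_deg=20.0):
    """Hypothetical neighbor selection for points[i].

    Assumption: a candidate is accepted when its offset from points[i]
    deviates from the local subspace by at most max_off_plane_deg degrees.
    """
    p = points[i]
    # Rank all other points by Euclidean distance to p.
    order = sorted((j for j in range(len(points)) if j != i),
                   key=lambda j: norm(sub(points[j], p)))
    first, second = order[0], order[1]

    # Gram-Schmidt: orthonormal basis (e1, e2) of the subspace spanned by
    # the offsets to the two nearest neighbors.
    u = sub(points[first], p)
    e1 = [x / norm(u) for x in u]
    v = sub(points[second], p)
    proj = dot(v, e1)
    w = [x - proj * y for x, y in zip(v, e1)]
    if norm(w) < 1e-12:
        # The two nearest neighbors are collinear with p; no 2-D subspace.
        return [first, second]
    e2 = [x / norm(w) for x in w]

    chosen = [first, second]
    for j in order[2:k]:
        d = sub(points[j], p)
        nd = norm(d)
        if nd == 0.0:
            continue  # duplicate of p, direction undefined
        # Length of the component of d that lies inside the subspace.
        in_plane = math.sqrt(dot(d, e1) ** 2 + dot(d, e2) ** 2)
        # Angle between d and the subspace (90 degrees minus the angle
        # between d and the orthogonal direction).
        angle = math.degrees(math.acos(min(1.0, in_plane / nd)))
        if angle <= max_off_plane_deg:
            chosen.append(j)
    return chosen


# Toy data: points near the plane z = 0, plus one close point (index 4)
# that sits directly "above" the query point, as a bridge would.
points = [
    [0.0, 0.0, 0.0],
    [0.1, 0.0, 0.0],
    [0.0, 0.2, 0.0],
    [0.3, 0.3, 0.0],
    [0.0, 0.0, 0.35],
    [-0.5, 0.0, 0.05],
]
print(select_neighbors(points, 0, k=5))
```

In this toy example the point at index 4 is the third closest to the query, so a purely distance-based rule would accept it; the angular test rejects it because its offset is almost entirely along the orthogonal direction.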