This book introduces Geometric Data Science, a new research area in which data can represent any real object through geometric measurements. The first part of the book focuses on finite point sets. Its most important result is a complete and continuous classification of all finite clouds of unordered points under rigid motion in any Euclidean space. The key challenge was to avoid the exponential complexity arising from permutations of the given unordered points. For a fixed dimension of the ambient Euclidean space, the running times of all algorithms for the resulting invariants and distance metrics are polynomial in the number of points. The second part of the book advances a similar classification in the much more difficult case of periodic point sets, which model all periodic crystals at the atomic scale. Its most significant result is a hierarchy of invariants ranging from ultra-fast to complete. The key challenge was to resolve the discontinuity of crystal representations that break down under almost any noise. Experimental validation on all major materials databases confirmed the Crystal Isometry Principle: any real periodic crystal has a unique location in a common moduli space of all periodic structures under rigid motion. This moduli space contains all known and as yet undiscovered periodic crystals and hence continuously extends Mendeleev's table to the full crystal universe.