We propose the Wasserstein-Fourier (WF) distance to measure the (dis)similarity between time series by quantifying the displacement of their energy across frequencies. The WF distance operates by calculating the Wasserstein distance between the (normalised) power spectral densities (NPSD) of time series. Although this rationale has been considered in the past, we fill a gap in the open literature by providing a formal introduction of this distance, together with its main properties, from the joint perspective of Fourier analysis and optimal transport. As the main aim of this work is to validate WF as a general-purpose metric for time series, we illustrate its applicability in three broad contexts. First, we rely on WF to implement a PCA-like dimensionality reduction for NPSDs, which allows for meaningful visualisation and pattern recognition applications. Second, we show that the geometry induced by WF on the space of NPSDs admits a geodesic interpolant between time series, thus enabling data augmentation in the spectral domain by averaging the dynamic content of two signals. Third, we implement WF for time series classification using parametric and non-parametric classifiers and compare it to other classical metrics. Supported by theoretical results, as well as synthetic illustrations and experiments on real-world data, this work establishes WF as a meaningful and capable resource for general distance-based applications of time series.
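The construction described above (normalise each power spectral density, then take a Wasserstein distance between the results) can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' implementation: the NPSD is estimated with a plain one-sided periodogram, the Wasserstein distance is taken to be of order 2, and it is evaluated through the one-dimensional quantile-function formula on a uniform grid. The function names `npsd`, `quantile_function` and `wf_distance`, and the parameters `fs` and `n_quantiles`, are hypothetical names chosen for this example.

```python
import numpy as np


def npsd(x, fs=1.0):
    """Periodogram estimate of the normalised PSD (assumed estimator)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()  # remove the DC component
    power = np.abs(np.fft.rfft(x)) ** 2
    total = power.sum()
    if total == 0.0:
        raise ValueError("signal has no energy; NPSD is undefined")
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    return freqs, power / total  # weights sum to 1


def quantile_function(freqs, weights, u):
    """Generalised inverse CDF of a discrete distribution on `freqs`."""
    cdf = np.cumsum(weights)
    idx = np.searchsorted(cdf, u, side="left")
    idx = np.clip(idx, 0, freqs.size - 1)  # guard against rounding in cdf
    return freqs[idx]


def wf_distance(x, y, fs=1.0, n_quantiles=2000):
    """Order-2 Wasserstein distance between the NPSDs of x and y.

    Uses the 1D identity W_2^2 = integral over (0, 1) of
    |F^{-1}(u) - G^{-1}(u)|^2 du, approximated on a midpoint grid.
    """
    u = (np.arange(n_quantiles) + 0.5) / n_quantiles
    fx, px = npsd(x, fs)
    fy, py = npsd(y, fs)
    qx = quantile_function(fx, px, u)
    qy = quantile_function(fy, py, u)
    return float(np.sqrt(np.mean((qx - qy) ** 2)))


# Usage: two pure tones sampled at 100 Hz for 10 seconds.
fs = 100.0
t = np.arange(1000) / fs
tone_5hz = np.sin(2 * np.pi * 5.0 * t)
tone_8hz = np.sin(2 * np.pi * 8.0 * t)
d = wf_distance(tone_5hz, tone_8hz, fs=fs)
```

For these two tones each NPSD is concentrated on a single frequency bin, so the distance reduces to the frequency gap between them (about 3 Hz), which matches the reading of WF as the displacement of energy across frequencies. A smoother spectral estimator (for example Welch's method) could be substituted for the periodogram without changing the rest of the sketch.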