A time series is a sequence of real values ordered in time. Time series classification (TSC) is the task of assigning a time series to one of a set of predefined classes, usually based on a model learned from examples. Dictionary-based methods for TSC count the frequency of certain patterns in time series and are important components of the most accurate TSC ensembles currently available. One of the early dictionary-based methods was WEASEL, which at the time of its publication achieved state-of-the-art (SotA) results while also being very fast. However, it has since been outperformed by other methods in terms of both speed and accuracy. Furthermore, its design leads to an unpredictably large memory footprint, making it unsuitable for many applications. In this paper, we present WEASEL 2.0, a complete overhaul of WEASEL based on two recent advancements in TSC: dilation and ensembling of randomized hyper-parameter settings. These two techniques allow WEASEL 2.0 to work with a fixed-size memory footprint while at the same time improving accuracy. Compared to 15 other SotA methods on the UCR benchmark set, WEASEL 2.0 is significantly more accurate than other dictionary methods and not significantly worse than the currently best methods. In fact, it achieves the highest median accuracy over all data sets, and it performs best in 5 out of 12 problem classes. We thus believe that WEASEL 2.0 is a viable alternative for current TSC and also a potentially interesting input for future ensembles.
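To illustrate the two ideas named above, pattern counting and dilation, the following is a minimal sketch and not the WEASEL 2.0 implementation. The function names (`dilated_windows`, `to_word`, `bag_of_patterns`) are hypothetical, and the mean-threshold discretization is a toy assumption chosen only to keep the example short; it is not the discretization used by WEASEL or WEASEL 2.0.

```python
from collections import Counter


def dilated_windows(series, window, dilation):
    """Yield subsequences of `window` values taken `dilation` steps apart."""
    # Number of original points that one dilated window spans.
    span = (window - 1) * dilation + 1
    for start in range(len(series) - span + 1):
        yield series[start:start + span:dilation]


def to_word(values):
    """Toy discretization (assumption): 'a' below the window mean, else 'b'."""
    mean = sum(values) / len(values)
    return "".join("a" if v < mean else "b" for v in values)


def bag_of_patterns(series, window=3, dilation=1):
    """Count how often each discretized word occurs in the series."""
    return Counter(to_word(w) for w in dilated_windows(series, window, dilation))


# An alternating series looks different depending on the dilation used.
series = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
print(bag_of_patterns(series, window=3, dilation=1))
print(bag_of_patterns(series, window=3, dilation=2))
```

With dilation 1 the windows capture the alternation between neighbouring values, whereas with dilation 2 every window samples only every second point and the series appears constant. A dictionary-based classifier would feed such word counts, computed under several window and dilation settings, to a downstream model.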