A shapelet is a discriminative subsequence of a time series. An advanced shapelet-based approach embeds shapelets into a random forest, which is accurate and fast. However, this approach has several limitations. First, a random shapelet forest incurs a large training cost for split-threshold searching. Second, a single shapelet provides limited information for only one branch of the decision tree, resulting in insufficient accuracy and interpretability. Third, the randomized ensemble reduces interpretability. To address these limitations, this paper presents Random Pairwise Shapelets Forest (RPSF). RPSF combines a pair of shapelets from different classes to construct a random forest. It omits threshold searching, which makes it more efficient, and includes more information in each node of the forest, which makes it more effective. Moreover, a discriminability metric, Decomposed Mean Decrease Impurity (DMDI), is proposed to identify the influential regions for every class. Extensive experiments show that RPSF improves the accuracy and training speed of shapelet-based forests. Case studies demonstrate the interpretability of our method.
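To illustrate how a node built from a pair of shapelets can avoid threshold searching, the following is a minimal sketch and not the RPSF implementation. It assumes that a node routes a series toward whichever of its two shapelets is closer, using the minimum Euclidean distance over sliding windows without normalization; the abstract does not state the exact split rule or distance measure. The names `min_distance` and `pairwise_split` and the toy data are hypothetical.

```python
import math
from typing import Sequence


def min_distance(series: Sequence[float], shapelet: Sequence[float]) -> float:
    """Smallest Euclidean distance between the shapelet and any
    same-length window of the series (no normalization applied)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    for start in range(len(series) - m + 1):
        d = math.sqrt(sum((series[start + i] - shapelet[i]) ** 2 for i in range(m)))
        best = min(best, d)
    return best


def pairwise_split(
    series: Sequence[float],
    shapelet_a: Sequence[float],
    shapelet_b: Sequence[float],
) -> str:
    """Assumed split rule: go left if the series is at least as close to
    shapelet_a as to shapelet_b, otherwise go right. The two distances are
    compared with each other, so no threshold has to be searched."""
    if min_distance(series, shapelet_a) <= min_distance(series, shapelet_b):
        return "left"
    return "right"


# Hypothetical toy data: one shapelet per class.
bump = [0.0, 1.0, 0.0]   # pattern taken from class A
dip = [0.0, -1.0, 0.0]   # pattern taken from class B

series_with_bump = [0.0, 0.0, 1.0, 0.0, 0.0]
series_with_dip = [0.0, 0.0, -1.0, 0.0, 0.0]

print(pairwise_split(series_with_bump, bump, dip))
print(pairwise_split(series_with_dip, bump, dip))
```

Under this assumed rule, each node carries a pattern for both of its branches, which is one way to read the claim that a pair of shapelets gives every node more information than a single shapelet with a threshold.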