Complex Event Recognition (CER) systems are a prominent technology for finding user-defined query patterns over large data streams in real time. CER query evaluation is known to be computationally challenging, since it requires maintaining a set of partial matches, and this set quickly grows super-linearly in the number of processed events. We present CORE, a novel COmplex event Recognition Engine that focuses on the efficient evaluation of a large class of complex event queries, including time windows as well as the partition-by event correlation operator. This engine uses a novel automaton-based evaluation algorithm that circumvents the super-linear partial match problem: under data complexity, it takes constant time per input event to maintain a data structure that compactly represents the set of partial matches and, once a match is found, the query results may be enumerated from the data structure with output-linear delay. We experimentally compare CORE against state-of-the-art CER systems on real-world data. We show that (1) CORE's performance is stable with respect to both query and time window size, and (2) CORE outperforms the other systems by up to five orders of magnitude on different workloads.
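To make the complexity claim concrete, the following is a minimal sketch of the general idea of a compact partial-match representation. It is an illustration, not CORE's actual algorithm or data structure. It assumes a single hard-coded sequence pattern (an A event, later a B event, later a C event, with arbitrary events skipped in between) and ignores time windows and partition-by. The names `Cell`, `SeqABC`, `process`, and `enumerate_matches` are hypothetical. Each input event adds at most one immutable cell, so the per-event update cost is constant, while the number of matches encoded can grow quadratically.

```python
from typing import Iterator, NamedTuple, Optional, Tuple


class Cell(NamedTuple):
    """Immutable cell: one event position plus links to shared sub-results."""
    pos: int                  # position of the event in the stream
    preds: Optional["Cell"]   # candidates for the previous pattern step
    rest: Optional["Cell"]    # older events at the same pattern step


class SeqABC:
    """Hypothetical evaluator for the fixed pattern: A, then B, then C."""

    def __init__(self) -> None:
        self.a_list: Optional[Cell] = None  # all A events seen so far
        self.b_list: Optional[Cell] = None  # all B events that follow some A
        self.pos = 0

    def process(self, event_type: str) -> Optional[Cell]:
        """Consume one event in O(1); return a match root if a C completes the pattern."""
        i = self.pos
        self.pos += 1
        if event_type == "A":
            self.a_list = Cell(i, None, self.a_list)
        elif event_type == "B" and self.a_list is not None:
            # One new cell shares the whole current A list; nothing is copied.
            self.b_list = Cell(i, self.a_list, self.b_list)
        elif event_type == "C" and self.b_list is not None:
            return Cell(i, self.b_list, None)
        return None


def enumerate_matches(root: Cell) -> Iterator[Tuple[int, int, int]]:
    """Yield every (a, b, c) position triple encoded under a match root.

    Every B cell has at least one A predecessor, so no traversal step is
    wasted and each triple is produced after a bounded amount of work.
    """
    b = root.preds
    while b is not None:
        a = b.preds
        while a is not None:
            yield (a.pos, b.pos, root.pos)
            a = a.rest
        b = b.rest
```

For example, feeding the event types `A, A, B, B, C` one at a time to `SeqABC.process` returns a match root only on the final `C`, and `enumerate_matches` on that root yields the four triples pairing each A with each B. A naive evaluator would instead store each of those partial matches explicitly as the events arrive.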