Saliency Map-Guided Knowledge Discovery for Subclass Identification with LLM-Based Symbolic Approximations

This paper proposes a novel neuro-symbolic approach for sensor signal-based knowledge discovery, focusing on identifying latent subclasses in time series classification tasks. The approach leverages gradient-based saliency maps derived from trained neural networks to guide the discovery process. Multiclass time series classification problems are transformed into binary classification problems through label subsumption, and a classifier is trained for each binary problem to yield saliency maps. The input signals, grouped by predicted class, are clustered under three distinct configurations. The centroids of the final set of clusters are provided as input to an LLM for symbolic approximation and fuzzy knowledge graph matching to discover the underlying subclasses of the original multiclass problem. Experimental results on well-established time series classification datasets demonstrate the effectiveness of the saliency map-driven method for knowledge discovery: it outperforms signal-only baselines in both clustering and subclass identification.
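The first stages of the pipeline described above (label subsumption, a binary classifier, gradient saliency maps, clustering within a predicted class) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the synthetic three-class signals, the one-hidden-layer network, the choice to cluster on saliency maps with plain k-means, and all function names (`subsume_labels`, `train_mlp`, `saliency`, `kmeans`). The paper's three clustering configurations and the LLM-based symbolic approximation step are not specified in the abstract and are not reproduced here.

```python
import numpy as np

def subsume_labels(y, positive_classes):
    """Collapse multiclass labels into binary ones (label subsumption)."""
    return np.isin(y, list(positive_classes)).astype(float)

def train_mlp(X, t, hidden=8, lr=0.5, epochs=500, seed=0):
    """Train a one-hidden-layer tanh network with sigmoid output by gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.5, (X.shape[1], hidden)); b1 = np.zeros(hidden)
    w2 = rng.normal(0, 0.5, hidden); b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)
        p = 1 / (1 + np.exp(-(H @ w2 + b2)))
        d = (p - t) / len(t)                  # d(mean cross-entropy) / d(logit)
        dH = np.outer(d, w2) * (1 - H ** 2)   # backprop through tanh
        W1 -= lr * X.T @ dH; b1 -= lr * dH.sum(0)
        w2 -= lr * H.T @ d;  b2 -= lr * d.sum()
    return W1, b1, w2, b2

def predict_proba(X, params):
    W1, b1, w2, b2 = params
    return 1 / (1 + np.exp(-(np.tanh(X @ W1 + b1) @ w2 + b2)))

def saliency(X, params):
    """Plain gradient saliency: |d logit / d x| for every input sample."""
    W1, b1, w2, _ = params
    H = np.tanh(X @ W1 + b1)
    return np.abs(((1 - H ** 2) * w2) @ W1.T)

def kmeans(Z, k, iters=20, seed=0):
    """Basic Lloyd's k-means; returns the cluster assignment of each row of Z."""
    rng = np.random.default_rng(seed)
    C = Z[rng.choice(len(Z), k, replace=False)]
    for _ in range(iters):
        a = np.argmin(((Z[:, None] - C[None]) ** 2).sum(-1), axis=1)
        C = np.array([Z[a == j].mean(0) if np.any(a == j) else C[j] for j in range(k)])
    return a

# Synthetic data (assumption): class 0 is noise, classes 1 and 2 carry a bump
# at different positions and play the role of hidden subclasses.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 30)
X = rng.normal(0, 0.3, (90, 16))
X[y == 1, 2:6] += 2.0
X[y == 2, 10:14] += 2.0

t = subsume_labels(y, {1, 2})          # binary problem: {1, 2} vs. 0
params = train_mlp(X, t)
S = saliency(X, params)                # one saliency map per input signal
pred = predict_proba(X, params) > 0.5  # group inputs by predicted class

# Cluster the predicted-positive inputs on their saliency maps, then average
# the raw signals per cluster to obtain signal-space centroids.
assign = kmeans(S[pred], k=2)
centroids = np.array([X[pred][assign == j].mean(0) for j in range(2)])
```

In this sketch, `centroids` stands in for what would be handed to the downstream symbolic approximation step. Whether the two clusters line up with the hidden classes 1 and 2 depends on the data and the trained network, so the sketch makes no claim about that.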
