Topic modeling algorithms traditionally represent topics as lists of weighted terms. Such topic models can be used effectively to classify texts or to support text mining tasks such as text summarization or fact extraction. The general procedure relies on statistical analysis of term frequencies. This work focuses on the implementation of knowledge-based topic modeling services in a KNIME workflow. Building on our previous work, we briefly describe and evaluate the DBpedia-based enrichment approach and outline a comparative evaluation of the enriched topic models. DBpedia Spotlight is used to identify entities in the input text, and information from DBpedia is used to extend these entities. We provide a KNIME workflow that implements this approach and compare the results of topic modeling supported by knowledge base information with those of traditional LDA. The knowledge-based topic modeling approach allows semantic interpretation both by algorithms and by humans.