The Wikipedia category graph serves as the taxonomic backbone for large-scale knowledge graphs such as YAGO and Probase, and has been used extensively for tasks such as entity disambiguation and semantic similarity estimation. Wikipedia's categories are a rich source of taxonomic as well as non-taxonomic information. The category 'German science fiction writers', for example, encodes the type of its resources (Writer), as well as their nationality (German) and genre (Science Fiction). Several approaches in the literature make use of parts of this encoded information without exploiting its full potential. In this paper, we introduce an approach for the discovery of category axioms that uses information from the category network, category instances, and their lexicalisations. With DBpedia as background knowledge, we discover 703k axioms covering 502k of Wikipedia's categories and populate the DBpedia knowledge graph with an additional 4.4M relation assertions and 3.3M type assertions at more than 87% and 90% precision, respectively.
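To make the idea of a category axiom concrete, the following is a minimal sketch and not the method of the paper. It assumes a simple heuristic: a (property, value) pair becomes a candidate axiom for a category when the value's label occurs in the category name and enough of the category's instances already carry that pair in the background knowledge. The function `candidate_axioms`, the `min_support` threshold, the `ex:` instance identifiers, and the toy facts are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def candidate_axioms(category_name, instance_facts, labels, min_support=0.5):
    """Return (property, value, support) triples for one category.

    Illustrative heuristic only (an assumption, not the paper's algorithm):
    keep a pair if the value's label appears in the category name and at
    least `min_support` of the instances already have that pair.
    """
    total = len(instance_facts)
    if total == 0:
        return []

    name = category_name.lower()
    counts = Counter()
    for facts in instance_facts.values():
        # Count each (property, value) pair at most once per instance.
        counts.update(set(facts))

    axioms = []
    for (prop, value), n in counts.items():
        support = n / total
        label = labels.get(value, "").lower()
        if label and label in name and support >= min_support:
            axioms.append((prop, value, support))

    # Highest support first, then alphabetical for a stable order.
    return sorted(axioms, key=lambda a: (-a[2], a[0], a[1]))


# Hypothetical toy data: labels (lexicalisations) and known instance facts.
labels = {
    "dbr:Germany": "German",
    "dbr:France": "French",
    "dbr:Science_fiction": "science fiction",
    "dbo:Writer": "writer",
}
instance_facts = {
    "ex:Author_A": [("dbo:nationality", "dbr:Germany"),
                    ("dbo:genre", "dbr:Science_fiction"),
                    ("rdf:type", "dbo:Writer")],
    "ex:Author_B": [("dbo:nationality", "dbr:Germany"),
                    ("rdf:type", "dbo:Writer")],
    "ex:Author_C": [("dbo:nationality", "dbr:France"),
                    ("dbo:genre", "dbr:Science_fiction")],
}

axioms = candidate_axioms("German science fiction writers", instance_facts, labels)
for prop, value, support in axioms:
    print(f"{prop} {value} (support {support:.2f})")
```

In this toy setting, an accepted axiom could then be applied to every instance of the category, which is how missing assertions (for example, a nationality for `ex:Author_C`) would be proposed. Whether such a proposal is correct depends on the quality of the heuristic, which is why precision is reported separately for relation and type assertions.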