Sentence-level relation extraction (RE) aims to identify the relationship between two entities in a sentence. Much effort has been devoted to this problem, yet the best-performing methods are still far from perfect. In this paper, we revisit two problems that affect the performance of existing RE models: entity representation and noisy or ill-defined labels. Our improved baseline model, which incorporates entity representations with typed markers, achieves an F1 of 74.6% on TACRED, significantly outperforming previous state-of-the-art (SOTA) methods. Furthermore, the new baseline achieves an F1 of 91.1% on the refined Re-TACRED dataset, demonstrating that pre-trained language models achieve unexpectedly high performance on this task. We release our code to the community for future research.
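To make the idea of typed markers concrete, the following is a minimal sketch of how a sentence might be prepared before it is passed to a pre-trained language model. It assumes a pre-tokenized sentence and half-open token spans for the two entities. The function name `add_typed_markers` and the marker strings (`<SUBJ:TYPE>`, `<OBJ:TYPE>`) are hypothetical choices made for illustration; the abstract does not specify the exact marker format used by the model.

```python
from typing import List, Tuple

# Half-open token span: (start index inclusive, end index exclusive).
Span = Tuple[int, int]


def add_typed_markers(
    tokens: List[str],
    subj_span: Span,
    subj_type: str,
    obj_span: Span,
    obj_type: str,
) -> List[str]:
    """Wrap the subject and object spans in markers that carry their entity types.

    The marker strings are illustrative only, not the format from the paper.
    """
    (s_start, s_end), (o_start, o_end) = subj_span, obj_span

    # Reject empty or out-of-range spans.
    for start, end in (subj_span, obj_span):
        if not 0 <= start < end <= len(tokens):
            raise ValueError(f"invalid span: ({start}, {end})")

    # Reject overlapping spans, which this simple sketch does not handle.
    if not (s_end <= o_start or o_end <= s_start):
        raise ValueError("subject and object spans must not overlap")

    marked: List[str] = []
    for i, token in enumerate(tokens):
        # Opening markers go before the first token of each entity.
        if i == s_start:
            marked.append(f"<SUBJ:{subj_type}>")
        if i == o_start:
            marked.append(f"<OBJ:{obj_type}>")
        marked.append(token)
        # Closing markers go after the last token of each entity.
        if i == s_end - 1:
            marked.append(f"</SUBJ:{subj_type}>")
        if i == o_end - 1:
            marked.append(f"</OBJ:{obj_type}>")
    return marked


if __name__ == "__main__":
    sentence = "Bill Gates founded Microsoft".split()
    print(" ".join(add_typed_markers(sentence, (0, 2), "PERSON", (3, 4), "ORGANIZATION")))
```

In a setup like this, the marked sequence would then be encoded by the language model, and the representations at the marker positions (or of the marked spans) would feed the relation classifier. That downstream step is omitted here because the abstract does not describe it.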