Argument mining is generally performed at the sentence level: an entire sentence, not a part of it, is assumed to correspond to an argument. In this paper, we introduce the new task of Argument unit Recognition and Classification (ARC). In ARC, an argument is generally a part of a sentence, which is a more realistic assumption, since several different arguments can occur in one sentence and longer sentences often contain a mix of argumentative and non-argumentative parts. Recognizing and classifying the spans that correspond to arguments makes ARC harder than previously defined argument mining tasks. We release ARC-8, a new benchmark for evaluating the ARC task. We show that token-level annotations for argument units can be gathered using scalable methods. ARC-8 contains 25% more arguments than a dataset annotated at the sentence level would. We cast ARC as a sequence labeling task, develop a number of methods for ARC sequence tagging, and establish the state of the art for ARC-8. A focus of our work is robustness, both against errors in sentence identification (which are frequent for noisy text) and against divergence between training and test data.
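
To make the sequence labeling formulation concrete, the following minimal sketch shows one way span-level argument units could be mapped to per-token labels and back. The label names (`PRO`, `CON`, `NON`), the span format, the helper functions, and the example sentence are all illustrative assumptions; the abstract does not specify them.

```python
from typing import List, Tuple

# Hypothetical span format (an assumption, not taken from the abstract):
# (start_token, end_token_exclusive, stance_label)
Span = Tuple[int, int, str]


def spans_to_token_labels(
    tokens: List[str], spans: List[Span], outside: str = "NON"
) -> List[str]:
    """Turn argument-unit spans into one label per token."""
    labels = [outside] * len(tokens)
    for start, end, stance in spans:
        if not (0 <= start < end <= len(tokens)):
            raise ValueError(f"span ({start}, {end}) is out of range")
        for i in range(start, end):
            if labels[i] != outside:
                raise ValueError(f"token {i} is covered by two spans")
            labels[i] = stance
    return labels


def token_labels_to_spans(labels: List[str], outside: str = "NON") -> List[Span]:
    """Recover spans as maximal runs of identical non-outside labels.

    Adjacent units with the same label are merged into a single span,
    a known limitation of this simple labeling scheme.
    """
    spans: List[Span] = []
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of the sequence or on a label change.
        if i == len(labels) or labels[i] != labels[start]:
            if labels[start] != outside:
                spans.append((start, i, labels[start]))
            start = i
    return spans


if __name__ == "__main__":
    # Invented example sentence with two argument units and one outside token.
    tokens = "Nuclear power is clean but the waste stays dangerous for centuries".split()
    labels = spans_to_token_labels(tokens, [(0, 4, "PRO"), (5, 11, "CON")])
    for token, label in zip(tokens, labels):
        print(f"{token}\t{label}")
```

A single sentence here carries two argument units plus a non-argumentative token, which a sentence-level scheme would have to collapse into one label.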