Conditional Random Field (CRF) based neural models are among the most performant methods for solving sequence labeling problems. Despite their success, CRFs have the shortcoming of occasionally generating illegal tag sequences, e.g., sequences containing an "I-" tag immediately after an "O" tag, which the underlying BIO tagging scheme forbids. In this work, we propose the Masked Conditional Random Field (MCRF), an easy-to-implement variant of CRF that imposes restrictions on candidate paths during both the training and decoding phases. We show that the proposed method fully resolves this issue and brings consistent improvement over existing CRF-based models at near-zero additional cost.
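The path restriction described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the restriction is realized by overwriting the scores of illegal BIO transitions (and illegal start tags) with a large negative constant before Viterbi decoding. The names `is_legal`, `mask_transitions`, `viterbi`, and the constant `NEG_INF` are hypothetical, and the scores are toy values. The training-time likelihood, which would use the same masked scores, is omitted.

```python
NEG_INF = -1e4  # assumed stand-in score for a forbidden transition


def is_legal(prev_tag, next_tag):
    """BIO rule: I-X may only follow B-X or I-X of the same entity type X."""
    if not next_tag.startswith("I-"):
        return True
    entity = next_tag[2:]
    return prev_tag in ("B-" + entity, "I-" + entity)


def mask_transitions(tags, transitions):
    """Copy the transition matrix, replacing illegal entries with NEG_INF."""
    n = len(tags)
    return [
        [transitions[i][j] if is_legal(tags[i], tags[j]) else NEG_INF
         for j in range(n)]
        for i in range(n)
    ]


def viterbi(tags, emissions, transitions, start):
    """Return the highest-scoring tag path.

    emissions[t][j] is the score of tag j at position t,
    transitions[i][j] the score of moving from tag i to tag j,
    start[j] the score of beginning the sequence with tag j.
    """
    n = len(tags)
    score = [start[j] + emissions[0][j] for j in range(n)]
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final tag.
    path = [max(range(n), key=lambda j: score[j])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return [tags[j] for j in path]


tags = ["O", "B-PER", "I-PER"]
emissions = [[2.0, 0.0, 0.0], [0.0, 0.5, 2.0]]  # toy scores for two tokens
plain_transitions = [[0.0] * 3 for _ in range(3)]
plain_start = [0.0, 0.0, 0.0]

# Masked variants: illegal transitions and an I- tag at the start are forbidden.
masked_transitions = mask_transitions(tags, plain_transitions)
masked_start = [NEG_INF if tag.startswith("I-") else 0.0 for tag in tags]

unmasked_path = viterbi(tags, emissions, plain_transitions, plain_start)
masked_path = viterbi(tags, emissions, masked_transitions, masked_start)
print(unmasked_path)  # ['O', 'I-PER'], an illegal sequence
print(masked_path)    # ['O', 'B-PER'], legal under BIO
```

With these toy scores, the unmasked decoder picks "I-PER" directly after "O", while the masked decoder falls back to the best legal path. The only extra work is building the mask once, which is consistent with the near-zero additional cost claimed above.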