Process graph extraction (PGE) is a recently emerged interdisciplinary task between natural language processing and business process management that aims to extract the process graphs described in text. Previous process extractors depend heavily on manually engineered features and ignore the potential relations between clues at different text granularities. In this paper, we formalize the PGE task as a multi-granularity text classification problem and propose a hierarchical model that models and extracts multi-granularity information without manually defined procedural knowledge. Within this framework, we propose a coarse-to-fine learning mechanism that trains the multi-granularity tasks in coarse-to-fine order, so that high-level knowledge is shared with the low-level tasks. To evaluate our approach, we construct two finer-grained datasets from two sentence-level corpora and conduct extensive experiments along several dimensions. The experimental results show that our approach outperforms state-of-the-art methods with statistical significance, and ablation studies further demonstrate its effectiveness.