Extracting and using diverse knowledge from large numbers of documents is a pressing challenge in intelligent information retrieval. Documents contain many kinds of elements, tables among them, that require different recognition methods. Table recognition typically consists of three subtasks: table structure recognition, cell position recognition, and cell content recognition. Recent models have achieved excellent recognition performance by combining multi-task learning, local attention, and mutual learning. However, the reasons for their effectiveness have not been fully explained, and their inference is slow. This paper presents a novel multi-task model that uses non-causal attention to capture the entire table structure, together with a parallel inference algorithm for faster cell content inference. The superiority of the proposed model is demonstrated both visually and statistically on two large public datasets.
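To make the term "non-causal attention" concrete, the following is a minimal sketch of how a causal mask differs from a non-causal one. It is not the paper's model: the function name `attention_weights`, the toy token sequence, and the uniform scores are all assumptions chosen for illustration. It only shows that, without a causal mask, every structure token can attend to the whole sequence rather than to earlier tokens alone.

```python
import math

def attention_weights(scores, causal):
    """Row-wise softmax over raw attention scores, optionally causally masked."""
    n = len(scores)
    weights = []
    for i in range(n):
        # Under a causal mask, position i may only attend to positions j <= i.
        visible = [j for j in range(n) if not causal or j <= i]
        row_max = max(scores[i][j] for j in visible)
        exps = {j: math.exp(scores[i][j] - row_max) for j in visible}
        total = sum(exps.values())
        # Masked positions receive zero weight.
        weights.append([exps.get(j, 0.0) / total for j in range(n)])
    return weights

# Hypothetical structure tokens for a one-row, two-cell table (assumed, not from the paper).
tokens = ["<tr>", "<td>", "</td>", "<td>", "</td>", "</tr>"]
# Uniform scores keep the example easy to check by hand.
scores = [[0.0] * len(tokens) for _ in tokens]

causal = attention_weights(scores, causal=True)
non_causal = attention_weights(scores, causal=False)
```

With the causal mask, the first token can see only itself, so `causal[0]` places all of its weight on `<tr>`. Without the mask, `non_causal[0]` spreads its weight evenly over all six tokens, including the closing `</tr>`, which is the sense in which non-causal attention can take the entire table structure into account.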