争取在现实世界中大力提取视觉信息:新数据集和新解决方案 (Towards Robust Visual Information Extraction in Real World: New Dataset and Novel Solution) - 专知论文

会员服务 ·

0

INFORMS · 信息抽取 · 稳健性 · 数据集 · CLUES ·

2021 年 1 月 24 日

Towards Robust Visual Information Extraction in Real World: New Dataset and Novel Solution

翻译：争取在现实世界中大力提取视觉信息:新数据集和新解决方案

Jiapeng Wang,Chongyu Liu,Lianwen Jin,Guozhi Tang,Jiaxin Zhang,Shuaitao Zhang,Qianying Wang,Yaqiang Wu,Mingxiang Cai

from arxiv, 8 pages, 5 figures, to be published in AAAI 2021

Visual information extraction (VIE) has attracted considerable attention recently owing to its various advanced applications such as document understanding, automatic marking and intelligent education. Most existing works decoupled this problem into several independent sub-tasks of text spotting (text detection and recognition) and information extraction, which completely ignored the high correlation among them during optimization. In this paper, we propose a robust visual information extraction system (VIES) towards real-world scenarios, which is a unified end-to-end trainable framework for simultaneous text detection, recognition and information extraction by taking a single document image as input and outputting the structured information. Specifically, the information extraction branch collects abundant visual and semantic representations from text spotting for multimodal feature fusion and conversely, provides higher-level semantic clues to contribute to the optimization of text spotting. Moreover, regarding the shortage of public benchmarks, we construct a fully-annotated dataset called EPHOIE (https://github.com/HCIILAB/EPHOIE), which is the first Chinese benchmark for both text spotting and visual information extraction. EPHOIE consists of 1,494 images of examination paper head with complex layouts and background, including a total of 15,771 Chinese handwritten or printed text instances. Compared with the state-of-the-art methods, our VIES shows significant superior performance on the EPHOIE dataset and achieves a 9.01% F-score gain on the widely used SROIE dataset under the end-to-end scenario.

翻译：最近,由于文件理解、自动标识和智能教育等各种先进应用,视觉信息提取(VIE)最近引起了相当的关注。大多数现有作品将这一问题分解成数个独立的文本检测(文本检测和识别)和信息提取子任务,在优化过程中完全忽视了它们之间的高度相关性。在本文件中,我们建议为现实世界情景建立一个强大的视觉信息提取系统(VIES),这是用于同时检测、识别和信息提取的统一端到端的可培训框架,其方法是将单一的文件图像作为输入和输出结构化信息。具体地说,信息提取分支收集了从多式特征聚合和反向的文本检测(文本检测和识别)和信息提取的多个独立的子任务。此外,关于公共基准的短缺,我们建了一个称为EPHOIE的全称附加说明的数据集(https://github.com/HCIILAB/EPHOIE),这是中国文本定位和视觉信息提取的首个基准。EPIEOIEIE(包括1 494的高级图像),在中国纸面图纸面图纸面图中,包括了我们使用的纸面图纸面图的印刷图。

10

相关内容

INFORMS

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

【AAAI2021】面向真实世界的鲁棒视觉信息提取:新的数据集和新颖的解决方案

【AAAI2021】面向真实世界的鲁棒视觉信息提取:新的数据集和新颖的解决方案

专知会员服务

21+阅读 · 2021年2月17日

【AAAI2021】对比聚类，Contrastive Clustering

【AAAI2021】对比聚类，Contrastive Clustering

专知会员服务

76+阅读 · 2021年1月30日

【AAAI2021】信息瓶颈和有监督表征解耦

【AAAI2021】信息瓶颈和有监督表征解耦

专知会员服务

20+阅读 · 2021年1月27日

【KDD2020】图模型信息融合

专知会员服务

36+阅读 · 2020年10月15日

【IJCAJ 2019】多视角知识图谱嵌入的实体对齐，Multi-view Knowledge Graph Embedding for Entity Alignment

【IJCAJ 2019】多视角知识图谱嵌入的实体对齐，Multi-view Knowledge Graph Embedding for Entity Alignment

专知会员服务

58+阅读 · 2020年6月30日

【ACL2020】Span-ConveRT：预训练对话表示小样本跨度提取，Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations

【ACL2020】Span-ConveRT：预训练对话表示小样本跨度提取，Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations

专知会员服务

16+阅读 · 2020年5月19日

【清华大学-腾讯】关系提取综述，Review and Outlook for Relation Extraction

【清华大学-腾讯】关系提取综述，Review and Outlook for Relation Extraction

专知会员服务

37+阅读 · 2020年4月8日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

53+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

77+阅读 · 2019年10月10日

【IJCAI 2019 | tutorial】文本生成中的艺术字 Creative and Artistic Writing via Text Generation，北京大学|严睿

【IJCAI 2019 | tutorial】文本生成中的艺术字 Creative and Artistic Writing via Text Generation，北京大学|严睿

专知会员服务

15+阅读 · 2019年8月12日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

25+阅读 · 2019年5月18日

计算机 | 中低难度国际会议信息6条

计算机 | 中低难度国际会议信息6条

Call4Papers

7+阅读 · 2019年5月16日

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

PaperWeekly

7+阅读 · 2019年5月5日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

14+阅读 · 2019年4月13日

论文浅尝 | Global Relation Embedding for Relation Extraction

论文浅尝 | Global Relation Embedding for Relation Extraction

开放知识图谱

12+阅读 · 2019年3月3日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

41+阅读 · 2019年1月3日

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

专知

18+阅读 · 2018年9月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【泡泡一分钟】基于视觉传感器的三维空间几何重建（3dv-16）

【泡泡一分钟】基于视觉传感器的三维空间几何重建（3dv-16）

泡泡机器人SLAM

4+阅读 · 2017年12月18日

A Question-answering Based Framework for Relation Extraction Validation

Arxiv

0+阅读 · 2021年4月7日

Deep Neural Network Based Relation Extraction: An Overview

Arxiv

14+阅读 · 2021年1月6日

GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering

Arxiv

3+阅读 · 2019年5月10日

Jointly Multiple Events Extraction via Attention-based Graph Information Aggregation

Jointly Multiple Events Extraction via Attention-based Graph Information Aggregation

Arxiv

5+阅读 · 2018年9月24日

Rapid Customization for Event Extraction

Rapid Customization for Event Extraction

Arxiv

7+阅读 · 2018年9月20日

R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering

Arxiv

7+阅读 · 2018年5月24日

QA4IE: A Question Answering based Framework for Information Extraction

Arxiv

4+阅读 · 2018年4月10日

Long-Term Visual Object Tracking Benchmark

Arxiv

3+阅读 · 2018年3月22日

Saliency-Enhanced Robust Visual Tracking

Arxiv

6+阅读 · 2018年2月8日

Detecting Curve Text in the Wild: New Dataset and New Solution

Arxiv

4+阅读 · 2017年12月6日

VIP会员

文章信息

相关主题

相关VIP内容

【AAAI2021】面向真实世界的鲁棒视觉信息提取:新的数据集和新颖的解决方案

【AAAI2021】面向真实世界的鲁棒视觉信息提取:新的数据集和新颖的解决方案

专知会员服务

21+阅读 · 2021年2月17日

【AAAI2021】对比聚类，Contrastive Clustering

【AAAI2021】对比聚类，Contrastive Clustering

专知会员服务

76+阅读 · 2021年1月30日

【AAAI2021】信息瓶颈和有监督表征解耦

【AAAI2021】信息瓶颈和有监督表征解耦

专知会员服务

20+阅读 · 2021年1月27日

【KDD2020】图模型信息融合

专知会员服务

36+阅读 · 2020年10月15日

【IJCAJ 2019】多视角知识图谱嵌入的实体对齐，Multi-view Knowledge Graph Embedding for Entity Alignment

【IJCAJ 2019】多视角知识图谱嵌入的实体对齐，Multi-view Knowledge Graph Embedding for Entity Alignment

专知会员服务

58+阅读 · 2020年6月30日

【ACL2020】Span-ConveRT：预训练对话表示小样本跨度提取，Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations

【ACL2020】Span-ConveRT：预训练对话表示小样本跨度提取，Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations

专知会员服务

16+阅读 · 2020年5月19日

【清华大学-腾讯】关系提取综述，Review and Outlook for Relation Extraction

【清华大学-腾讯】关系提取综述，Review and Outlook for Relation Extraction

专知会员服务

37+阅读 · 2020年4月8日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

53+阅读 · 2019年10月17日

[综述]深度学习下的场景文本检测与识别

[综述]深度学习下的场景文本检测与识别

专知会员服务

77+阅读 · 2019年10月10日

【IJCAI 2019 | tutorial】文本生成中的艺术字 Creative and Artistic Writing via Text Generation，北京大学|严睿

【IJCAI 2019 | tutorial】文本生成中的艺术字 Creative and Artistic Writing via Text Generation，北京大学|严睿

专知会员服务

15+阅读 · 2019年8月12日

热门VIP内容

相关资讯

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

25+阅读 · 2019年5月18日

计算机 | 中低难度国际会议信息6条

计算机 | 中低难度国际会议信息6条

Call4Papers

7+阅读 · 2019年5月16日

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

无人机视觉挑战赛 | ICCV 2019 Workshop—VisDrone2019

PaperWeekly

7+阅读 · 2019年5月5日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

14+阅读 · 2019年4月13日

论文浅尝 | Global Relation Embedding for Relation Extraction

论文浅尝 | Global Relation Embedding for Relation Extraction

开放知识图谱

12+阅读 · 2019年3月3日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

41+阅读 · 2019年1月3日

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

【跟踪Tracking】15篇论文+代码 | 中秋快乐~

专知

18+阅读 · 2018年9月24日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【泡泡一分钟】基于视觉传感器的三维空间几何重建（3dv-16）

【泡泡一分钟】基于视觉传感器的三维空间几何重建（3dv-16）

泡泡机器人SLAM

4+阅读 · 2017年12月18日

相关论文

A Question-answering Based Framework for Relation Extraction Validation

Arxiv

0+阅读 · 2021年4月7日

Deep Neural Network Based Relation Extraction: An Overview

Arxiv

14+阅读 · 2021年1月6日

GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering

Arxiv

3+阅读 · 2019年5月10日

Jointly Multiple Events Extraction via Attention-based Graph Information Aggregation

Jointly Multiple Events Extraction via Attention-based Graph Information Aggregation

Arxiv

5+阅读 · 2018年9月24日

Rapid Customization for Event Extraction

Rapid Customization for Event Extraction

Arxiv

7+阅读 · 2018年9月20日

R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering

Arxiv

7+阅读 · 2018年5月24日

QA4IE: A Question Answering based Framework for Information Extraction

Arxiv

4+阅读 · 2018年4月10日

Long-Term Visual Object Tracking Benchmark

Arxiv

3+阅读 · 2018年3月22日

Saliency-Enhanced Robust Visual Tracking

Arxiv

6+阅读 · 2018年2月8日

Detecting Curve Text in the Wild: New Dataset and New Solution

Arxiv

4+阅读 · 2017年12月6日

微信扫码咨询专知VIP会员