This paper proposes a new text recognition network for scene-text images. Many state-of-the-art methods employ an attention mechanism in either the text encoder or the decoder for text alignment. Although encoder-based attention yields promising results, these schemes share noticeable limitations: they perform feature extraction (FE) and visual attention (VA) sequentially, which restricts the attention mechanism to the final, single-scale output of the FE stage. Moreover, attention is applied directly to single-scale feature maps only. To address these issues, we propose a new multi-scale, encoder-based attention network for text recognition that performs multi-scale FE and VA in parallel. The multi-scale channels are also fused with one another at regular intervals so that the scales develop a coordinated representation together. Quantitative evaluation and robustness analysis on standard benchmarks demonstrate that the proposed network outperforms the state of the art in most cases.
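The core idea of running visual attention on several scales in parallel and fusing the scales regularly can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the paper's actual architecture: the functions `spatial_attention`, `downsample`, `upsample`, and `multiscale_step` are hypothetical names, and the real network would use learned convolutions and attention weights rather than these fixed operations.

```python
# Minimal sketch of parallel multi-scale attention with cross-scale fusion.
# All layer definitions here are illustrative assumptions, not the paper's.
import numpy as np

def spatial_attention(x):
    """Softmax attention over spatial positions of an (H, W, C) feature map."""
    scores = x.mean(axis=-1)                  # (H, W): pool across channels
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return x * weights[..., None]             # re-weight each spatial position

def downsample(x):
    """2x2 average pooling to produce a coarser scale."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def upsample(x):
    """Nearest-neighbour upsampling back to the finer resolution."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def multiscale_step(fine):
    """One block: attend at two scales in parallel, then fuse them."""
    coarse = downsample(fine)
    fine_att = spatial_attention(fine)        # VA on the fine scale
    coarse_att = spatial_attention(coarse)    # VA on the coarse scale, in parallel
    # Fusion: bring the coarse branch back to the fine resolution and average,
    # so the scales exchange information inside every block, not only at the end.
    return 0.5 * (fine_att + upsample(coarse_att))

feat = np.random.rand(8, 32, 16)              # toy (H, W, C) feature map
out = multiscale_step(feat)
print(out.shape)                              # (8, 32, 16)
```

Stacking several such blocks is what distinguishes this scheme from sequential FE-then-VA pipelines: attention is no longer confined to one final single-scale output.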