Training a model for grammatical error correction (GEC) requires a set of labeled ungrammatical / grammatical sentence pairs, but manually annotating such pairs can be expensive. Recently, the Break-It-Fix-It (BIFI) framework has demonstrated strong results on learning to repair a broken program without any labeled examples, but this relies on a perfect critic (e.g., a compiler) that returns whether an example is valid or not, which does not exist for the GEC task. In this work, we show how to leverage a pretrained language model (LM) in defining an LM-Critic, which judges a sentence to be grammatical if the LM assigns it a higher probability than its local perturbations. We apply this LM-Critic and BIFI along with a large set of unlabeled sentences to bootstrap realistic ungrammatical / grammatical pairs for training a corrector. We evaluate our approach on GEC datasets across multiple domains (CoNLL-2014, BEA-2019, GMEG-wiki and GMEG-yahoo) and show that it outperforms existing methods in both the unsupervised setting (+7.7 F0.5) and the supervised setting (+0.5 F0.5).
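The LM-Critic decision rule described above can be sketched in a few lines. The sketch below is a toy illustration, not the paper's implementation: `toy_log_prob` is a hypothetical stand-in for a pretrained LM (scoring sentences by counting known-good word bigrams), and the perturbation set is limited to single word deletions and adjacent swaps.

```python
# Toy sketch of the LM-Critic rule: a sentence is judged grammatical iff the
# LM assigns it a higher score than every one of its local perturbations.

# Hypothetical "known-good" bigrams standing in for LM knowledge.
GOOD_BIGRAMS = {("the", "cat"), ("cat", "sat"), ("sat", "on"),
                ("on", "the"), ("the", "mat")}

def toy_log_prob(sentence):
    """Stand-in LM score: +1 per known-good bigram, -1 per unknown bigram."""
    words = sentence.split()
    return sum(1 if b in GOOD_BIGRAMS else -1 for b in zip(words, words[1:]))

def local_perturbations(sentence):
    """Word-level neighborhood: single deletions and adjacent swaps."""
    words = sentence.split()
    neighbors = set()
    for i in range(len(words)):                       # delete word i
        neighbors.add(" ".join(words[:i] + words[i + 1:]))
    for i in range(len(words) - 1):                   # swap words i, i+1
        swapped = words[:i] + [words[i + 1], words[i]] + words[i + 2:]
        neighbors.add(" ".join(swapped))
    neighbors.discard(sentence)                       # exclude the input itself
    return neighbors

def lm_critic(sentence, log_prob):
    """Grammatical iff the sentence outscores every local perturbation."""
    score = log_prob(sentence)
    return all(score > log_prob(p) for p in local_perturbations(sentence))
```

With this toy scorer, `lm_critic("the cat sat on the mat", toy_log_prob)` returns `True`, while the locally perturbed `"the cat sat the on mat"` is rejected, since swapping the words back yields a higher-scoring neighbor.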