Adversarial Analysis of Natural Language Inference Systems - 专知论文


December 7, 2019


Tiffany Chien, Jugal Kalita

From arXiv, 8 pages, accepted by IEEE ICSC 2020

The release of large natural language inference (NLI) datasets like SNLI and MNLI has led to rapid development and improvement of completely neural systems for the task. Most recently, heavily pre-trained, Transformer-based models like BERT and MT-DNN have reached near-human performance on these datasets. However, these standard datasets have been shown to contain many annotation artifacts, allowing models to shortcut understanding using simple fallible heuristics and still perform well on the test set. So it is no surprise that many adversarial (challenge) datasets have been created that cause models trained on standard datasets to fail dramatically. Although extra training on this data generally improves model performance on just that type of data, transferring that learning to unseen examples is still partial at best. This work evaluates the failures of state-of-the-art models on existing adversarial datasets that test different linguistic phenomena, and finds that even though the models perform similarly on MNLI, they differ greatly in their robustness to these attacks. In particular, we find syntax-related attacks to be especially effective across all models, so we provide a fine-grained analysis and comparison of model performance on those examples. We draw conclusions about the value of model size and multi-task learning (beyond comparing their standard test set performance), and provide suggestions for more effective training data.


