ER-TEST: Evaluating Explanation Regularization Methods for NLP Models - Zhuanzhi (专知) Papers

May 25, 2022


Brihi Joshi, Aaron Chan, Ziyi Liu, Shaoliang Nie, Maziar Sanjabi, Hamed Firooz, Xiang Ren

From arXiv, 19 pages, 10 figures

Neural language models' (NLMs') reasoning processes are notoriously hard to explain. Recently, there has been much progress in automatically generating machine rationales of NLM behavior, but less in utilizing the rationales to improve NLM behavior. For the latter, explanation regularization (ER) aims to improve NLM generalization by pushing the machine rationales to align with human rationales. Whereas prior works primarily evaluate such ER models via in-distribution (ID) generalization, ER's impact on out-of-distribution (OOD) is largely underexplored. Plus, little is understood about how ER model performance is affected by the choice of ER criteria or by the number/choice of training instances with human rationales. In light of this, we propose ER-TEST, a protocol for evaluating ER models' OOD generalization along three dimensions: (1) unseen datasets, (2) contrast set tests, and (3) functional tests. Using ER-TEST, we study three key questions: (A) Which ER criteria are most effective for the given OOD setting? (B) How is ER affected by the number/choice of training instances with human rationales? (C) Is ER effective with distantly supervised human rationales? ER-TEST enables comprehensive analysis of these questions by considering a diverse range of tasks and datasets. Through ER-TEST, we show that ER has little impact on ID performance, but can yield large gains on OOD performance w.r.t. (1)-(3). Also, we find that the best ER criterion is task-dependent, while ER can improve OOD performance even with limited and distantly-supervised human rationales.
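As a rough illustration of the ER idea described in the abstract, the sketch below combines a task loss with a penalty that grows when machine rationale scores diverge from a human rationale mask. The abstract does not name the ER criteria it compares, so the mean-squared-error criterion, the weight `lam`, and all function names here are assumptions chosen for illustration only, not the authors' method.

```python
import math


def cross_entropy(probs, label):
    """Task loss: negative log-probability of the gold label."""
    return -math.log(probs[label])


def mse_alignment(machine_rationale, human_rationale):
    """Hypothetical ER criterion: mean squared difference between
    per-token machine attribution scores and a human rationale mask."""
    if len(machine_rationale) != len(human_rationale):
        raise ValueError("rationales must cover the same tokens")
    squared = [(m - h) ** 2 for m, h in zip(machine_rationale, human_rationale)]
    return sum(squared) / len(squared)


def er_loss(probs, label, machine_rationale, human_rationale, lam=1.0):
    """Total loss = task loss + lam * alignment penalty.
    `lam` is an assumed weighting hyperparameter."""
    task = cross_entropy(probs, label)
    align = mse_alignment(machine_rationale, human_rationale)
    return task + lam * align


# Toy two-token example: the human marks only the first token as important.
probs = [0.5, 0.5]
human = [1.0, 0.0]
aligned = er_loss(probs, 0, [1.0, 0.0], human, lam=0.5)
misaligned = er_loss(probs, 0, [0.0, 1.0], human, lam=0.5)
print(aligned < misaligned)
```

In this toy setup the prediction is identical in both calls, so the difference in total loss comes entirely from the alignment term; swapping `mse_alignment` for another distance would change the criterion without touching the rest.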

