语言属性的因果影响 (Causal Effects of Linguistic Properties) - 专知论文

会员服务 ·

0

估计/估计量 · 可辨认的 · 语言模型化 · Performer · CASE ·

2021 年 4 月 9 日

Causal Effects of Linguistic Properties

翻译：语言属性的因果影响

Reid Pryzant,Dallas Card,Dan Jurafsky,Victor Veitch,Dhanya Sridhar

from arxiv, To appear at NAACL 2021 (Annual Conference of the North American Chapter of the Association for Computational Linguistics)

We consider the problem of using observational data to estimate the causal effects of linguistic properties. For example, does writing a complaint politely lead to a faster response time? How much will a positive product review increase sales? This paper addresses two technical challenges related to the problem before developing a practical method. First, we formalize the causal quantity of interest as the effect of a writer's intent, and establish the assumptions necessary to identify this from observational data. Second, in practice, we only have access to noisy proxies for the linguistic properties of interest -- e.g., predictions from classifiers and lexicons. We propose an estimator for this setting and prove that its bias is bounded when we perform an adjustment for the text. Based on these results, we introduce TextCause, an algorithm for estimating causal effects of linguistic properties. The method leverages (1) distant supervision to improve the quality of noisy proxies, and (2) a pre-trained language model (BERT) to adjust for the text. We show that the proposed method outperforms related approaches when estimating the effect of Amazon review sentiment on semi-simulated sales figures. Finally, we present an applied case study investigating the effects of complaint politeness on bureaucratic response times.

翻译：我们考虑的是使用观察数据来估计语言特性的因果关系的问题。例如,以礼貌方式撰写投诉是否会导致更快的反应时间?积极产品审查会增加销售多少?本文在制订实用方法之前,讨论与这一问题有关的两个技术挑战。首先,我们将利息因果数量确定为作者意图的效果,并从观察数据中确定必要的假设来查明这一点。第二,在实践中,我们只能接触有关语言特性的吵闹代理人 -- -- 例如分类员和词汇员的预测。我们为这一设置建议一个估计符,并证明在对文本进行调整时,其偏差是受约束的。根据这些结果,我们采用 " 文字原因 " 算法,用以估计语言特性的因果关系。这种方法利用(1) 遥远的监督来提高扰动剂的质量,和(2) 事先培训的语言模型(BERT)来调整文本。我们表明,在估计亚马逊审查情绪对半模拟销售数字的影响时,拟议的方法不符合相关方法。最后,我们进行了一次对官僚主义反应进行有礼貌的案例研究。

0

相关内容

估计/估计量

估计/估计量

【2020新书】概率机器学习，附212页pdf与slides

【2020新书】概率机器学习，附212页pdf与slides

专知会员服务

112+阅读 · 2020年11月12日

【IJCAI2020】从语言图谱到常识图谱，TransOMCS: From Linguistic Graphs to Commonsense Knowledge

【IJCAI2020】从语言图谱到常识图谱，TransOMCS: From Linguistic Graphs to Commonsense Knowledge

专知会员服务

40+阅读 · 2020年5月4日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

253+阅读 · 2020年4月19日

【论文翻译】2020最新预训练语言模型综述：Pre-trained Models for Natural Language Processing: A Survey

【论文翻译】2020最新预训练语言模型综述：Pre-trained Models for Natural Language Processing: A Survey

专知会员服务

94+阅读 · 2020年4月13日

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

专知会员服务

37+阅读 · 2020年3月27日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【ML课程】多变量微积分（Multivariable Calculus），加州大学伯克利分校| Prof. Denis Auroux

【ML课程】多变量微积分（Multivariable Calculus），加州大学伯克利分校| Prof. Denis Auroux

专知会员服务

10+阅读 · 2020年1月7日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

计算机 | 中低难度国际会议信息8条

计算机 | 中低难度国际会议信息8条

Call4Papers

9+阅读 · 2019年6月19日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

Call4Papers

5+阅读 · 2018年12月7日

Linguistically Regularized LSTMs for Sentiment Classification

Linguistically Regularized LSTMs for Sentiment Classification

黑龙江大学自然语言处理实验室

8+阅读 · 2018年5月4日

计算机类 | 期刊专刊截稿信息9条

计算机类 | 期刊专刊截稿信息9条

Call4Papers

4+阅读 · 2018年1月26日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Adversarial Variational Bayes: Unifying VAE and GAN 代码

Adversarial Variational Bayes: Unifying VAE and GAN 代码

CreateAMind

7+阅读 · 2017年10月4日

Provably Secure Generative Linguistic Steganography

Arxiv

1+阅读 · 2021年6月3日

Sample Selection Bias in Evaluation of Prediction Performance of Causal Models

Sample Selection Bias in Evaluation of Prediction Performance of Causal Models

Arxiv

0+阅读 · 2021年6月3日

Matrix factorisation and the interpretation of geodesic distance

Matrix factorisation and the interpretation of geodesic distance

Arxiv

0+阅读 · 2021年6月2日

Retrospective causal inference via matrix completion, with an evaluation of the effect of European integration on cross-border employment

Arxiv

0+阅读 · 2021年6月1日

Post-Contextual-Bandit Inference

Arxiv

0+阅读 · 2021年6月1日

The Causal Learning of Retail Delinquency

Arxiv

15+阅读 · 2020年12月17日

Equivalent Causal Models

Arxiv

6+阅读 · 2020年12月10日

Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects

Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects

Arxiv

4+阅读 · 2020年3月26日

A Survey on Causal Inference

Arxiv

112+阅读 · 2020年2月5日

The challenge of simultaneous object detection and pose estimation: a comparative study

Arxiv

6+阅读 · 2018年1月24日

VIP会员

文章信息

相关主题

估计/估计量

语言模型化

相关VIP内容

【2020新书】概率机器学习，附212页pdf与slides

【2020新书】概率机器学习，附212页pdf与slides

专知会员服务

112+阅读 · 2020年11月12日

【IJCAI2020】从语言图谱到常识图谱，TransOMCS: From Linguistic Graphs to Commonsense Knowledge

【IJCAI2020】从语言图谱到常识图谱，TransOMCS: From Linguistic Graphs to Commonsense Knowledge

专知会员服务

40+阅读 · 2020年5月4日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

253+阅读 · 2020年4月19日

【论文翻译】2020最新预训练语言模型综述：Pre-trained Models for Natural Language Processing: A Survey

【论文翻译】2020最新预训练语言模型综述：Pre-trained Models for Natural Language Processing: A Survey

专知会员服务

94+阅读 · 2020年4月13日

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

专知会员服务

37+阅读 · 2020年3月27日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【ML课程】多变量微积分（Multivariable Calculus），加州大学伯克利分校| Prof. Denis Auroux

【ML课程】多变量微积分（Multivariable Calculus），加州大学伯克利分校| Prof. Denis Auroux

专知会员服务

10+阅读 · 2020年1月7日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

163+阅读 · 2019年10月12日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

《解析陆域作战方向：一个概念性框架》报告

《人工智能与人类的未来》2025年最新300页书籍

追寻真正的AI自主性：从遗留思维到战场优势

《“蛛网”行动：乌克兰不对称作战的演进》报告

相关资讯

计算机 | 中低难度国际会议信息8条

计算机 | 中低难度国际会议信息8条

Call4Papers

9+阅读 · 2019年6月19日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

CCF C类 | IJCNN 2019 Special Section : 信息论与深度学习

Call4Papers

5+阅读 · 2018年12月7日

Linguistically Regularized LSTMs for Sentiment Classification

Linguistically Regularized LSTMs for Sentiment Classification

黑龙江大学自然语言处理实验室

8+阅读 · 2018年5月4日

计算机类 | 期刊专刊截稿信息9条

计算机类 | 期刊专刊截稿信息9条

Call4Papers

4+阅读 · 2018年1月26日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

Adversarial Variational Bayes: Unifying VAE and GAN 代码

Adversarial Variational Bayes: Unifying VAE and GAN 代码

CreateAMind

7+阅读 · 2017年10月4日

相关论文

Provably Secure Generative Linguistic Steganography

Arxiv

1+阅读 · 2021年6月3日

Sample Selection Bias in Evaluation of Prediction Performance of Causal Models

Sample Selection Bias in Evaluation of Prediction Performance of Causal Models

Arxiv

0+阅读 · 2021年6月3日

Matrix factorisation and the interpretation of geodesic distance

Matrix factorisation and the interpretation of geodesic distance

Arxiv

0+阅读 · 2021年6月2日

Retrospective causal inference via matrix completion, with an evaluation of the effect of European integration on cross-border employment

Arxiv

0+阅读 · 2021年6月1日

Post-Contextual-Bandit Inference

Arxiv

0+阅读 · 2021年6月1日

The Causal Learning of Retail Delinquency

Arxiv

15+阅读 · 2020年12月17日

Equivalent Causal Models

Arxiv

6+阅读 · 2020年12月10日

Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects

Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects

Arxiv

4+阅读 · 2020年3月26日

A Survey on Causal Inference

Arxiv

112+阅读 · 2020年2月5日

The challenge of simultaneous object detection and pose estimation: a comparative study

Arxiv

6+阅读 · 2018年1月24日

微信扫码咨询专知VIP会员