Privacy Regularization: Joint Privacy-Utility Optimization in Language Models - 专知论文


April 16, 2021

Privacy Regularization: Joint Privacy-Utility Optimization in Language Models

Fatemehsadat Mireshghallah, Huseyin A. Inan, Marcello Hasegawa, Victor Rühle, Taylor Berg-Kirkpatrick, Robert Sim

From arXiv; NAACL-HLT 2021 paper

Neural language models are known to have a high capacity for memorization of training samples. This may have serious privacy implications when training models on user content such as email correspondence. Differential privacy (DP), a popular choice to train models with privacy guarantees, comes with significant costs in terms of utility degradation and disparate impact on subgroups of users. In this work, we introduce two privacy-preserving regularization methods for training language models that enable joint optimization of utility and privacy through (1) the use of a discriminator and (2) the inclusion of a triplet-loss term. We compare our methods with DP through extensive evaluation. We show the advantages of our regularizers with favorable utility-privacy trade-off, faster training with the ability to tap into existing optimization approaches, and ensuring uniform treatment of under-represented subgroups.

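The abstract names a triplet-loss term as one of the two regularizers but does not define it. As a rough illustration only, the sketch below uses the standard margin-based triplet loss and adds it to a language-modeling loss with a weight. The function names, the weight `lam`, the `margin`, and the choice of anchor, positive, and negative vectors are all assumptions made for this example; they are not taken from the paper.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss: max(0, d(a, p) - d(a, n) + margin).

    It is zero once the anchor is closer to the positive than to the
    negative by at least `margin`.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def joint_loss(lm_loss, anchor, positive, negative, lam=0.5, margin=1.0):
    """Hypothetical joint objective: utility term plus a weighted regularizer.

    `lm_loss` stands in for the usual language-modeling loss; `lam` is an
    assumed trade-off weight between utility and the regularization term.
    """
    return lm_loss + lam * triplet_loss(anchor, positive, negative, margin)


if __name__ == "__main__":
    # Toy vectors; in practice these would be model representations.
    print(joint_loss(2.0, [0.0, 0.0], [3.0, 4.0], [0.0, 0.0]))
```

Because the regularizer is just an extra additive term in the loss, a sketch like this can be trained with any ordinary optimizer, which is in line with the abstract's remark about tapping into existing optimization approaches.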
