字级差异隐私权的界限 (The Limits of Word Level Differential Privacy) - 专知论文

会员服务 ·

0

Performer · 可辨认的 · 语言模型化 · 原点 · 变换 ·

2022 年 5 月 2 日

The Limits of Word Level Differential Privacy

翻译：字级差异隐私权的界限

Justus Mattern,Benjamin Weggenmann,Florian Kerschbaum

As the issues of privacy and trust are receiving increasing attention within the research community, various attempts have been made to anonymize textual data. A significant subset of these approaches incorporate differentially private mechanisms to perturb word embeddings, thus replacing individual words in a sentence. While these methods represent very important contributions, have various advantages over other techniques and do show anonymization capabilities, they have several shortcomings. In this paper, we investigate these weaknesses and demonstrate significant mathematical constraints diminishing the theoretical privacy guarantee as well as major practical shortcomings with regard to the protection against deanonymization attacks, the preservation of content of the original sentences as well as the quality of the language output. Finally, we propose a new method for text anonymization based on transformer based language models fine-tuned for paraphrasing that circumvents most of the identified weaknesses and also offers a formal privacy guarantee. We evaluate the performance of our method via thorough experimentation and demonstrate superior performance over the discussed mechanisms.

翻译：由于研究界日益重视隐私和信任问题,已作出各种努力,将文本数据匿名,其中很大一部分办法包括了干扰嵌入字词的有差别的私人机制,从而取代了单词句中的单词。这些方法代表着非常重要的贡献,与其他技术相比,具有各种优势,确实显示出匿名能力,但是它们有一些缺点。在本文件中,我们调查了这些弱点,并显示出重大的数学限制,削弱了理论上的隐私保障,以及在防止匿名攻击、保留原句内容以及语言产出质量方面存在着重大的实际缺陷。最后,我们提出了一种基于变异器语言模型的文本匿名化新方法,该变异器语言模型经过微调,以绕过大多数已查明的弱点,并提供正式的隐私保障。我们通过彻底的实验来评估我们的方法的绩效,并展示对讨论的机制的优异性表现。

0

相关内容

Performer

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

【北京大学】Locally Differentially Private (Contextual) Bandits Learning

【北京大学】Locally Differentially Private (Contextual) Bandits Learning

专知会员服务

13+阅读 · 2020年6月8日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

244+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

中国图象图形学学会CSIG

0+阅读 · 2021年11月15日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

中国图象图形学学会CSIG

1+阅读 · 2021年11月11日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

SIRT1介导的Resveratrol对糖尿病视网膜病变“代谢记忆”的作用及其机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

Rho/ROCK信号通路介导的侵入性死亡（Entosis）在去势抵抗性前列腺癌中的作用及其机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

BAG3与MACC1相互作用在甲状腺癌细胞上皮间质转化(EMT) 及侵袭中的作用

国家自然科学基金

0+阅读 · 2013年12月31日

细胞核蛋白激酶PKM2调控结肠癌吉非替尼耐受的分子机制

国家自然科学基金

0+阅读 · 2013年12月31日

非一致指数二分与伪轨跟踪

国家自然科学基金

0+阅读 · 2013年12月31日

Degasperis-Procesi方程若干控制问题的研究

国家自然科学基金

0+阅读 · 2012年12月31日

microRNA调节肿瘤抑制因子Caliban应答DNA损伤的机制

国家自然科学基金

1+阅读 · 2012年12月31日

退化k-Hessian方程解的正则性研究

国家自然科学基金

0+阅读 · 2011年12月31日

新型多层功能化碳纳米管薄膜的制备方法及其机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

编码密码学中若干组合对象研究

国家自然科学基金

0+阅读 · 2009年12月31日

sqSGD: Locally Private and Communication Efficient Federated Learning

Arxiv

0+阅读 · 2022年6月22日

Establishing the Price of Privacy in Federated Data Trading

Arxiv

0+阅读 · 2022年6月22日

Scalable and Efficient Training of Large Convolutional Neural Networks with Differential Privacy

Arxiv

0+阅读 · 2022年6月22日

D-CIPHER: Discovery of Closed-form PDEs

Arxiv

0+阅读 · 2022年6月21日

Decentralized Distributed Learning with Privacy-Preserving Data Synthesis

Arxiv

0+阅读 · 2022年6月20日

A Differential Entropy Estimator for Training Neural Networks

Arxiv

0+阅读 · 2022年6月19日

Differentially Private Multi-Party Data Release for Linear Regression

Arxiv

0+阅读 · 2022年6月18日

On the Complexity of Two-Party Differential Privacy

Arxiv

0+阅读 · 2022年6月17日

Scalable Differentially Private Clustering via Hierarchically Separated Trees

Arxiv

0+阅读 · 2022年6月17日

Using Transfer Learning for Code-Related Tasks

Arxiv

0+阅读 · 2022年6月17日

VIP会员

文章信息

相关主题

语言模型化

相关VIP内容

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

ICLR 2022杰出论文公布：7篇论文获得，清华朱军课题组摘得

专知会员服务

60+阅读 · 2022年4月22日

【北京大学】Locally Differentially Private (Contextual) Bandits Learning

【北京大学】Locally Differentially Private (Contextual) Bandits Learning

专知会员服务

13+阅读 · 2020年6月8日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

244+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

【加州大学伯克利分校博士论文】通过自我监督预测学习泛化

专知会员服务

65+阅读 · 2019年10月9日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《物联网（IoT）中的无人机通信高效控制》135页

《在GNSS信号降级环境中利用共识实现无人机集群稳健协调》

中程单向攻击无人机的战略意义：俄乌战争启示

《面向无人机集群的避障动态传感器覆盖算法》最新38页

相关资讯

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium9

中国图象图形学学会CSIG

0+阅读 · 2021年12月17日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium8

中国图象图形学学会CSIG

0+阅读 · 2021年11月16日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium7

中国图象图形学学会CSIG

0+阅读 · 2021年11月15日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium6

中国图象图形学学会CSIG

2+阅读 · 2021年11月12日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium5

中国图象图形学学会CSIG

1+阅读 · 2021年11月11日

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

【ICIG2021】Check out the hot new trailer of ICIG2021 Symposium1

中国图象图形学学会CSIG

0+阅读 · 2021年11月3日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

sqSGD: Locally Private and Communication Efficient Federated Learning

Arxiv

0+阅读 · 2022年6月22日

Establishing the Price of Privacy in Federated Data Trading

Arxiv

0+阅读 · 2022年6月22日

Scalable and Efficient Training of Large Convolutional Neural Networks with Differential Privacy

Arxiv

0+阅读 · 2022年6月22日

D-CIPHER: Discovery of Closed-form PDEs

Arxiv

0+阅读 · 2022年6月21日

Decentralized Distributed Learning with Privacy-Preserving Data Synthesis

Arxiv

0+阅读 · 2022年6月20日

A Differential Entropy Estimator for Training Neural Networks

Arxiv

0+阅读 · 2022年6月19日

Differentially Private Multi-Party Data Release for Linear Regression

Arxiv

0+阅读 · 2022年6月18日

On the Complexity of Two-Party Differential Privacy

Arxiv

0+阅读 · 2022年6月17日

Scalable Differentially Private Clustering via Hierarchically Separated Trees

Arxiv

0+阅读 · 2022年6月17日

Using Transfer Learning for Code-Related Tasks

Arxiv

0+阅读 · 2022年6月17日

相关基金

SIRT1介导的Resveratrol对糖尿病视网膜病变“代谢记忆”的作用及其机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

Rho/ROCK信号通路介导的侵入性死亡（Entosis）在去势抵抗性前列腺癌中的作用及其机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

BAG3与MACC1相互作用在甲状腺癌细胞上皮间质转化(EMT) 及侵袭中的作用

国家自然科学基金

0+阅读 · 2013年12月31日

细胞核蛋白激酶PKM2调控结肠癌吉非替尼耐受的分子机制

国家自然科学基金

0+阅读 · 2013年12月31日

非一致指数二分与伪轨跟踪

国家自然科学基金

0+阅读 · 2013年12月31日

Degasperis-Procesi方程若干控制问题的研究

国家自然科学基金

0+阅读 · 2012年12月31日

microRNA调节肿瘤抑制因子Caliban应答DNA损伤的机制

国家自然科学基金

1+阅读 · 2012年12月31日

退化k-Hessian方程解的正则性研究

国家自然科学基金

0+阅读 · 2011年12月31日

新型多层功能化碳纳米管薄膜的制备方法及其机理研究

国家自然科学基金

0+阅读 · 2011年12月31日

编码密码学中若干组合对象研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员