分析和改进噪音 -- -- 声音 -- -- 关心 -- -- 估计的优化景观 (Analyzing and Improving the Optimization Landscape of Noise-Contrastive Estimation) - 专知论文

会员服务 ·

0

噪声分布 · 优化地形 · 估计/估计量 · 噪声 · 统计量 ·

2021 年 10 月 21 日

Analyzing and Improving the Optimization Landscape of Noise-Contrastive Estimation

翻译：分析和改进噪音 -- -- 声音 -- -- 关心 -- -- 估计的优化景观

Bingbin Liu,Elan Rosenfeld,Pradeep Ravikumar,Andrej Risteski

Noise-contrastive estimation (NCE) is a statistically consistent method for learning unnormalized probabilistic models. It has been empirically observed that the choice of the noise distribution is crucial for NCE's performance. However, such observations have never been made formal or quantitative. In fact, it is not even clear whether the difficulties arising from a poorly chosen noise distribution are statistical or algorithmic in nature. In this work, we formally pinpoint reasons for NCE's poor performance when an inappropriate noise distribution is used. Namely, we prove these challenges arise due to an ill-behaved (more precisely, flat) loss landscape. To address this, we introduce a variant of NCE called "eNCE" which uses an exponential loss and for which normalized gradient descent addresses the landscape issues provably when the target and noise distributions are in a given exponential family.

翻译：在统计学上,噪声-扰动估计(NCE)是一种一致的方法,用于学习不规范的概率模型。从经验上看,选择噪音分布对于NCE的表现至关重要。然而,这种观察从未正式或量化。事实上,甚至还不清楚选择不当的噪音分布引起的困难是否属于统计或算法性质。在这项工作中,我们正式确定在使用不适当的噪音分布时NCE表现不佳的原因。也就是说,我们证明这些挑战是由于行为不良(更确切地说,是平坦的)损失环境造成的。为了解决这个问题,我们引入了称为“ENCE”的NCE变种,即使用指数损失,在目标与噪音分布在一个特定指数式家庭时,平坦度梯度梯度下降能够解决地貌问题。

0

相关内容

噪声分布

【MIT】自监督几何感知，22页ppt，Self-supervised Geometric Perception

【MIT】自监督几何感知，22页ppt，Self-supervised Geometric Perception

专知会员服务

23+阅读 · 2021年6月3日

【干货书】鲁棒优化Robust Optimization，570页pdf

专知会员服务

144+阅读 · 2021年3月17日

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

111+阅读 · 2020年5月15日

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

专知会员服务

37+阅读 · 2020年3月27日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

10+阅读 · 2019年10月24日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

已删除

将门创投

3+阅读 · 2018年6月20日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

20+阅读 · 2017年12月17日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

gan生成图像at 1024² 的代码论文

gan生成图像at 1024² 的代码论文

CreateAMind

4+阅读 · 2017年10月31日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

Pushing the Limits of Rule Reasoning in Transformers through Natural Language Satisfiability

Pushing the Limits of Rule Reasoning in Transformers through Natural Language Satisfiability

Arxiv

0+阅读 · 2021年12月16日

Geometric Transformer for End-to-End Molecule Properties Prediction

Arxiv

0+阅读 · 2021年12月16日

On variance estimation for the one-sample log-rank test

Arxiv

0+阅读 · 2021年12月16日

On the Convergence and Robustness of Adversarial Training

Arxiv

0+阅读 · 2021年12月15日

Pretrained Transformers Improve Out-of-Distribution Robustness

Arxiv

5+阅读 · 2020年4月13日

On the Implicit Assumptions of GANs

Arxiv

6+阅读 · 2018年11月29日

Implicit Maximum Likelihood Estimation

Implicit Maximum Likelihood Estimation

Arxiv

7+阅读 · 2018年9月24日

Large-Scale Stochastic Sampling from the Probability Simplex

Arxiv

3+阅读 · 2018年6月19日

The challenge of simultaneous object detection and pose estimation: a comparative study

Arxiv

6+阅读 · 2018年1月24日

Being Robust (in High Dimensions) Can Be Practical

Arxiv

3+阅读 · 2017年12月14日

VIP会员

文章信息

相关主题

估计/估计量

相关VIP内容

【MIT】自监督几何感知，22页ppt，Self-supervised Geometric Perception

【MIT】自监督几何感知，22页ppt，Self-supervised Geometric Perception

专知会员服务

23+阅读 · 2021年6月3日

【干货书】鲁棒优化Robust Optimization，570页pdf

专知会员服务

144+阅读 · 2021年3月17日

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

Fariz Darari简明《博弈论Game Theory》介绍，35页ppt

专知会员服务

111+阅读 · 2020年5月15日

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

专知会员服务

37+阅读 · 2020年3月27日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

95+阅读 · 2020年3月12日

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

Risk Sensitive Portfolio Optimization with Regime-Switching and Default Contagion，香港理工大学应用数学系余翔助理教授，第八届全国社会媒体处理大会SMP2019

专知会员服务

10+阅读 · 2019年10月24日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

182+阅读 · 2019年10月11日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《美空军条令出版物：战略打击》最新条令

《高能激光武器》22页slides

军事前沿模型

《面向小型无人机或无人飞行器的创新雷达探测与人工智能分类技术》263页

相关资讯

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

已删除

将门创投

3+阅读 · 2018年6月20日

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

ResNet, AlexNet, VGG, Inception：各种卷积网络架构的理解

全球人工智能

20+阅读 · 2017年12月17日

计算机视觉近一年进展综述

计算机视觉近一年进展综述

机器学习研究会

9+阅读 · 2017年11月25日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

gan生成图像at 1024² 的代码论文

gan生成图像at 1024² 的代码论文

CreateAMind

4+阅读 · 2017年10月31日

【推荐】GAN架构入门综述(资源汇总)

【推荐】GAN架构入门综述(资源汇总)

机器学习研究会

10+阅读 · 2017年9月3日

相关论文

Pushing the Limits of Rule Reasoning in Transformers through Natural Language Satisfiability

Pushing the Limits of Rule Reasoning in Transformers through Natural Language Satisfiability

Arxiv

0+阅读 · 2021年12月16日

Geometric Transformer for End-to-End Molecule Properties Prediction

Arxiv

0+阅读 · 2021年12月16日

On variance estimation for the one-sample log-rank test

Arxiv

0+阅读 · 2021年12月16日

On the Convergence and Robustness of Adversarial Training

Arxiv

0+阅读 · 2021年12月15日

Pretrained Transformers Improve Out-of-Distribution Robustness

Arxiv

5+阅读 · 2020年4月13日

On the Implicit Assumptions of GANs

Arxiv

6+阅读 · 2018年11月29日

Implicit Maximum Likelihood Estimation

Implicit Maximum Likelihood Estimation

Arxiv

7+阅读 · 2018年9月24日

Large-Scale Stochastic Sampling from the Probability Simplex

Arxiv

3+阅读 · 2018年6月19日

The challenge of simultaneous object detection and pose estimation: a comparative study

Arxiv

6+阅读 · 2018年1月24日

Being Robust (in High Dimensions) Can Be Practical

Arxiv

3+阅读 · 2017年12月14日

微信扫码咨询专知VIP会员