Structured Stochastic Gradient MCMC - Zhuanzhi Papers

July 18, 2022

Antonios Alexos, Alex Boyd, Stephan Mandt

From arXiv; paper accepted at ICML 2022. Code is available at https://github.com/ajboyd2/pytorch_lvi

Stochastic gradient Markov chain Monte Carlo (SGMCMC) is considered the gold standard for Bayesian inference in large-scale models, such as Bayesian neural networks. Since practitioners face speed versus accuracy tradeoffs in these models, variational inference (VI) is often the preferable option. Unfortunately, VI makes strong assumptions on both the factorization and functional form of the posterior. In this work, we propose a new non-parametric variational approximation that makes no assumptions about the approximate posterior's functional form and allows practitioners to specify the exact dependencies the algorithm should respect or break. The approach relies on a new Langevin-type algorithm that operates on a modified energy function, where parts of the latent variables are averaged over samples from earlier iterations of the Markov chain. This way, statistical dependencies can be broken in a controlled way, allowing the chain to mix faster. This scheme can be further modified in a "dropout" manner, leading to even more scalability. We test our scheme for ResNet-20 on CIFAR-10, SVHN, and FMNIST. In all cases, we find improvements in convergence speed and/or final accuracy compared to SGMCMC and VI.
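To make the idea of a modified energy more concrete, here is a minimal sketch in Python (NumPy). It is not the authors' implementation (that lives in the repository linked above) and rests on several assumptions made purely for illustration: a toy 2D Gaussian target with correlation `rho`, exact rather than minibatch gradients, one group per coordinate, and a single uniform draw from the chain's full history to stand in for "averaged over samples from earlier iterations". The names `grad_energy` and `langevin` are hypothetical.

```python
import numpy as np

def grad_energy(theta, precision):
    """Gradient of the toy energy U(theta) = 0.5 * theta^T Lambda theta."""
    return precision @ theta

def langevin(precision, n_steps, step, rng, structured=False):
    """Unadjusted Langevin sampler on the toy energy.

    With structured=True, the gradient for coordinate i is evaluated with
    all other coordinates replaced by a state drawn from earlier
    iterations of the same chain (an assumption of this sketch).
    """
    dim = precision.shape[0]
    chain = np.zeros((n_steps + 1, dim))  # row 0 is the initial state
    for t in range(n_steps):
        theta = chain[t]
        if structured:
            grad = np.empty(dim)
            for i in range(dim):
                # Past state for the other coordinates, current value for i.
                mixed = chain[rng.integers(0, t + 1)].copy()
                mixed[i] = theta[i]
                grad[i] = grad_energy(mixed, precision)[i]
        else:
            grad = grad_energy(theta, precision)
        noise = rng.standard_normal(dim)
        chain[t + 1] = theta - step * grad + np.sqrt(2.0 * step) * noise
    return chain[1:]

rho = 0.5  # correlation of the toy target (illustrative choice)
cov = np.array([[1.0, rho], [rho, 1.0]])
precision = np.linalg.inv(cov)
rng = np.random.default_rng(0)

full = langevin(precision, 20000, 0.05, rng)
struct = langevin(precision, 20000, 0.05, rng, structured=True)

burn = 2000  # discard early iterations before estimating correlations
corr_full = np.corrcoef(full[burn:].T)[0, 1]
corr_struct = np.corrcoef(struct[burn:].T)[0, 1]
print(f"standard Langevin correlation:   {corr_full:.2f}")
print(f"structured Langevin correlation: {corr_struct:.2f}")
```

In this toy setting the standard chain should roughly reproduce the target correlation, while the structured chain should yield nearly uncorrelated coordinates, which is the sense in which a dependency is "broken". The sketch says nothing about mixing speed or accuracy on real models.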
