ARMS:二进制变量的抗乙酸-REINFORCE-多样本-二进制梯度 (ARMS: Antithetic-REINFORCE-Multi-Sample Gradient for Binary Variables) - 专知论文

会员服务 ·

0

估计/估计量 · binary · 蒙特卡罗估计 · 负相关法 · 潜变量/隐变量 ·

2021 年 5 月 28 日

ARMS: Antithetic-REINFORCE-Multi-Sample Gradient for Binary Variables

翻译：ARMS:二进制变量的抗乙酸-REINFORCE-多样本-二进制梯度

Alek Dimitriev,Mingyuan Zhou

Estimating the gradients for binary variables is a task that arises frequently in various domains, such as training discrete latent variable models. What has been commonly used is a REINFORCE based Monte Carlo estimation method that uses either independent samples or pairs of negatively correlated samples. To better utilize more than two samples, we propose ARMS, an Antithetic REINFORCE-based Multi-Sample gradient estimator. ARMS uses a copula to generate any number of mutually antithetic samples. It is unbiased, has low variance, and generalizes both DisARM, which we show to be ARMS with two samples, and the leave-one-out REINFORCE (LOORF) estimator, which is ARMS with uncorrelated samples. We evaluate ARMS on several datasets for training generative models, and our experimental results show that it outperforms competing methods. We also develop a version of ARMS for optimizing the multi-sample variational bound, and show that it outperforms both VIMCO and DisARM. The code is publicly available.

翻译：估算二进制变量的梯度是不同领域经常出现的一项任务,例如培训离散潜伏变量模型。通常使用的是一种基于REINFORCE的蒙特卡洛估算方法,这种方法使用独立样品或相对的负相关样品。为了更好地利用两个以上的样品,我们提议使用ARMS,一个基于抗反再生环境的多样梯度估测器。ARMS使用一个相对抗性样本来生成任何数量的相对抗性样本。它没有偏见,差异较小,并笼统地将我们显示为有两种样品的ARMS和无相干样品的REINFORCE(LOOORF)离岸估计仪(LOORF)和无相干样品的ERMS(LOORCE)估计器。我们用若干数据集来评估ARMS,用于培训基因模型,我们的实验结果显示它优于相互竞争的方法。我们还开发了一个版本的ARMS,以优化多模变装,并显示它优于VIMCO和DARM。代码是公开的。

0

相关内容

估计/估计量

估计/估计量

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

「元强化学习」报告，斯坦福Chelsea Finn讲解，52页ppt，Meta Reinforcement Learning

「元强化学习」报告，斯坦福Chelsea Finn讲解，52页ppt，Meta Reinforcement Learning

专知会员服务

42+阅读 · 2021年1月11日

【阿里巴巴达摩院】TResNet: 高性能的GPU专用架构，GPU-Dedicated Architecture

【阿里巴巴达摩院】TResNet: 高性能的GPU专用架构，GPU-Dedicated Architecture

专知会员服务

33+阅读 · 2020年4月1日

【斯坦福大学】Gradient Surgery for Multi-Task Learning

【斯坦福大学】Gradient Surgery for Multi-Task Learning

专知会员服务

47+阅读 · 2020年1月23日

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

专知会员服务

21+阅读 · 2019年12月2日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

244+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【论文推荐】最新5篇行人重识别（ Person Re-ID）相关论文—样本生成、超越人类、实践指南、姿态归一化、图像生成

【论文推荐】最新5篇行人重识别（ Person Re-ID）相关论文—样本生成、超越人类、实践指南、姿态归一化、图像生成

专知

7+阅读 · 2018年2月14日

【推荐】决策树/随机森林深入解析

【推荐】决策树/随机森林深入解析

机器学习研究会

5+阅读 · 2017年9月21日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

Binscatter Regressions

Arxiv

0+阅读 · 2021年8月6日

Neural Architecture Generator Optimization

Arxiv

6+阅读 · 2020年10月8日

Model-based Adversarial Meta-Reinforcement Learning

Arxiv

5+阅读 · 2020年6月16日

End to end learning and optimization on graphs

Arxiv

7+阅读 · 2019年5月31日

Class-Balanced Loss Based on Effective Number of Samples

Arxiv

12+阅读 · 2019年1月16日

Latent Multi-task Architecture Learning

Arxiv

3+阅读 · 2018年11月19日

Generalizing Across Multi-Objective Reward Functions in Deep Reinforcement Learning

Generalizing Across Multi-Objective Reward Functions in Deep Reinforcement Learning

Arxiv

5+阅读 · 2018年9月17日

A Multi-Objective Deep Reinforcement Learning Framework

A Multi-Objective Deep Reinforcement Learning Framework

Arxiv

16+阅读 · 2018年6月27日

Coarse-to-fine Seam Estimation for Image Stitching

Arxiv

4+阅读 · 2018年5月24日

The Search Problem in Mixture Models

Arxiv

3+阅读 · 2018年2月24日

VIP会员

文章信息

相关主题

估计/估计量

蒙特卡罗估计

潜变量/隐变量

相关VIP内容

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

深度学习优化算法，73页ppt，Optimization Algorithms on Deep Learning

专知会员服务

135+阅读 · 2021年6月16日

「元强化学习」报告，斯坦福Chelsea Finn讲解，52页ppt，Meta Reinforcement Learning

「元强化学习」报告，斯坦福Chelsea Finn讲解，52页ppt，Meta Reinforcement Learning

专知会员服务

42+阅读 · 2021年1月11日

【阿里巴巴达摩院】TResNet: 高性能的GPU专用架构，GPU-Dedicated Architecture

【阿里巴巴达摩院】TResNet: 高性能的GPU专用架构，GPU-Dedicated Architecture

专知会员服务

33+阅读 · 2020年4月1日

【斯坦福大学】Gradient Surgery for Multi-Task Learning

【斯坦福大学】Gradient Surgery for Multi-Task Learning

专知会员服务

47+阅读 · 2020年1月23日

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

在线变分推断，76页ppt，A Regret Bound for Online Variational Inference

专知会员服务

21+阅读 · 2019年12月2日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

244+阅读 · 2019年10月21日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

【人工智能在2019：一年回顾】反人工智能，AI in 2019: A Year in Review

专知会员服务

79+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

【NTU博士论文】利用强化学习与生成模型推进可靠且可泛化的决策

美海军研发“增强侦察与态势评估系统（ARES）”应用程序以优化作战规划（附研究论文）

【NeurIPS2025】DNA-DetectLLM：基于 DNA 启发的“突变-修复”范式揭示 AI 生成文本

面向深度研究系统的强化学习基础：综述

相关资讯

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

逆强化学习-学习人先验的动机

逆强化学习-学习人先验的动机

CreateAMind

16+阅读 · 2019年1月18日

强化学习的Unsupervised Meta-Learning

强化学习的Unsupervised Meta-Learning

CreateAMind

18+阅读 · 2019年1月7日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

【论文推荐】最新5篇行人重识别（ Person Re-ID）相关论文—样本生成、超越人类、实践指南、姿态归一化、图像生成

【论文推荐】最新5篇行人重识别（ Person Re-ID）相关论文—样本生成、超越人类、实践指南、姿态归一化、图像生成

专知

7+阅读 · 2018年2月14日

【推荐】决策树/随机森林深入解析

【推荐】决策树/随机森林深入解析

机器学习研究会

5+阅读 · 2017年9月21日

【学习】Hierarchical Softmax

【学习】Hierarchical Softmax

机器学习研究会

4+阅读 · 2017年8月6日

强化学习 cartpole_a3c

强化学习 cartpole_a3c

CreateAMind

9+阅读 · 2017年7月21日

相关论文

Binscatter Regressions

Arxiv

0+阅读 · 2021年8月6日

Neural Architecture Generator Optimization

Arxiv

6+阅读 · 2020年10月8日

Model-based Adversarial Meta-Reinforcement Learning

Arxiv

5+阅读 · 2020年6月16日

End to end learning and optimization on graphs

Arxiv

7+阅读 · 2019年5月31日

Class-Balanced Loss Based on Effective Number of Samples

Arxiv

12+阅读 · 2019年1月16日

Latent Multi-task Architecture Learning

Arxiv

3+阅读 · 2018年11月19日

Generalizing Across Multi-Objective Reward Functions in Deep Reinforcement Learning

Generalizing Across Multi-Objective Reward Functions in Deep Reinforcement Learning

Arxiv

5+阅读 · 2018年9月17日

A Multi-Objective Deep Reinforcement Learning Framework

A Multi-Objective Deep Reinforcement Learning Framework

Arxiv

16+阅读 · 2018年6月27日

Coarse-to-fine Seam Estimation for Image Stitching

Arxiv

4+阅读 · 2018年5月24日

The Search Problem in Mixture Models

Arxiv

3+阅读 · 2018年2月24日

微信扫码咨询专知VIP会员