In Bayesian statistics, a hyperparameter is a parameter of a prior distribution; the term is used to distinguish it from the parameters of the model of the underlying system under analysis.
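
As a brief illustration (not from the source), the sketch below uses a Beta-Bernoulli model: the Bernoulli success probability theta is a model parameter of the system being analyzed, while the Beta prior's alpha and beta are hyperparameters. The specific numbers are made up for the example.

```python
# Minimal sketch: hyperparameters (alpha, beta of the Beta prior)
# vs. the model parameter (theta of the Bernoulli likelihood).
from scipy import stats

# Hyperparameters: parameters of the prior over theta.
alpha_prior, beta_prior = 2.0, 2.0

# Observations from the underlying system (e.g., coin flips).
data = [1, 0, 1, 1, 0, 1, 1, 1]
heads, tails = sum(data), len(data) - sum(data)

# Conjugate update: the posterior over the model parameter theta
# is again a Beta distribution with updated hyperparameters.
alpha_post = alpha_prior + heads
beta_post = beta_prior + tails
posterior = stats.beta(alpha_post, beta_post)

print(f"Posterior mean of theta: {posterior.mean():.3f}")
```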

VIP Content

We address the problem of tuning the hyperparameters (HPs) of imitation learning algorithms in the context of continuous control, when the underlying reward function of the demonstrating expert cannot be observed at any time. The extensive literature on imitation learning mostly assumes that this reward function is available for HP selection, but that is not a realistic setting. Indeed, if such a reward function were available, it could be used directly for policy training and imitation would be unnecessary. To tackle this mostly neglected problem, we propose a number of possible proxies for the external reward. We conduct an extensive empirical study of these proxies (more than 10,000 agents across 9 environments) and give practical recommendations for selecting HPs. Our results show that, although imitation learning algorithms are sensitive to HP choices, good enough HPs can usually be selected through a proxy for the reward function.

https://www.zhuanzhi.ai/paper/beffdb76305bfa324433d64e6975ec76
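
The abstract does not spell out the concrete proxies, so the sketch below is only a hedged illustration of the general recipe: rank candidate HP configurations by a hypothetical proxy score (here, negative action-matching error on held-out expert demonstrations) instead of the unobservable environment reward. The names train_fn, action_matching_proxy, and select_hyperparameters are illustrative assumptions, not the paper's API.

```python
# Hedged sketch of HP selection via a reward proxy (names are hypothetical):
# rank candidate HP configurations by a proxy score computed from held-out
# expert demonstrations instead of the true, unobservable reward.
import numpy as np

def action_matching_proxy(policy, expert_states, expert_actions):
    """Proxy score: negative mean squared error between the policy's actions
    and the expert's actions on held-out expert states."""
    predicted = np.array([policy(s) for s in expert_states])
    return -np.mean((predicted - np.asarray(expert_actions)) ** 2)

def select_hyperparameters(train_fn, candidate_hps, expert_states, expert_actions):
    """Train one agent per HP configuration and keep the configuration whose
    policy scores best under the proxy (no environment reward is used)."""
    scored = []
    for hp in candidate_hps:
        policy = train_fn(hp)  # e.g., behavioural-cloning or adversarial imitation training
        score = action_matching_proxy(policy, expert_states, expert_actions)
        scored.append((score, hp))
    best_score, best_hp = max(scored, key=lambda t: t[0])
    return best_hp, best_score
```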


Latest Content

When hyperparameter optimization of a machine learning algorithm is repeated for multiple datasets, it is possible to transfer knowledge to an optimization run on a new dataset. We develop a new hyperparameter-free ensemble model for Bayesian optimization that is a generalization of two existing transfer learning extensions to Bayesian optimization and establish a worst-case bound compared to vanilla Bayesian optimization. Using a large collection of hyperparameter optimization benchmark problems, we demonstrate that our contributions substantially reduce optimization time compared to standard Gaussian process-based Bayesian optimization and improve over the current state-of-the-art for transfer hyperparameter optimization.
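
As a hedged illustration of the idea, the sketch below combines per-dataset Gaussian-process surrogates into a weighted ensemble, in the spirit of ranking-based transfer extensions to Bayesian optimization; the weighting rule and class names are assumptions, not the paper's exact hyperparameter-free model.

```python
# Minimal, hedged sketch (not the paper's exact model): combine GP surrogates
# fit on previous datasets with one fit on the new dataset, weighting each
# base model by how well it ranks the new dataset's observations.
import numpy as np
from itertools import combinations
from sklearn.gaussian_process import GaussianProcessRegressor

def ranking_weight(model, X_new, y_new):
    """Weight a base surrogate by the fraction of observation pairs from the
    new dataset whose ordering it predicts correctly (hypothetical rule)."""
    preds = model.predict(X_new)
    pairs = list(combinations(range(len(y_new)), 2))
    if not pairs:
        return 1.0
    correct = sum((preds[i] < preds[j]) == (y_new[i] < y_new[j]) for i, j in pairs)
    return correct / len(pairs)

class TransferEnsembleSurrogate:
    """Weighted ensemble of per-dataset GP surrogates for Bayesian optimization."""

    def __init__(self, past_tasks):
        # past_tasks: list of (X, y) pairs from previous HPO runs.
        self.base_models = [GaussianProcessRegressor(normalize_y=True).fit(X, y)
                            for X, y in past_tasks]

    def fit(self, X_new, y_new):
        # Fit a surrogate on the new dataset and reweight all models.
        self.target_model = GaussianProcessRegressor(normalize_y=True).fit(X_new, y_new)
        weights = [ranking_weight(m, X_new, y_new) for m in self.base_models] + [1.0]
        self.weights = np.array(weights) / np.sum(weights)
        return self

    def predict(self, X):
        # Ensemble mean: weighted average of base-model and target-model means.
        means = np.stack([m.predict(X) for m in self.base_models]
                         + [self.target_model.predict(X)])
        return self.weights @ means
```

In a Bayesian optimization loop, such an ensemble would stand in for the single GP surrogate used to compute the acquisition function, letting evaluations from earlier datasets guide the search on the new one.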
