Hyperparameter Optimization Is Deceiving Us, and How to Stop It

While hyperparameter optimization (HPO) is known to greatly impact learning algorithm performance, it is often treated as an empirical afterthought. Recent empirical works have highlighted the risk of this second-rate treatment of HPO. They show that inconsistent performance results, which depend on the choice of hyperparameter subspace to search, are a widespread problem in ML research. When comparing two algorithms, J and K, searching one subspace can yield the conclusion that J outperforms K, whereas searching another can yield the opposite result. In short, your choice of hyperparameters can deceive you. We provide a theoretical complement to this prior work: we analytically characterize this problem, which we term hyperparameter deception, and show that grid search is inherently deceptive. We prove a defense with guarantees against deception, and demonstrate a defense in practice.

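To make the abstract's central claim concrete, here is a toy sketch of how two grids over the same hyperparameter can support opposite conclusions. It is not the paper's formalism or experimental setup: the loss functions `loss_j` and `loss_k`, the grids, and the helper names are all hypothetical, chosen only so that each algorithm's best learning rate falls inside a different grid.

```python
import math


def loss_j(lr: float) -> float:
    # Hypothetical validation loss for algorithm J; its minimum is near lr = 1e-1.
    return (math.log10(lr) + 1.0) ** 2 + 0.1


def loss_k(lr: float) -> float:
    # Hypothetical validation loss for algorithm K; its minimum is near lr = 1e-3.
    return (math.log10(lr) + 3.0) ** 2 + 0.1


def best_over_grid(loss_fn, grid):
    # Grid search: evaluate every grid point and keep the lowest loss.
    return min(loss_fn(lr) for lr in grid)


def compare(grid) -> str:
    # Conclusion a reader would draw after tuning both algorithms on the same grid.
    best_j = best_over_grid(loss_j, grid)
    best_k = best_over_grid(loss_k, grid)
    if best_j < best_k:
        return "J outperforms K"
    if best_k < best_j:
        return "K outperforms J"
    return "tie"


# Two plausible-looking learning-rate grids (both are assumptions for this demo).
grid_a = [1e-1, 3e-2, 1e-2]
grid_b = [1e-2, 3e-3, 1e-3]

print("grid_a:", compare(grid_a))
print("grid_b:", compare(grid_b))
```

Both grids look reasonable in isolation, yet they lead to opposite rankings of J and K. In this toy setup, the conclusion depends on which subspace was searched, not on either algorithm.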