Improved Regret Bounds for Online Kernel Selection under Bandit Feedback - 专知 (Zhuanzhi) Papers


Bandits · Kernelization · Lipschitz · Online · Loss

March 23, 2023

Improved Regret Bounds for Online Kernel Selection under Bandit Feedback


Junfan Li, Shizhong Liao

In this paper, we improve the regret bound for online kernel selection under bandit feedback. The previous algorithm enjoys an $O((\Vert f\Vert^2_{\mathcal{H}_i}+1)K^{\frac{1}{3}}T^{\frac{2}{3}})$ expected bound for Lipschitz loss functions. We prove two types of regret bounds that improve on it. For smooth loss functions, we propose an algorithm with an $O(U^{\frac{2}{3}}K^{-\frac{1}{3}}(\sum^K_{i=1}L_T(f^\ast_i))^{\frac{2}{3}})$ expected bound, where $L_T(f^\ast_i)$ is the cumulative loss of the optimal hypothesis in $\mathbb{H}_{i}=\{f\in\mathcal{H}_i:\Vert f\Vert_{\mathcal{H}_i}\leq U\}$. This data-dependent bound is no worse than the previous worst-case bound and is smaller if most of the candidate kernels match the data well. For Lipschitz loss functions, we propose an algorithm with an $O(U\sqrt{KT}\ln^{\frac{2}{3}}{T})$ expected bound, which asymptotically improves the previous bound. We apply the two algorithms to online kernel selection under a time constraint and prove new regret bounds that match or improve the previous $O(\sqrt{T\ln{K}} +\Vert f\Vert^2_{\mathcal{H}_i}\max\{\sqrt{T},\frac{T}{\sqrt{\mathcal{R}}}\})$ expected bound, where $\mathcal{R}$ is the time budget. Finally, we empirically verify our algorithms on online regression and classification tasks.
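The two comparisons in the abstract can be checked with a short calculation. This is a sketch, not the paper's proof: it assumes the per-round loss is bounded by a constant $c>0$ (a symbol introduced here only for illustration), and it compares the dependence on $K$ and $T$ only, leaving aside the norm factors $\Vert f\Vert^2_{\mathcal{H}_i}+1$ and $U$.

```latex
% Assumption (not from the source): each per-round loss is at most c,
% so L_T(f_i^*) <= cT for every kernel i, and the sum over K kernels is <= cKT.
\begin{align*}
U^{\frac{2}{3}} K^{-\frac{1}{3}} \Bigl(\sum_{i=1}^{K} L_T(f^\ast_i)\Bigr)^{\frac{2}{3}}
  &\le U^{\frac{2}{3}} K^{-\frac{1}{3}} (cKT)^{\frac{2}{3}} \\
  &= (cU)^{\frac{2}{3}} K^{\frac{1}{3}} T^{\frac{2}{3}} .
\end{align*}
% The worst case therefore recovers the K^{1/3} T^{2/3} rate of the previous bound,
% and the left-hand side is smaller whenever the cumulative losses L_T(f_i^*) grow sublinearly in T.

% Lipschitz case: ratio of the new rate to the previous K^{1/3} T^{2/3} rate.
\[
\frac{\sqrt{KT}\,\ln^{\frac{2}{3}} T}{K^{\frac{1}{3}} T^{\frac{2}{3}}}
  = K^{\frac{1}{6}}\, T^{-\frac{1}{6}} \ln^{\frac{2}{3}} T
  \;\longrightarrow\; 0
  \quad \text{as } T \to \infty \text{ with } K \text{ fixed}.
\]
```

The first display shows why the data-dependent bound never exceeds the earlier worst-case rate under the boundedness assumption. The second shows that the improvement for Lipschitz losses is asymptotic in $T$: for fixed $T$ the ratio grows with $K$ as $K^{\frac{1}{6}}$.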

