VIP内容

主题: Locally Differentially Private (Contextual) Bandits Learning

摘要:

首先,我们提出了一种简单的黑盒归约框架,该框架可以解决带有LDP保证的大量无背景的bandits学习问题。根据我们的框架,我们可以通过单点反馈(例如 private bandits凸优化等)改善private bandits学习的最佳结果,并在LDP下获得具有多点反馈的BCO的第一结果。 LDP保证和黑盒特性使我们的框架在实际应用中比以前专门设计的和相对较弱的差分专用(DP)上下文无关强盗算法更具吸引力。此外,我们还将算法扩展到在(ε,δ)-LDP下具有遗憾约束ō(T~3/4 /ε)的广义线性bandits,这被认为是最优的。注意,给定DP上下文线性bandits的现有Ω(T)下界,我们的结果表明LDP和DP上下文bandits之间的根本区别。

成为VIP会员查看完整内容
0
8

最新内容

We explore whether quantum advantages can be found for the zeroth-order online convex optimization problem, which is also known as bandit convex optimization with multi-point feedback. In this setting, given access to zeroth-order oracles (that is, the loss function is accessed as a black box that returns the function value for any queried input), a player attempts to minimize a sequence of adversarially generated convex loss functions. This procedure can be described as a $T$ round iterative game between the player and the adversary. In this paper, we present quantum algorithms for the problem and show for the first time that potential quantum advantages are possible for problems of online convex optimization. Specifically, our contributions are as follows. (i) When the player is allowed to query zeroth-order oracles $O(1)$ times in each round as feedback, we give a quantum algorithm that achieves $O(\sqrt{T})$ regret without additional dependence of the dimension $n$, which outperforms the already known optimal classical algorithm only achieving $O(\sqrt{nT})$ regret. Note that the regret of our quantum algorithm has achieved the lower bound of classical first-order methods. (ii) We show that for strongly convex loss functions, the quantum algorithm can achieve $O(\log T)$ regret with $O(1)$ queries as well, which means that the quantum algorithm can achieve the same regret bound as the classical algorithms in the full information setting.

0
0
下载
预览

最新论文

We explore whether quantum advantages can be found for the zeroth-order online convex optimization problem, which is also known as bandit convex optimization with multi-point feedback. In this setting, given access to zeroth-order oracles (that is, the loss function is accessed as a black box that returns the function value for any queried input), a player attempts to minimize a sequence of adversarially generated convex loss functions. This procedure can be described as a $T$ round iterative game between the player and the adversary. In this paper, we present quantum algorithms for the problem and show for the first time that potential quantum advantages are possible for problems of online convex optimization. Specifically, our contributions are as follows. (i) When the player is allowed to query zeroth-order oracles $O(1)$ times in each round as feedback, we give a quantum algorithm that achieves $O(\sqrt{T})$ regret without additional dependence of the dimension $n$, which outperforms the already known optimal classical algorithm only achieving $O(\sqrt{nT})$ regret. Note that the regret of our quantum algorithm has achieved the lower bound of classical first-order methods. (ii) We show that for strongly convex loss functions, the quantum algorithm can achieve $O(\log T)$ regret with $O(1)$ queries as well, which means that the quantum algorithm can achieve the same regret bound as the classical algorithms in the full information setting.

0
0
下载
预览
Top