In this monograph, I introduce the basic concepts of Online Learning through a modern view of Online Convex Optimization. Here, online learning refers to the framework of regret minimization under worst-case assumptions. I present first-order and second-order algorithms for online learning with convex losses, in both Euclidean and non-Euclidean settings. All the algorithms are clearly presented as instantiations of Online Mirror Descent or Follow-The-Regularized-Leader and their variants. Particular attention is given to the issues of tuning the parameters of the algorithms and of learning in unbounded domains, through adaptive and parameter-free online learning algorithms. Non-convex losses are dealt with through convex surrogate losses and through randomization. The bandit setting is also briefly discussed, touching on the problems of adversarial and stochastic multi-armed bandits. These notes do not require prior knowledge of convex analysis, and all the required mathematical tools are rigorously explained. Moreover, all the included proofs have been carefully chosen to be as simple and as short as possible.
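As a minimal illustration of the Euclidean first-order setting mentioned above, the following is a sketch of projected Online Gradient Descent, which arises as the instance of Online Mirror Descent with the squared Euclidean norm as regularizer. The function name, fixed step size, and choice of an L2-ball domain are illustrative assumptions, not taken from the monograph.

```python
import numpy as np

def ogd(grad_fns, eta=0.1, dim=2, radius=1.0):
    """Projected Online Gradient Descent (illustrative sketch).

    At each round t: play the current point x_t, receive the
    (sub)gradient of the loss at x_t, take a gradient step, and
    project back onto the L2 ball of the given radius (the bounded
    feasible domain assumed here).
    """
    x = np.zeros(dim)          # standard initialization at the origin
    played = []
    for grad in grad_fns:
        played.append(x.copy())
        x = x - eta * grad(x)  # gradient step
        n = np.linalg.norm(x)
        if n > radius:         # Euclidean projection onto the ball
            x = radius * x / n
    return played
```

With linear losses, for example, each `grad` simply returns a fixed vector; the regret of this scheme against any fixed point in the ball grows sublinearly in the number of rounds for a suitable step size.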