Smooth Bilevel Programming for Sparse Regularization

Iteratively reweighted least squares (IRLS) is a popular approach for solving sparsity-enforcing regression problems in machine learning. State-of-the-art approaches are more efficient but typically rely on specific coordinate pruning schemes. In this work, we show how a surprisingly simple reparametrization of IRLS, coupled with a bilevel resolution (instead of an alternating scheme), achieves top performance across a wide range of sparsity-inducing regularizers (such as Lasso, group Lasso and trace norm regularizations), regularization strengths (including hard constraints), and design matrices (ranging from correlated designs to differential operators). Like IRLS, our method only involves solving linear systems but, in sharp contrast, it corresponds to the minimization of a smooth function. Although this function is non-convex, we show that it has no spurious minima and that its saddle points are "ridable", so that there always exists a descent direction. We thus advocate the use of a BFGS quasi-Newton solver, which makes our approach simple, robust and efficient. We benchmark the convergence speed of our algorithm against state-of-the-art solvers for Lasso, group Lasso, trace norm and linearly constrained problems. These results highlight the versatility of our approach, which removes the need to use different solvers depending on the specifics of the ML problem under study.
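
To make the idea concrete for the Lasso case, here is a minimal sketch, not the authors' implementation. The abstract does not spell out the reparametrization, so the sketch assumes the Hadamard form x = u * v, for which ||x||_1 equals the minimum of (||u||^2 + ||v||^2) / 2 over all such factorizations. Under that assumption the inner problem in u is a ridge regression solved by one linear system, and the outer function of v is smooth and can be passed to BFGS. The names `lasso_bilevel`, `f_and_grad` and `lam` are hypothetical, and the dense n x n solve is chosen for brevity only.

```python
import numpy as np
from scipy.optimize import minimize


def lasso_bilevel(A, y, lam):
    """Sketch: solve min_x 0.5*||A x - y||^2 + lam*||x||_1 with x = u * v.

    Assumption (not stated in the abstract): the l1 norm is replaced by
    (||u||^2 + ||v||^2) / 2, so the inner problem in u is a ridge regression.
    """
    n, p = A.shape

    def f_and_grad(v):
        Av = A * v  # same as A @ diag(v): column j is scaled by v[j]
        # Inner (lower-level) problem: a single n x n linear system.
        alpha = np.linalg.solve(Av @ Av.T + lam * np.eye(n), y)
        corr = A.T @ alpha
        # Outer function once u is eliminated (the Lasso objective
        # divided by lam at the inner optimum), and its gradient in v.
        f = 0.5 * (y @ alpha) + 0.5 * (v @ v)
        grad = v - v * corr**2
        return f, grad

    # Start away from v = 0, which is a stationary point of the outer function.
    res = minimize(f_and_grad, np.ones(p), jac=True, method="BFGS")
    v = res.x
    Av = A * v
    alpha = np.linalg.solve(Av @ Av.T + lam * np.eye(n), y)
    return v**2 * (A.T @ alpha)  # x = u * v, with u = v * (A^T alpha)
```

A simple sanity check is an orthogonal design (A equal to the identity), where the Lasso solution is known in closed form as the soft-thresholding of y at level `lam`.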
