This paper addresses the grouped variable selection problem. A widely used strategy is to augment the negative log-likelihood function with a sparsity-promoting penalty. Existing methods include the group Lasso, group SCAD, and group MCP. The group Lasso solves a convex optimization problem but suffers from underestimation bias. The group SCAD and group MCP avoid this bias but require solving a nonconvex optimization problem, which may admit suboptimal local optima. In this work, we propose an alternative method based on the generalized minimax concave (GMC) penalty, a folded concave penalty that maintains the convexity of the objective function. We develop a new method for grouped variable selection in linear regression, the group GMC, which generalizes the strategy of the original GMC estimator. We present an efficient algorithm for computing the group GMC estimator and prove properties of the solution path that guide its numerical computation and tuning parameter selection in practice. We establish error bounds for both the group GMC and the original GMC estimators. A rich set of simulation studies and a real data application indicate that the proposed group GMC approach outperforms existing methods in several respects across a wide array of scenarios.
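To make the underestimation bias of the group Lasso concrete, the sketch below implements block soft-thresholding, the standard closed-form group Lasso update for one group when that group's design columns are orthonormal and the loss is the least-squares loss (1/2)||y - Xb||^2. This is an illustration of the existing group Lasso only, not the group GMC algorithm proposed in the paper. The function name `group_soft_threshold` and the toy numbers are assumptions chosen for the example.

```python
import numpy as np


def group_soft_threshold(z, lam):
    """Block soft-thresholding: the proximal operator of lam * ||z||_2.

    z   : least-squares coefficients of one group (orthonormal design assumed)
    lam : penalty level (hypothetical tuning value)
    """
    norm = np.linalg.norm(z)
    if norm <= lam:
        # The whole group is set to zero: this is the group-level selection.
        return np.zeros_like(z)
    # A surviving group is shrunk toward zero by exactly lam in norm,
    # no matter how large the signal is: the underestimation bias.
    return (1.0 - lam / norm) * z


strong_group = np.array([3.0, 4.0])   # Euclidean norm 5
weak_group = np.array([0.3, 0.4])     # Euclidean norm 0.5

shrunk = group_soft_threshold(strong_group, lam=1.0)
zeroed = group_soft_threshold(weak_group, lam=1.0)

print(shrunk, np.linalg.norm(shrunk))
print(zeroed)
```

The strong group keeps its direction but loses `lam` from its norm, while the weak group is removed entirely. Folded concave penalties such as group SCAD and group MCP are designed to reduce this constant shrinkage for large groups, at the cost of a nonconvex objective.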