Derivatives and residual distribution of regularized M-estimators with application to adaptive tuning

This paper studies M-estimators with a gradient-Lipschitz loss function regularized with a convex penalty in linear models with Gaussian design matrix and arbitrary noise distribution. A practical example is the robust M-estimator constructed with the Huber loss and the Elastic-Net penalty when the noise distribution has heavy tails. Our main contributions are threefold. (i) We provide general formulae for the derivatives of regularized M-estimators $\hat\beta(y,X)$ where differentiation is taken with respect to both $y$ and $X$; this reveals a simple differentiability structure shared by all convex regularized M-estimators. (ii) Using these derivatives, we characterize the distribution of the residual $r_i = y_i-x_i^\top\hat\beta$ in the intermediate high-dimensional regime where dimension and sample size are of the same order. (iii) Motivated by the distribution of the residuals, we propose a novel adaptive criterion to select tuning parameters of regularized M-estimators. The criterion approximates the out-of-sample error up to an additive constant independent of the estimator, so that minimizing the criterion provides a proxy for minimizing the out-of-sample error. The proposed adaptive criterion does not require knowledge of the noise distribution or of the covariance of the design. Simulations confirm the theoretical findings, regarding both the distribution of the residuals and the success of the criterion as a proxy for the out-of-sample error. Finally, our results reveal new relationships between the derivatives of $\hat\beta(y,X)$ and the effective degrees of freedom of the M-estimator, which are of independent interest.
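To make the running example concrete, the following is a minimal sketch of a Huber loss plus Elastic-Net M-estimator and its residuals $r_i = y_i-x_i^\top\hat\beta$ on simulated Gaussian design with heavy-tailed noise. Everything specific here is an assumption for illustration and is not taken from the paper: the function names (`huber_psi`, `huber_loss`, `objective`, `fit_huber_enet`), the $1/n$ normalization of the loss, the parameter names `lam1`, `lam2`, `delta`, the simulation sizes, and the use of plain proximal gradient descent as the solver. It does not implement the paper's adaptive criterion.

```python
import numpy as np


def huber_psi(u, delta=1.0):
    """Derivative of the Huber loss: identity on [-delta, delta], clipped outside."""
    return np.clip(u, -delta, delta)


def huber_loss(u, delta=1.0):
    """Huber loss: quadratic near zero, linear in the tails."""
    a = np.abs(u)
    return np.where(a <= delta, 0.5 * u**2, delta * (a - 0.5 * delta))


def objective(beta, X, y, lam1, lam2, delta=1.0):
    """Assumed objective: mean Huber loss + lam1 * ||beta||_1 + (lam2 / 2) * ||beta||_2^2."""
    r = y - X @ beta
    return huber_loss(r, delta).mean() + lam1 * np.abs(beta).sum() + 0.5 * lam2 * (beta @ beta)


def fit_huber_enet(X, y, lam1, lam2, delta=1.0, n_iter=2000):
    """Proximal gradient descent (a generic solver, chosen here for simplicity)."""
    n, p = X.shape
    # Lipschitz constant of the gradient of the smooth part (Huber term + ridge term).
    L = np.linalg.norm(X, 2) ** 2 / n + lam2
    beta = np.zeros(p)
    for _ in range(n_iter):
        grad = -X.T @ huber_psi(y - X @ beta, delta) / n + lam2 * beta
        z = beta - grad / L
        # Soft-thresholding is the proximal operator of the l1 penalty.
        beta = np.sign(z) * np.maximum(np.abs(z) - lam1 / L, 0.0)
    return beta


# Simulated data: Gaussian design, heavy-tailed (Student t, 2 degrees of freedom) noise.
rng = np.random.default_rng(0)
n, p = 200, 100
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:5] = 1.0
y = X @ beta_true + rng.standard_t(df=2, size=n)

beta_hat = fit_huber_enet(X, y, lam1=0.05, lam2=0.01)
residuals = y - X @ beta_hat  # r_i = y_i - x_i^T beta_hat
```

Here $p/n = 0.5$, so dimension and sample size are of the same order, which is the regime the abstract refers to. The tuning values `lam1=0.05` and `lam2=0.01` are arbitrary placeholders; choosing them well is exactly the problem the proposed criterion addresses.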
