In practical data analysis in noisy environments, it is common to first use a robust method to identify outliers and then to conduct further analysis after removing them. In this paper, we consider statistical inference for the model estimated after the outliers are removed, which can be interpreted as a selective inference (SI) problem. To use the conditional SI framework, it is necessary to characterize the event describing how the robust method identifies outliers. Unfortunately, existing methods cannot be used directly here because they apply only when the selection event can be represented by linear or quadratic constraints. We propose a conditional SI method for popular robust regression methods based on a homotopy method. We show that the proposed method is applicable to a wide class of robust regression and outlier detection methods and performs well empirically in experiments on both synthetic and real data.