Student Growth Percentiles (SGPs), widely adopted across U.S. state assessment systems, employ independent quantile regression followed by a post-hoc isotonic projection (\texttt{isotonize=TRUE} in the \texttt{SGP} R package) to address quantile crossing. We demonstrate that this approach contains a fundamental methodological inconsistency: interpolation between independently estimated, potentially crossed quantiles requires monotonicity, yet the post-hoc correction alters estimates in ways that may violate the quantile property $P(Y \leq \hat{Q}_{\tau}(Y|X) \mid X) = \tau$. We term this the \emph{interpolation paradox}. Constrained joint quantile regression (CJQR) is theoretically sound, eliminating crossing by enforcing non-crossing constraints during optimization, but we show that its computational complexity (which scales poorly, e.g., $\mathcal{O}((qn)^3)$ for standard LP solvers over $q$ quantiles) renders it intractable for large-scale educational data ($n > 100{,}000$). We examine the SGP package's switch to the Frisch-Newton interior-point method (\texttt{rq.method.for.large.n="fn"}) for large $n$, noting that while efficient for \emph{independent} QR, it resolves neither the joint problem's complexity nor the paradox. We propose neural network-based multi-quantile regression (NNQR) with shared hidden layers as a practical alternative. Although the network makes the overall objective nonconvex in its weights, the composite pinball loss is convex in the predicted quantiles, and SGD-based training reliably reaches high-quality solutions while offering $\mathcal{O}(n)$ scalability and implicitly reducing crossing. Our empirical analysis shows that independent QR yields crossing, while both CJQR and NNQR produce monotone quantile estimates. NNQR thus emerges as a viable, scalable alternative for operational SGP systems, aligning theoretical validity with computational feasibility.
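To make the NNQR proposal concrete, the following is a minimal NumPy sketch of a shared-hidden-layer network trained jointly on the composite pinball loss for three quantile levels. The architecture size, learning rate, iteration count, and synthetic data are all illustrative assumptions, not the configuration used in the paper; a full-batch subgradient step stands in for minibatch SGD.

```python
import numpy as np

rng = np.random.default_rng(0)
taus = np.array([0.1, 0.5, 0.9])  # quantile levels estimated jointly

# Illustrative synthetic data: y = 2x + Gaussian noise.
n = 2000
X = rng.uniform(-1.0, 1.0, size=(n, 1))
y = 2.0 * X[:, 0] + 0.5 * rng.standard_normal(n)

# Shared-hidden-layer network: input -> ReLU hidden -> one output per tau.
H = 16
W1 = rng.normal(0.0, 0.5, size=(1, H)); b1 = np.zeros(H)
W2 = rng.normal(0.0, 0.5, size=(H, len(taus))); b2 = np.zeros(len(taus))

def forward(X):
    h = np.maximum(X @ W1 + b1, 0.0)   # shared representation
    return h, h @ W2 + b2              # (n, 3) predicted quantiles

def pinball(y, Q, taus):
    # Composite pinball loss: rho_tau(u) = max(tau*u, (tau-1)*u), u = y - Q.
    u = y[:, None] - Q
    return np.mean(np.maximum(taus * u, (taus - 1.0) * u))

_, Q = forward(X)
initial_loss = pinball(y, Q, taus)

lr = 0.1
for _ in range(500):
    h, Q = forward(X)
    # Subgradient of the pinball loss w.r.t. the predicted quantiles:
    # -tau where y > Q, (1 - tau) where y < Q.
    G = ((y[:, None] < Q).astype(float) - taus) / n
    dW2 = h.T @ G; db2 = G.sum(axis=0)
    dh = (G @ W2.T) * (h > 0.0)
    dW1 = X.T @ dh; db1 = dh.sum(axis=0)
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

_, Q = forward(X)
final_loss = pinball(y, Q, taus)
coverage_median = np.mean(y <= Q[:, 1])           # should approach 0.5
crossing_rate = np.mean(np.diff(Q, axis=1) < 0)   # fraction of crossed pairs
```

Because the shared representation couples the quantile heads, crossing is discouraged rather than hard-constrained; any residual crossing can be removed by sorting each row of `Q` (a rearrangement step), which is far cheaper than the isotonic projection applied after interpolation. In practice a framework such as PyTorch with minibatch SGD would replace this hand-rolled loop.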