Out-of-distribution (OOD) detection is essential for determining when a supervised model encounters inputs that differ meaningfully from its training distribution. While widely studied in classification, OOD detection for regression and survival analysis remains limited because these settings lack discrete labels and make predictive uncertainty hard to quantify. We introduce a framework for OOD detection that is simultaneously model-aware and subspace-aware, and that embeds variable prioritization directly into the detection step. The method uses the fitted predictor to construct localized neighborhoods around each test case that emphasize the features driving the model's learned relationship and downweight directions that are less relevant to prediction. It produces OOD scores without relying on global distance metrics or estimating the full feature density. The framework is applicable across outcome types; in our implementation we use random forests, where the rule structure yields transparent neighborhoods and effective scoring. Experiments on synthetic and real-data benchmarks designed to isolate functional shifts show consistent improvements over existing methods. We further demonstrate the approach in an esophageal cancer survival study, where detected distribution shifts related to lymphadenectomy reveal patterns relevant to surgical guidelines.
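The abstract does not spell out the scoring rule, so the following is only a toy sketch of the general idea of a model-aware, subspace-aware neighborhood, not the authors' algorithm. It assumes a small hand-rolled forest of shallow regression trees in plain Python; the leaf reached by a test point serves as its neighborhood, and the point is compared with that leaf's training rows only along the features the tree split on. The function names (`grow`, `tree_score`, `ood_score`) and the range-excess score are hypothetical choices made for illustration.

```python
import random
import statistics


def sse(vals):
    """Sum of squared deviations from the mean."""
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals)


def grow(X, y, idx, depth, rng, max_depth=3, min_leaf=5, n_cand=8):
    """Grow one small regression tree over the training rows listed in idx."""
    best = None
    if depth < max_depth and len(idx) >= 2 * min_leaf:
        for j in range(len(X[0])):
            # Candidate thresholds: a few observed values of feature j.
            for i in rng.sample(idx, min(n_cand, len(idx))):
                t = X[i][j]
                left = [r for r in idx if X[r][j] <= t]
                right = [r for r in idx if X[r][j] > t]
                if len(left) < min_leaf or len(right) < min_leaf:
                    continue
                cost = sse([y[r] for r in left]) + sse([y[r] for r in right])
                if best is None or cost < best[0]:
                    best = (cost, j, t, left, right)
    if best is None:
        return {"rows": idx}  # leaf: remember which training rows landed here
    _, j, t, left, right = best
    return {"feat": j, "thr": t,
            "lo": grow(X, y, left, depth + 1, rng, max_depth, min_leaf, n_cand),
            "hi": grow(X, y, right, depth + 1, rng, max_depth, min_leaf, n_cand)}


def tree_score(tree, X, x, scale):
    """Range excess of x, measured only along features split on its path."""
    path, node = set(), tree
    while "rows" not in node:
        path.add(node["feat"])
        node = node["hi"] if x[node["feat"]] > node["thr"] else node["lo"]
    worst = 0.0
    for j in path:  # features the tree never used on this path are ignored
        vals = [X[r][j] for r in node["rows"]]
        excess = max(min(vals) - x[j], x[j] - max(vals), 0.0)
        worst = max(worst, excess / scale[j])
    return worst


def ood_score(forest, X, x, scale):
    """Average the per-tree scores over the forest."""
    return sum(tree_score(t, X, x, scale) for t in forest) / len(forest)


rng = random.Random(0)
n = 200
X = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n)]
y = [3.0 * row[0] + rng.gauss(0, 0.05) for row in X]  # only feature 0 matters
scale = [statistics.pstdev(col) for col in zip(*X)]
forest = [grow(X, y, [rng.randrange(n) for _ in range(n)], 0, rng)
          for _ in range(25)]  # each tree sees its own bootstrap sample

s_in = ood_score(forest, X, [0.2, 0.0, 0.0], scale)          # typical point
s_irrelevant = ood_score(forest, X, [0.2, 5.0, 0.0], scale)  # shift in a noise feature
s_relevant = ood_score(forest, X, [5.0, 0.0, 0.0], scale)    # shift in the predictive feature
print(s_in, s_irrelevant, s_relevant)
```

In this toy setup the outcome depends only on the first feature, so a shift along it should receive a larger score than an equally large shift along a noise feature that the trees rarely split on. No global distance or density estimate is involved; each comparison is local to a leaf and restricted to the features on its decision path.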