Tabular data drives most real-world machine learning applications, yet building general-purpose models for it remains difficult. Mixed numeric and categorical fields, weak feature structure, and limited labeled data make scaling and generalization challenging. To this end, we introduce Orion-BiX, a tabular foundation model that combines biaxial attention with meta-learned in-context reasoning for few-shot tabular learning. Its encoder alternates standard, grouped, hierarchical, and relational attention, fusing their outputs through multi-CLS summarization to capture both local and global dependencies efficiently. A label-aware in-context learning (ICL) head adapts on the fly and scales to large label spaces via hierarchical decision routing. Meta-trained on synthetically generated, structurally diverse tables with causal priors, Orion-BiX learns transferable inductive biases across heterogeneous data. Delivered as a scikit-learn compatible foundation model, it outperforms gradient-boosting baselines and remains competitive with state-of-the-art tabular foundation models on public benchmarks, showing that biaxial attention with episodic meta-training enables robust, few-shot-ready tabular learning. The model is publicly available at https://github.com/Lexsi-Labs/Orion-BiX.
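To make the encoder design concrete, below is a minimal PyTorch sketch of the biaxial idea: one block attends across features within each row, then across rows within each feature, and a multi-CLS summarizer pools the result through several learned CLS tokens. All class names and dimensions here are illustrative assumptions, and the sketch collapses the four attention variants the abstract names (standard, grouped, hierarchical, relational) into plain multi-head attention; it is not the released implementation.

```python
import torch
import torch.nn as nn

class BiaxialBlock(nn.Module):
    """Attend along both table axes: across features within each row,
    then across rows within each feature column. Illustrative only."""

    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.feature_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.row_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (rows, features, d_model) -- one table as a grid of cell embeddings
        # Axis 1: each row is a sequence of cells; attend across features
        h, _ = self.feature_attn(x, x, x)
        x = self.norm1(x + h)
        # Axis 2: transpose so each feature column is a sequence; attend across rows
        xt = x.transpose(0, 1)                      # (features, rows, d_model)
        h, _ = self.row_attn(xt, xt, xt)
        return self.norm2(xt + h).transpose(0, 1)   # back to (rows, features, d_model)

class MultiCLSSummarizer(nn.Module):
    """Fuse encoder output through several learned CLS tokens that each
    attend over all cell embeddings, then concatenate their reads."""

    def __init__(self, d_model: int, n_cls: int = 4, n_heads: int = 4):
        super().__init__()
        self.cls = nn.Parameter(torch.randn(n_cls, d_model) * 0.02)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        cells = x.reshape(1, -1, x.size(-1))        # (1, rows*features, d_model)
        q = self.cls.unsqueeze(0)                   # (1, n_cls, d_model)
        summary, _ = self.attn(q, cells, cells)     # each CLS token reads the table
        return summary.flatten(1)                   # (1, n_cls * d_model)

x = torch.randn(32, 10, 64)                         # 32 rows, 10 features, 64-dim cells
summary = MultiCLSSummarizer(64)(BiaxialBlock(64)(x))
print(summary.shape)                                # torch.Size([1, 256])
```

Alternating the two attention axes keeps cost linear in each axis length rather than quadratic in the full cell count, which is what lets the encoder capture local and global dependencies efficiently.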
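Since the model is delivered with a scikit-learn compatible interface, usage should look roughly like the sketch below. The import path and the class name `OrionBiXClassifier` are assumptions made for illustration; the actual entry point is documented in the repository linked above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Hypothetical import -- check the repository for the real entry point.
from orion_bix import OrionBiXClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = OrionBiXClassifier()   # pretrained foundation model, no task-specific training
clf.fit(X_train, y_train)    # under ICL, "fit" stores the labeled support set
pred = clf.predict(X_test)   # test rows are classified in context against the support set
print(f"accuracy: {accuracy_score(y_test, pred):.3f}")
```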