The rapid adoption of Small Language Models (SLMs) for resource-constrained applications has outpaced our understanding of their ethical and fairness implications. To address this gap, we introduce the Vacuous Neutrality Framework (VaNeu), a multi-dimensional evaluation paradigm designed to assess SLM fairness prior to deployment. The framework examines model robustness across four stages: bias, utility, ambiguity handling, and positional bias, spanning diverse social bias categories. To the best of our knowledge, this work presents the first large-scale audit of SLMs in the 0.5-5B parameter range, an overlooked "middle tier" between BERT-class encoders and flagship LLMs. We evaluate nine widely used SLMs spanning four model families under both ambiguous and disambiguated contexts. Our findings show that models demonstrating low bias in early stages often fail subsequent evaluations, revealing hidden vulnerabilities and unreliable reasoning. These results underscore the need for a more comprehensive understanding of fairness and reliability in SLMs, and position the proposed framework as a principled tool for responsible deployment in socially sensitive settings.