Who Writes What: Unveiling the Impact of Author Roles on AI-generated Text Detection

The rise of Large Language Models (LLMs) necessitates accurate AI-generated text detection. However, current approaches largely overlook the influence of author characteristics. We investigate how four sociolinguistic attributes (gender, CEFR proficiency, academic field, and language environment) impact state-of-the-art AI text detectors. Using the ICNALE corpus of human-authored texts and parallel AI-generated texts from diverse LLMs, we conduct a rigorous evaluation employing multi-factor ANOVA and weighted least squares (WLS). Our results reveal significant biases: CEFR proficiency and language environment consistently affected detector accuracy, while gender and academic field showed detector-dependent effects. These findings highlight the crucial need for socially aware AI text detection to avoid unfairly penalizing specific demographic groups. We offer novel empirical evidence, a robust statistical framework, and actionable insights for developing more equitable and reliable detection systems in real-world, out-of-domain contexts. This work paves the way for future research on bias mitigation, inclusive evaluation benchmarks, and socially responsible LLM detectors.
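To make the weighted least squares (WLS) step concrete, here is a minimal sketch of how per-group detector accuracy could be regressed on author attributes, with each group weighted by its number of texts. This is not the authors' code: the helper `wls_fit`, the dummy-coded columns, and all numbers are hypothetical and synthetic, chosen only to illustrate the technique.

```python
import numpy as np

def wls_fit(X, y, weights):
    """Minimize sum_i w_i * (y_i - X_i @ beta)**2 and return beta."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    # Scaling each row by sqrt(w_i) reduces WLS to ordinary least squares.
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta

# Synthetic, illustrative values only (NOT results from the paper).
# Each row is one hypothetical author group.
# Columns: intercept, higher CEFR level (1/0), EFL environment (1/0).
X = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
])
accuracy = np.array([0.80, 0.86, 0.76, 0.84])  # made-up mean detector accuracy
n_texts = np.array([50, 200, 120, 400])        # made-up group sizes, used as weights

beta = wls_fit(X, accuracy, n_texts)
print(dict(zip(["intercept", "higher_cefr", "efl"], beta.round(4))))
```

Weighting by group size keeps small, noisy groups from dominating the fit. A non-zero coefficient on an attribute would suggest that detector accuracy shifts with that attribute, though a real analysis would also need standard errors and significance tests.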

