Prominent AI companies are producing 'safety frameworks' as a type of voluntary self-governance. These statements purport to establish risk thresholds and safety procedures for the development and deployment of highly capable AI. Understanding which AI risks are covered and what actions are allowed, refused, demanded, encouraged, or discouraged by these statements is vital for assessing how these frameworks actually govern AI development and deployment. We draw on affordance theory to analyse the OpenAI 'Preparedness Framework Version 2' (April 2025) using the Mechanisms & Conditions model of affordances and the MIT AI Risk Repository. We find that this safety policy requests evaluation of a small minority of AI risks, encourages deployment of systems with 'Medium' capabilities for what OpenAI itself defines as 'severe harm' (potential for >1000 deaths or >$100B in damages), and allows OpenAI's CEO to deploy even more dangerous capabilities. These findings suggest that effective mitigation of AI risks requires more robust governance interventions beyond current industry self-regulation. Our affordance analysis provides a replicable method for evaluating what safety frameworks actually permit versus what they claim.