Prominent AI companies are producing 'safety frameworks' as a type of voluntary self-governance. These statements purport to establish risk thresholds and safety procedures for the development and deployment of highly capable AI. Understanding which AI risks are covered and what actions are allowed, refused, demanded, encouraged, or discouraged by these statements is vital for assessing how these frameworks actually govern AI development and deployment. We draw on affordance theory to analyse the OpenAI 'Preparedness Framework Version 2' (April 2025) using the Mechanisms & Conditions model of affordances and the MIT AI Risk Repository. We find that this safety policy requests evaluation of a small minority of AI risks, encourages deployment of systems with 'Medium' capabilities for what OpenAI itself defines as 'severe harm' (potential for >1000 deaths or >$100B in damages), and allows OpenAI's CEO to deploy even more dangerous capabilities. These findings suggest that effective mitigation of AI risks requires more robust governance interventions beyond current industry self-regulation. Our affordance analysis provides a replicable method for evaluating what safety frameworks actually permit versus what they claim.