Strategic Intelligence in Large Language Models: Evidence from Evolutionary Game Theory

Are Large Language Models (LLMs) a new form of strategic intelligence, able to reason about goals in competitive settings? We present compelling supporting evidence. The Iterated Prisoner's Dilemma (IPD) has long served as a model for studying decision-making. We conduct the first ever series of evolutionary IPD tournaments, pitting canonical strategies (e.g., Tit-for-Tat, Grim Trigger) against agents from the leading frontier AI companies OpenAI, Google, and Anthropic. By varying the termination probability in each tournament (the "shadow of the future"), we introduce complexity and chance, confounding memorisation. Our results show that LLMs are highly competitive, consistently surviving and sometimes even proliferating in these complex ecosystems. Furthermore, they exhibit distinctive and persistent "strategic fingerprints": Google's Gemini models proved strategically ruthless, exploiting cooperative opponents and retaliating against defectors, while OpenAI's models remained highly cooperative, a trait that proved catastrophic in hostile environments. Anthropic's Claude emerged as the most forgiving reciprocator, showing remarkable willingness to restore cooperation even after being exploited or successfully defecting. Analysis of nearly 32,000 prose rationales provided by the models reveals that they actively reason about both the time horizon and their opponent's likely strategy, and we demonstrate that this reasoning is instrumental to their decisions. This work connects classic game theory with machine psychology, offering a rich and granular view of algorithmic decision-making under uncertainty.
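The tournament setup described above, repeated Prisoner's Dilemma matches that end with a fixed probability after each round, can be sketched in a few lines. This is only an illustrative sketch, not the authors' code: the payoff values (5, 3, 1, 0) are the conventional textbook ones and are an assumption, since the abstract does not state them, and the names `play_match`, `termination_prob`, and `max_rounds` are hypothetical. The evolutionary population update and the LLM agents are not modelled here.

```python
import random

# Assumed conventional payoffs (T=5, R=3, P=1, S=0); the abstract does not give them.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(own_history, opp_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opp_history[-1] if opp_history else "C"


def grim_trigger(own_history, opp_history):
    """Cooperate until the opponent defects once, then defect forever."""
    return "D" if "D" in opp_history else "C"


def always_defect(own_history, opp_history):
    """Baseline hostile strategy (added here for illustration only)."""
    return "D"


def play_match(strat_a, strat_b, termination_prob, rng, max_rounds=10_000):
    """Play one match; after every round it ends with probability termination_prob.

    Returns (score_a, score_b, rounds_played). max_rounds is a safety cap
    for the case termination_prob == 0.
    """
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(max_rounds):
        move_a = strat_a(hist_a, hist_b)
        move_b = strat_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFFS[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
        if rng.random() < termination_prob:  # the "shadow of the future"
            break
    return score_a, score_b, len(hist_a)


if __name__ == "__main__":
    rng = random.Random(0)
    for p in (0.1, 0.25, 0.75):
        print(p, play_match(tit_for_tat, always_defect, p, rng))
```

With a per-round termination probability p, the expected match length is 1/p rounds, so a small p gives a long shadow of the future, where retaliation has time to pay off, while a large p makes most matches close to one-shot games.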

