This research focuses on using large language models (LLMs) to simulate social experiments, examining their ability to emulate human personality through virtual-persona role-play. It develops an end-to-end evaluation framework comprising individual-level analyses of stability and identifiability, together with a population-level analysis, termed progressive personality curves, that examines the veracity and consistency of LLMs in simulating human personality. Methodologically, the research proposes important modifications to traditional psychometric approaches (confirmatory factor analysis and construct validity), which cannot capture improvement trends while LLM simulation remains at a low level and may therefore lead to premature rejection or methodological misalignment. The main contributions are: a systematic framework for evaluating LLM virtual personality; empirical evidence that persona detail plays a critical role in personality-simulation quality; and the identification of marginal-utility effects of persona profiles, in particular a scaling law in LLM personality simulation, offering operational evaluation metrics and a theoretical foundation for applying large language models in social-science experiments.
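The marginal-utility effect described above can be illustrated with a minimal sketch: simulation fidelity is regressed against persona detail with a saturating logarithmic fit, so the gain per doubling of detail becomes visible. All numbers and variable names here are hypothetical placeholders, not the paper's data or method.

```python
import numpy as np

# Hypothetical illustration (toy data, not the paper's results):
# fidelity of personality simulation vs. amount of persona detail.
detail = np.array([1, 2, 4, 8, 16, 32], dtype=float)        # persona attributes provided
fidelity = np.array([0.21, 0.34, 0.45, 0.55, 0.63, 0.69])   # toy similarity scores

# Fit fidelity ≈ a * log(detail) + b by ordinary least squares;
# a logarithmic curve captures diminishing returns to added detail.
A = np.column_stack([np.log(detail), np.ones_like(detail)])
(a, b), *_ = np.linalg.lstsq(A, fidelity, rcond=None)

# Under this fit, each doubling of persona detail buys roughly a * log(2)
# in fidelity -- a constant increment per doubling, i.e. diminishing
# marginal utility in absolute detail.
gain_per_doubling = a * np.log(2)
print(a, b, gain_per_doubling)
```

A positive slope with a shrinking per-attribute gain is what a "progressive personality curve" of this shape would look like; the paper's actual metrics and functional form may differ.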