Can deep language models be explanatory models of human cognition? If so, what are their limits? To explore this question, we propose an approach called hyperparameter hypothesization, which uses predictive hyperparameter tuning to find individuating descriptors of cognitive-behavioral profiles. We take the first step in this approach by predicting human performance in the semantic fluency task (SFT), a well-studied task in cognitive science that has never before been modeled using transformer-based language models (TLMs). In our task setup, we compare several approaches to predicting which word an individual performing the SFT will utter next. We report preliminary evidence suggesting that, despite obvious implementational differences in how people and TLMs learn and use language, TLMs can be used to identify individual differences in human fluency task behavior better than existing computational models can, and may offer insights into human memory retrieval strategies -- cognitive processes not typically considered to be the kinds of things TLMs can model. Finally, we discuss the implications of this work for cognitive modeling of knowledge representations.
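To make the idea concrete, here is a minimal sketch, under illustrative assumptions, of what hyperparameter hypothesization could look like: for each participant, grid-search a TLM decoding hyperparameter (here, softmax temperature) that best predicts that participant's sequence of fluency-task utterances, and treat the fitted value as an individuating descriptor of their retrieval behavior. The model choice (GPT-2), the prompt format, and the use of temperature as the tuned hyperparameter are all assumptions for the sake of the example, not the paper's actual setup.

```python
# Sketch: fit a per-participant temperature that maximizes a TLM's
# likelihood of that participant's semantic fluency run. Illustrative
# only; model, prompt, and hyperparameter choice are assumptions.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def run_log_prob(words: list[str], temperature: float) -> float:
    """Log-probability of a participant's uttered words, given a fixed
    category prompt, at the given softmax temperature."""
    prompt = "Examples of animals:"
    full = prompt + " " + ", ".join(words)
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    ids = tokenizer(full, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits / temperature
    # logits[0, i] predicts the token at position i + 1.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    # Score only the uttered-word tokens, not the fixed prompt.
    pos = torch.arange(prompt_len - 1, ids.shape[1] - 1)
    return log_probs[pos, targets[pos]].sum().item()

def fit_temperature(words: list[str],
                    grid=(0.5, 0.8, 1.0, 1.3, 1.8)) -> float:
    """The participant's individuating descriptor: the temperature under
    which the TLM best predicts their fluency run."""
    return max(grid, key=lambda t: run_log_prob(words, t))

# Two hypothetical participants with different retrieval styles.
print(fit_temperature(["dog", "cat", "horse", "cow", "pig"]))
print(fit_temperature(["dog", "wolf", "fox", "coyote", "jackal"]))
```

The same scoring function also supports the next-word comparison described above: ranking candidate continuations of a partial run by their conditional log-probability gives a prediction for which word the individual will utter next.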