Human infants acquire language and action gradually through development, achieving strong generalization from minimal experience, whereas large language models require exposure to billions of training tokens. What mechanisms underlie such efficient developmental learning in humans? This study investigates this question through robot simulation experiments in which agents learn to perform actions associated with imperative sentences (e.g., \textit{push red cube}) via curiosity-driven self-exploration. Our approach integrates the active inference framework with reinforcement learning, enabling intrinsically motivated developmental learning. The simulations reveal several key findings: i) Generalization improves markedly as the scale of compositional elements increases. ii) Curiosity combined with motor noise yields substantially better learning than exploration without curiosity. iii) Rote pairing of sentences and actions precedes the emergence of compositional generalization. iv) Simpler, prerequisite-like actions develop earlier than more complex actions that depend on them. v) When exception-handling rules are introduced -- where certain imperative sentences require executing inconsistent actions -- the robots successfully acquire these exceptions through exploration and display a U-shaped performance curve characteristic of representational redescription in child language learning. Together, these results suggest that curiosity-driven exploration and active inference provide a powerful account of how intrinsic motivation and hierarchical sensorimotor learning can jointly support scalable compositional generalization and exception handling in both humans and artificial agents.