Humanoid soccer poses a representative challenge for embodied intelligence, requiring robots to operate within a tightly coupled perception-action loop. However, existing systems typically rely on decoupled modules, resulting in delayed responses and incoherent behaviors in dynamic environments, while real-world perceptual limitations further exacerbate these issues. In this work, we present a unified reinforcement learning-based controller that enables humanoid robots to acquire reactive soccer skills through the direct integration of visual perception and motion control. Our approach extends Adversarial Motion Priors to perceptual settings in real-world dynamic environments, bridging motion imitation and visually grounded dynamic control. We introduce an encoder-decoder architecture combined with a virtual perception system that models real-world visual characteristics, allowing the policy to recover privileged states from imperfect observations and establish active coordination between perception and action. The resulting controller demonstrates strong reactivity, consistently executing coherent and robust soccer behaviors across various scenarios, including real RoboCup matches.
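As a rough illustration of what a virtual perception system might look like, the sketch below simulates a camera-limited ball observation: the ball is reported only when it lies inside a field of view and range, detections are randomly dropped, and position noise grows with distance. The function name `observe_ball` and every parameter value (field of view, range, noise scale, dropout rate) are hypothetical choices made for this example; the abstract does not specify the actual model.

```python
import math
import random
from typing import Optional, Tuple


def observe_ball(
    ball_xy: Tuple[float, float],
    head_yaw: float,
    rng: random.Random,
    fov_deg: float = 90.0,      # assumed horizontal field of view
    max_range: float = 6.0,     # assumed maximum detection distance (m)
    noise_per_m: float = 0.02,  # assumed noise std per metre of distance
    drop_prob: float = 0.1,     # assumed probability of a missed detection
) -> Optional[Tuple[float, float]]:
    """Return a noisy ball position in the robot frame, or None if unseen."""
    x, y = ball_xy
    dist = math.hypot(x, y)

    # Bearing of the ball relative to where the camera is pointing,
    # wrapped to [-pi, pi].
    bearing = math.atan2(y, x) - head_yaw
    bearing = math.atan2(math.sin(bearing), math.cos(bearing))

    # Outside the field of view or too far away: no observation.
    if dist > max_range or abs(bearing) > math.radians(fov_deg) / 2:
        return None

    # Random missed detection, standing in for blur or occlusion.
    if rng.random() < drop_prob:
        return None

    # Distance-dependent Gaussian noise on the reported position.
    sigma = noise_per_m * dist
    return (x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
```

Under these assumptions, whether the ball is observed depends on `head_yaw`, so a policy trained on such observations has a reason to keep the ball in view. This is one plausible reading of the "active coordination between perception and action" described above, not a statement of the authors' implementation.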