Self-supervised goal-conditioned reinforcement learning enables robots to acquire diverse skills autonomously, without human supervision. A central challenge, however, is the goal-setting problem: the robot must propose goals that are both diverse and achievable in its current environment. Existing methods such as RIG (Visual Reinforcement Learning with Imagined Goals) use a variational autoencoder (VAE) to generate goals in a learned latent space, but they can produce physically implausible goals that hinder learning efficiency. We propose Physics-Informed RIG (PI-RIG), which integrates physical constraints directly into VAE training through a novel Enhanced Physics-Informed Variational Autoencoder (Enhanced p3-VAE). Our key innovation is the explicit separation of the latent space into physics variables, which govern object dynamics, and environmental factors, which capture visual appearance, while enforcing physical consistency through differential equation constraints and conservation laws. The generated goals are therefore physically consistent and achievable, respecting fundamental physical principles such as object permanence, collision constraints, and dynamic feasibility. Through extensive experiments, we demonstrate that physics-informed goal generation significantly improves the quality of proposed goals, leading to more effective exploration and better skill acquisition in visual robotic manipulation tasks, including reaching, pushing, and pick-and-place.