Deep reinforcement learning (DRL) has proven to be a powerful paradigm for learning complex control policies autonomously. Numerous recent applications of DRL to robotic grasping have trained robotic agents end-to-end, mapping visual inputs directly to control instructions, but the amount of training data required may hinder these applications in practice. In this paper, we propose a DRL-based robotic visual grasping framework in which visual perception and the control policy are trained separately rather than end-to-end. The visual perception module produces physical descriptions of the objects to be grasped, and the policy, trained with DRL, uses these descriptions to decide on optimal actions. Because objects are represented explicitly, the policy is expected to generalize better to new objects and environments. In addition, the policy can be trained in simulation and transferred to a real robotic system without any further training. We evaluate our framework on a real-world robotic system on a number of robotic grasping tasks, such as semantic grasping, clustered object grasping, and moving object grasping. The results show that our system is robust and generalizes well.
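The separation of perception and policy described above can be illustrated with a minimal Python sketch. It is not the framework's implementation: the `ObjectDescription` fields, the state layout, the action format, and `placeholder_policy` (a clipped move-toward-target rule standing in for a trained DRL policy) are all assumptions made for illustration. The point is only that the policy consumes a compact physical description rather than raw images, so the same policy can be driven by a simulator or by a real perception module.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass(frozen=True)
class ObjectDescription:
    """Hypothetical explicit object representation emitted by perception."""
    label: str      # semantic class, e.g. "cup"
    position: Vec3  # metres, robot base frame (assumed convention)
    velocity: Vec3  # metres per second (assumed convention)


def make_state(obj: ObjectDescription, gripper_position: Vec3) -> Tuple[float, ...]:
    """Build the policy input: object position relative to the gripper, plus velocity."""
    relative = tuple(o - g for o, g in zip(obj.position, gripper_position))
    return relative + obj.velocity


def placeholder_policy(state: Sequence[float], max_step: float = 0.05) -> Vec3:
    """Stand-in for a trained DRL policy (assumption): step toward the object,
    clipping each axis of the displacement to +/- max_step metres."""
    return tuple(max(-max_step, min(max_step, s)) for s in state[:3])


def control_step(
    objects: Sequence[ObjectDescription],
    target_label: str,
    gripper_position: Vec3,
    policy: Callable[[Sequence[float]], Vec3] = placeholder_policy,
) -> Vec3:
    """One control step: pick the requested object, build the state, query the policy."""
    target = next(o for o in objects if o.label == target_label)
    return policy(make_state(target, gripper_position))


if __name__ == "__main__":
    scene = [
        ObjectDescription("cup", (0.5, 0.0, 0.25), (0.0, 0.0, 0.0)),
        ObjectDescription("ball", (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)),
    ]
    print(control_step(scene, "cup", (0.5, 0.0, 0.0)))
```

Because `control_step` depends only on `ObjectDescription` values, swapping a simulated scene for the output of a real perception module would not require changing the policy in this sketch.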