Teaching dexterity to multi-fingered robots has been a longstanding challenge in robotics. Most prominent work in this area focuses on learning controllers or policies that operate either on visual observations or on state estimates derived from vision. However, such methods perform poorly on fine-grained manipulation tasks that require reasoning about contact forces or about objects occluded by the hand itself. In this work, we present T-Dex, a new approach for tactile-based dexterity that operates in two phases. In the first phase, we collect 2.5 hours of play data, which is used to train self-supervised tactile encoders. This step is necessary to reduce high-dimensional tactile readings to a lower-dimensional embedding. In the second phase, given a handful of demonstrations for a dexterous task, we learn non-parametric policies that combine the tactile observations with visual ones. Across five challenging dexterous tasks, we show that our tactile-based dexterity models outperform purely vision-based and torque-based models by an average of 1.7X. Finally, we provide a detailed analysis of factors critical to T-Dex, including the importance of play data, architectures, and representation learning.
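To make the second phase concrete, the sketch below illustrates one common form of non-parametric policy: a nearest-neighbor lookup over demonstration frames, acting on a combined tactile-plus-visual embedding. The encoder details, embedding dimensions, modality weighting, and class names here are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def combine(tactile_emb, visual_emb, tactile_weight=1.0):
    # Concatenate the two modality embeddings into a single observation
    # vector; the relative weighting is a hypothetical tunable parameter.
    return np.concatenate([tactile_weight * tactile_emb, visual_emb])

class NearestNeighborPolicy:
    """Illustrative non-parametric policy: at each step, return a
    distance-weighted average of the actions taken at the k closest
    demonstration frames in embedding space."""

    def __init__(self, demo_obs, demo_actions, k=3):
        self.demo_obs = np.asarray(demo_obs, dtype=float)       # (N, D)
        self.demo_actions = np.asarray(demo_actions, dtype=float)  # (N, A)
        self.k = min(k, len(self.demo_obs))

    def act(self, obs):
        # Euclidean distance to every stored demonstration embedding.
        dists = np.linalg.norm(self.demo_obs - obs, axis=1)
        idx = np.argsort(dists)[: self.k]
        # Inverse-distance weights so closer demonstrations dominate.
        w = 1.0 / (dists[idx] + 1e-8)
        return (w[:, None] * self.demo_actions[idx]).sum(axis=0) / w.sum()
```

Because the policy is non-parametric, a handful of demonstrations suffices: there is no policy network to train, only the learned encoders that produce the embeddings.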