Standard imitation learning (IL) methods have achieved considerable success in robotics, yet they often rely on the Markov assumption, which falters in long-horizon tasks where history is crucial for resolving perceptual ambiguity. This limitation stems not only from a conceptual gap but also from a fundamental computational barrier: prevailing architectures such as Transformers are often constrained by complexity that is quadratic in sequence length, rendering the processing of long, high-dimensional observation sequences infeasible. To overcome this dual challenge, we introduce Mamba Temporal Imitation Learning (MTIL). Our approach represents a new paradigm for robotic learning, which we frame as a practical synthesis of World Model and Dynamical System concepts. By leveraging the linear-time recurrent dynamics of State Space Models (SSMs), MTIL learns an implicit, action-oriented world model that efficiently encodes the entire trajectory history into a compressed, evolving state. This allows the policy to be conditioned on a comprehensive temporal context, transcending the confines of Markovian approaches. In extensive experiments on simulated benchmarks (ACT, Robomimic, LIBERO) and on challenging real-world tasks, MTIL outperforms state-of-the-art methods such as ACT and Diffusion Policy, particularly in resolving long-term temporal ambiguities. Our findings not only affirm the necessity of full temporal context but also validate MTIL as a powerful and computationally feasible approach for learning long-horizon, non-Markovian behaviors from high-dimensional observations.
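To illustrate the idea of encoding a trajectory history into a compressed, evolving state in linear time, the following is a minimal sketch of a diagonal linear state space recurrence in NumPy. It is not the MTIL architecture: the function name `ssm_scan`, the dimensions, the fixed decay of 0.9, and the random matrices `B` and `C` are all assumptions made for illustration, and Mamba itself uses input-dependent (selective) parameters rather than the fixed ones used here. The sketch shows two trajectories that end in the same observation yet yield different states, which is the kind of ambiguity a policy that sees only the current observation cannot resolve.

```python
import numpy as np


def ssm_scan(observations, decay, B, C):
    """Run a diagonal linear SSM over a sequence (illustrative, hypothetical helper).

    h_t = decay * h_{t-1} + B @ x_t   (fixed-size state update)
    y_t = C @ h_t                     (readout, e.g. action features)
    """
    state = np.zeros(decay.shape[0])
    outputs = []
    for x in observations:  # one constant-cost update per step, so O(T) overall
        state = decay * state + B @ x
        outputs.append(C @ state)
    return np.array(outputs), state


# Assumed toy dimensions and parameters (not taken from the paper).
rng = np.random.default_rng(0)
obs_dim, state_dim, act_dim = 4, 8, 2
decay = np.full(state_dim, 0.9)
B = rng.standard_normal((state_dim, obs_dim))
C = rng.standard_normal((act_dim, state_dim))

# Two trajectories that share the final observation but differ in their history.
shared_last = np.ones(obs_dim)
traj_a = np.vstack([np.zeros((5, obs_dim)), shared_last])
traj_b = np.vstack([np.ones((5, obs_dim)), shared_last])

out_a, state_a = ssm_scan(traj_a, decay, B, C)
out_b, state_b = ssm_scan(traj_b, decay, B, C)

# The last observations are identical, but the recurrent states are not.
same_last_obs = np.array_equal(traj_a[-1], traj_b[-1])
states_differ = not np.allclose(state_a, state_b)
```

The state has a fixed size regardless of trajectory length, so the per-step cost stays constant, whereas full self-attention compares each step against every earlier one.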