Modeling movement in real-world tasks is a fundamental goal for motor control, biomechanics, and rehabilitation engineering. However, widely used data-driven models of essential tasks like locomotion make simplifying assumptions, such as linear, fixed-timescale mappings between past inputs and future actions, which do not generalize to real-world contexts. Here, we develop a deep learning-based framework for action prediction with architecture-dependent trial embeddings, outperforming traditional models across contexts (walking and running, treadmill and overground, varying terrains) and input modalities (multiple body states, gaze). We find that neural network architectures with flexible input-history dependence, like GRU and Transformer, perform best overall. By quantifying the model's predictions relative to an autoregressive baseline, we identify context- and modality-dependent timescales. These analyses reveal that there is greater reliance on fast-timescale predictions in complex terrain, that gaze predicts future foot placement before body states do, and that predictions from the full-body state precede those from center-of-mass-relevant states. This deep learning framework for action prediction provides quantifiable insights into the control of real-world locomotion and can be extended to other actions, contexts, and populations.