Recently, Model Predictive Contouring Control (MPCC) has emerged as the state-of-the-art approach for model-based agile flight. MPCC offers great flexibility in trading off progress maximization against path following at runtime, without relying on globally optimized trajectories. However, finding the optimal set of tuning parameters for MPCC is challenging because (i) the full quadrotor dynamics are non-linear, (ii) the cost function is highly non-convex, and (iii) the hyperparameter space is high-dimensional. This paper leverages a probabilistic Policy Search method, Weighted Maximum Likelihood (WML), to automatically learn the optimal objective for MPCC. WML is sample-efficient because its update of the learning parameters has a closed-form solution. Additionally, the data efficiency provided by the model-based approach allows us to train directly in a high-fidelity simulator, which in turn lets our approach transfer zero-shot to the real world. We validate our approach in the real world, where our method outperforms both the previous manually tuned controller and the state-of-the-art auto-tuning baseline, reaching speeds of 75 km/h.
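
To make the closed-form update concrete, the following is a minimal sketch of a Weighted Maximum Likelihood step for a Gaussian search distribution over tuning parameters. It is an illustration under stated assumptions, not the implementation used in this work: the exponential reward transformation with temperature `beta`, the regularizer `reg`, the three-dimensional parameter vector, and the quadratic `toy_reward` (a stand-in for a simulator rollout of the controller) are all hypothetical choices.

```python
import numpy as np


def wml_update(thetas, rewards, beta=3.0, reg=1e-6):
    """One closed-form WML update of a Gaussian search distribution.

    thetas:  (N, D) sampled parameter vectors.
    rewards: (N,) scalar return of each sample.
    beta:    temperature of the exponential weighting (assumed form).
    reg:     small diagonal term keeping the covariance positive definite.
    """
    thetas = np.asarray(thetas, dtype=float)
    rewards = np.asarray(rewards, dtype=float)

    # Scale rewards to [-1, 0] so that beta has a consistent meaning.
    span = rewards.max() - rewards.min()
    scaled = (rewards - rewards.max()) / (span if span > 0 else 1.0)

    # Higher reward -> larger weight; weights sum to one.
    weights = np.exp(beta * scaled)
    weights /= weights.sum()

    # Weighted maximum-likelihood estimates of mean and covariance.
    mean = weights @ thetas
    centered = thetas - mean
    cov = (centered * weights[:, None]).T @ centered
    cov += reg * np.eye(thetas.shape[1])
    return mean, cov


def toy_reward(theta):
    # Hypothetical stand-in for a rollout: best parameters are `target`.
    target = np.array([1.0, -2.0, 0.5])
    return -np.sum((theta - target) ** 2)


def run_search(iterations=30, samples=50, seed=0):
    """Sample, evaluate, and refit the Gaussian for a fixed number of rounds."""
    rng = np.random.default_rng(seed)
    mean = np.zeros(3)
    cov = 4.0 * np.eye(3)
    for _ in range(iterations):
        thetas = rng.multivariate_normal(mean, cov, size=samples)
        rewards = np.array([toy_reward(t) for t in thetas])
        mean, cov = wml_update(thetas, rewards)
    return mean, cov
```

Calling `run_search()` returns the final mean and covariance of the search distribution. Each round needs only reward evaluations and a weighted average, with no gradient of the reward, which is why the update is cheap relative to the rollouts themselves.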