When robots are deployed in the field for environmental monitoring, they typically execute pre-programmed motions, such as lawnmower paths, rather than adaptive methods, such as informative path planning. One reason is that adaptive methods depend on parameter choices that are both critical to set correctly and difficult for a non-specialist to make. Here, we show how to automatically configure a planner for informative path planning by training a reinforcement learning agent to select the planner's parameters at each planning iteration. We demonstrate our method on 37 instances of 3 distinct environments and compare it against pure (end-to-end) reinforcement learning techniques, as well as against approaches that do not use a learned model to change the planner parameters. Our method shows a 9.53% mean improvement in cumulative reward across diverse environments compared to end-to-end learning-based methods; we also demonstrate in a field experiment how it can readily be used to facilitate high-performance deployment of an information-gathering robot.