Adaptive control approaches yield high-performance controllers when a precise system model or suitable parametrizations of the controller are available. Existing data-driven approaches for adaptive control mostly augment standard model-based methods with additional information about uncertainties in the dynamics or about disturbances. In this work, we propose a purely data-driven, model-free approach for adaptive control. Tuning low-level controllers based solely on system data raises concerns about the safety and computational performance of the underlying algorithm. Thus, our approach builds on GoOSE, an algorithm for safe and sample-efficient Bayesian optimization. We introduce several computational and algorithmic modifications to GoOSE that enable its practical use on a rotational motion system. We numerically demonstrate for several types of disturbances that our approach is sample efficient, outperforms constrained Bayesian optimization in terms of safety, and achieves the performance optima computed by grid evaluation. We further demonstrate the proposed adaptive control approach experimentally on a rotational motion system.
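
To make the idea of safe, data-driven controller tuning concrete, here is a minimal sketch. It is not GoOSE and uses no Gaussian process. It assumes a hypothetical one-dimensional controller gain, a made-up performance function `cost`, a made-up safety measure `safety_margin`, and a known Lipschitz constant `LIPSCHITZ`. It only ever evaluates gains that the Lipschitz bound certifies as safe.

```python
# Simplified safe exploration over a 1D gain grid (illustrative; not GoOSE).
# Every function and constant here is an assumption made for this sketch.

def cost(k):
    """Hypothetical tracking cost, minimized at k = 2.0."""
    return (k - 2.0) ** 2

def safety_margin(k):
    """Hypothetical safety measure; a gain is safe when this is >= 0."""
    return 3.0 - k  # gains above 3.0 are unsafe

LIPSCHITZ = 1.0  # assumed known bound on the slope of safety_margin

def safe_tune(grid, k_init, n_iter=20):
    # Maps each evaluated gain to (cost, safety margin).
    evaluated = {k_init: (cost(k_init), safety_margin(k_init))}
    for _ in range(n_iter):
        # A candidate is certified safe if some evaluated gain guarantees a
        # non-negative margin there under the Lipschitz assumption.
        candidates = [
            k for k in grid
            if k not in evaluated
            and any(m - LIPSCHITZ * abs(k - ke) >= 0.0
                    for ke, (_, m) in evaluated.items())
        ]
        if not candidates:
            break
        # Explore the certified gain farthest from everything evaluated so far.
        k_next = max(candidates,
                     key=lambda k: min(abs(k - ke) for ke in evaluated))
        evaluated[k_next] = (cost(k_next), safety_margin(k_next))
    best = min(evaluated, key=lambda k: evaluated[k][0])
    return best, evaluated

grid = [i * 0.25 for i in range(21)]  # candidate gains 0.0 ... 5.0
best_gain, history = safe_tune(grid, k_init=0.5)
```

In this toy setting, exploration starts from a gain assumed to be safe and widens the certified region step by step. A Bayesian-optimization method would replace the Lipschitz certificate with confidence bounds from a learned model.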


Related content

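Optimization is a branch of applied mathematics concerned with choosing, under given constraints, the course of action that makes an objective optimal. Optimization problems are applied extremely widely today in the military, engineering, management, and other fields.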

Estimating time-varying graphical models is of paramount importance in various social, financial, biological, and engineering systems, since the evolution of such networks can be utilized, for example, to spot trends, detect anomalies, predict vulnerability, and evaluate the impact of interventions. Existing methods require extensive tuning of parameters that control the graph sparsity and temporal smoothness. Furthermore, these methods are computationally burdensome, with time complexity O(NP^3) for P variables and N time points. As a remedy, we propose a low-complexity tuning-free Bayesian approach, named BADGE. Specifically, we impose temporally-dependent spike-and-slab priors on the graphs such that they are sparse and vary smoothly across time. A variational inference algorithm is then derived to learn the graph structures from the data automatically. Owing to the pseudo-likelihood and the mean-field approximation, the time complexity of BADGE is only O(NP^2). Additionally, by identifying the frequency-domain resemblance to the time-varying graphical models, we show that BADGE can be extended to learning frequency-varying inverse spectral density matrices, and yields graphical models for multivariate stationary time series. Numerical results on both synthetic and real data show that BADGE can better recover the underlying true graphs, while being more efficient than the existing methods, especially for high-dimensional cases.


We consider the problem of chance constrained optimization, where one seeks to optimize a function and satisfy constraints, both of which are affected by uncertainties. The real-world variants of this problem are particularly challenging because of their inherent computational cost. To tackle such problems, we propose a new Bayesian optimization method. It applies to the situation where the uncertainty comes from some of the inputs, so that it becomes possible to define an acquisition criterion in the joint controlled-uncontrolled input space. The main contribution of this work is an acquisition criterion that accounts for both the average improvement in objective function and the constraint reliability. The criterion is derived following the Stepwise Uncertainty Reduction logic and its maximization provides both optimal controlled and uncontrolled parameters. Analytical expressions are given to efficiently calculate the criterion. Numerical studies on test functions are presented. It is found through experimental comparisons with alternative sampling criteria that the match between the sampling criterion and the problem contributes to the efficiency of the overall optimization. As a side result, an expression for the variance of the improvement is given.
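
The problem formulation above (minimize the expected objective subject to the constraint holding with probability at least 1 - alpha) can be illustrated with a brute-force Monte Carlo sketch. This is not the acquisition criterion from the abstract. The functions `objective` and `constraint`, the standard normal uncontrolled input, and the grid search are all assumptions chosen for illustration.

```python
import random

def objective(x, u):
    """Hypothetical objective; x is controlled, u is the uncertain input."""
    return (x - 1.0) ** 2 + 0.1 * u

def constraint(x, u):
    """Hypothetical constraint, satisfied when the value is <= 0."""
    return x + u - 2.0

def evaluate(x, n_samples=20000, seed=0):
    """Monte Carlo estimates of E_u[objective] and P_u[constraint <= 0]."""
    rng = random.Random(seed)  # same seed for every x: common random numbers
    total, satisfied = 0.0, 0
    for _ in range(n_samples):
        u = rng.gauss(0.0, 1.0)  # assumed distribution of the uncertain input
        total += objective(x, u)
        if constraint(x, u) <= 0.0:
            satisfied += 1
    return total / n_samples, satisfied / n_samples

def solve(grid, alpha=0.05):
    """Pick the grid point with the lowest mean objective among those whose
    estimated constraint reliability is at least 1 - alpha."""
    feasible = []
    for x in grid:
        mean_f, reliability = evaluate(x)
        if reliability >= 1.0 - alpha:
            feasible.append((mean_f, x))
    return min(feasible)[1] if feasible else None

grid = [i * 0.1 for i in range(11)]  # controlled input from 0.0 to 1.0
best_x = solve(grid)
```

The unconstrained minimizer of this toy objective (x = 1) violates the reliability requirement, so the sketch settles on a smaller x. The cost of such exhaustive sampling is what motivates a sample-efficient Bayesian optimization approach.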


In this paper, we aim to improve the robustness of dynamic quadrupedal locomotion through two aspects: 1) fast model predictive foothold planning, and 2) applying LQR to projected inverse dynamic control for robust motion tracking. In our proposed planning and control framework, foothold plans are updated at 400 Hz considering the current robot state and an LQR controller generates optimal feedback gains for motion tracking. The LQR optimal gain matrix with non-zero off-diagonal elements leverages the coupling of dynamics to compensate for system underactuation. Meanwhile, the projected inverse dynamic control complements the LQR to satisfy inequality constraints. In addition to these contributions, we show robustness of our control framework to unmodeled adaptive feet. Experiments on the quadruped ANYmal demonstrate the effectiveness of the proposed method for robust dynamic locomotion given external disturbances and environmental uncertainties.
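
As a rough illustration of how an LQR controller produces an optimal feedback gain, the sketch below iterates the discrete-time Riccati recursion for a scalar system. The quadruped controller in the abstract uses a full gain matrix with off-diagonal terms; the scalar plant and the values of `a`, `b`, `q`, and `r` here are purely hypothetical.

```python
def scalar_dlqr(a, b, q, r, n_iter=500):
    """Iterate the Riccati recursion for x[t+1] = a*x[t] + b*u[t] with stage
    cost q*x^2 + r*u^2. Returns (k, p): feedback gain for u = -k*x and the
    cost-to-go coefficient."""
    p = q
    for _ in range(n_iter):
        k = a * b * p / (r + b * b * p)
        p = q + a * a * p - a * b * p * k
    k = a * b * p / (r + b * b * p)
    return k, p

a, b = 1.2, 1.0  # hypothetical open-loop unstable plant (|a| > 1)
k, p = scalar_dlqr(a, b, q=1.0, r=1.0)
closed_loop_pole = a - b * k  # magnitude below 1 means the loop is stable
```

In the matrix case the same recursion applies with matrix products and an inverse in place of the scalar division, and the resulting gain matrix is generally dense.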


We design a simple reinforcement learning agent that, with a specification only of suitable internal state dynamics and a reward function, can operate with some degree of competence in any environment. The agent maintains visitation counts and value estimates for each associated state-action pair. The value function is updated incrementally in response to temporal differences and optimistic boosts that encourage exploration. The agent executes actions that are greedy with respect to this value function. We establish a regret bound demonstrating convergence to near-optimal per-period performance, where the time taken to achieve near-optimality is polynomial in the number of internal states and actions, as well as the reward averaging time of the best policy within the reference policy class, which comprises those policies that depend on history only through the agent's internal state. Notably, there is no further dependence on the number of environment states or mixing times associated with other policies or statistics of history. Our result sheds light on the potential benefits of (deep) representation learning, which has demonstrated the capability to extract compact and relevant features from high-dimensional interaction histories.
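
The ingredients listed above (visitation counts, temporal-difference updates, optimistic boosts, greedy action selection) can be sketched in tabular form. This is an assumption-laden illustration, not the agent analyzed in the abstract: it uses a discounted objective, a made-up four-state chain environment, an optimistic initial value, and step sizes and boosts that both decay as 1/sqrt(count).

```python
import math

N_STATES, N_ACTIONS = 4, 2     # hypothetical chain; action 0 = left, 1 = right
GAMMA, BOOST_SCALE = 0.9, 1.0  # assumed discount factor and boost size
Q_INIT = 1.0 / (1.0 - GAMMA)   # optimistic initial value (maximum return)

def step(s, a):
    """Deterministic chain: reward 1 whenever the rightmost state is entered."""
    s_next = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    reward = 1.0 if s_next == N_STATES - 1 else 0.0
    return s_next, reward

def run(n_steps=5000):
    q = [[Q_INIT] * N_ACTIONS for _ in range(N_STATES)]
    counts = [[0] * N_ACTIONS for _ in range(N_STATES)]
    s = 0
    for _ in range(n_steps):
        # Greedy action with respect to the current value estimates.
        a = max(range(N_ACTIONS), key=lambda i: q[s][i])
        s_next, r = step(s, a)
        counts[s][a] += 1
        n = counts[s][a]
        step_size = 1.0 / math.sqrt(n)
        boost = BOOST_SCALE / math.sqrt(n)  # shrinks as the pair is revisited
        # Temporal-difference update toward the boosted one-step target.
        target = r + boost + GAMMA * max(q[s_next])
        q[s][a] += step_size * (target - q[s][a])
        s = s_next
    return q, counts

q_values, visit_counts = run()
```

Exploration here comes only from optimism: rarely tried actions keep inflated values until they have been visited enough for the boost to shrink, so no explicit randomization is needed.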


Demand for fast and economical parcel deliveries in urban environments has risen considerably in recent years. A framework envisions efficient last-mile delivery in urban environments by leveraging a network of ride-sharing vehicles, where Unmanned Aerial Systems (UASs) drop packages on said vehicles, which then cover the majority of the distance before final aerial delivery. Notably, we consider the problem of planning a rendezvous path for the UAS to reach a human driver, who may choose between N possible paths and has uncertain behavior, while meeting strict safety constraints. The long planning horizon and safety constraints require robust heuristics that combine learning and optimal control using Gaussian Process Regression, sampling-based optimization, and Model Predictive Control. The resulting algorithm is computationally efficient and shown to be effective in a variety of qualitative scenarios.


We accelerate deep reinforcement learning-based training in visually complex 3D environments by two orders of magnitude over prior work, realizing end-to-end training speeds of over 19,000 frames of experience per second on a single GPU and up to 72,000 frames per second on a single eight-GPU machine. The key idea of our approach is to design a 3D renderer and embodied navigation simulator around the principle of "batch simulation": accepting and executing large batches of requests simultaneously. Beyond exposing large amounts of work at once, batch simulation allows implementations to amortize in-memory storage of scene assets, rendering work, data loading, and synchronization costs across many simulation requests, dramatically improving the number of simulated agents per GPU and overall simulation throughput. To balance DNN inference and training costs with faster simulation, we also build a computationally efficient policy DNN that maintains high task performance, and modify training algorithms to maintain sample efficiency when training with large mini-batches. By combining batch simulation and DNN performance optimizations, we demonstrate that PointGoal navigation agents can be trained in complex 3D environments on a single GPU in 1.5 days to 97% of the accuracy of agents trained on a prior state-of-the-art system using a 64-GPU cluster over three days. We provide open-source reference implementations of our batch 3D renderer and simulator to facilitate incorporation of these ideas into RL systems.


Bayesian inference over the reward presents an ideal solution to the ill-posed nature of the inverse reinforcement learning problem. Unfortunately, current methods generally do not scale well beyond the small tabular setting due to the need for an inner-loop MDP solver, and even non-Bayesian methods that do themselves scale often require extensive interaction with the environment to perform well, making them inappropriate for high-stakes or costly applications such as healthcare. In this paper we introduce our method, Approximate Variational Reward Imitation Learning (AVRIL), that addresses both of these issues by jointly learning an approximate posterior distribution over the reward that scales to arbitrarily complicated state spaces alongside an appropriate policy in a completely offline manner through a variational approach to said latent reward. Applying our method to real medical data alongside classic control simulations, we demonstrate Bayesian reward inference in environments beyond the scope of current methods, as well as task performance competitive with focused offline imitation learning algorithms.


Because of continuous advances in mathematical programming, Mixed Integer Optimization has become competitive vis-a-vis popular regularization methods for selecting features in regression problems. The approach exhibits unquestionable foundational appeal and versatility, but also poses important challenges. We tackle these challenges, reducing computational burden when tuning the sparsity bound (a parameter which is critical for effectiveness) and improving performance in the presence of feature collinearity and of signals that vary in nature and strength. Importantly, we render the approach efficient and effective in applications of realistic size and complexity, without resorting to relaxations or heuristics in the optimization, or abandoning rigorous cross-validation tuning. Computational viability and improved performance in subtler scenarios are achieved with a multi-pronged blueprint, leveraging characteristics of the Mixed Integer Programming framework and by means of whitening, a data pre-processing step.


This paper presents a safety-aware learning framework that employs an adaptive model learning method together with barrier certificates for systems with possibly nonstationary agent dynamics. To extract the dynamic structure of the model, we use a sparse optimization technique, and the resulting model is used in combination with control barrier certificates, which constrain feedback controllers only when safety is about to be violated. Under some mild assumptions, solutions to the constrained feedback-controller optimization are guaranteed to be globally optimal, and the monotonic improvement of a feedback controller is thus ensured. In addition, we reformulate the (action-)value function approximation to make any kernel-based nonlinear function estimation method applicable. We then employ a state-of-the-art kernel adaptive filtering technique for the (action-)value function approximation. The resulting framework is verified experimentally on a brushbot, whose dynamics are unknown and highly complex.
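
The way a barrier certificate constrains a controller only near the safety boundary can be sketched for a one-dimensional integrator. Everything here is an assumption for illustration: the dynamics dx/dt = u, the barrier h(x) = x_max - x, the constant nominal command, and the Euler discretization. The learned models and kernel methods of the abstract are not represented.

```python
def cbf_filter(x, u_nominal, x_max, alpha):
    """Minimally modify u_nominal so the barrier h(x) = x_max - x stays >= 0.
    For dx/dt = u, the condition dh/dt >= -alpha * h reduces to
    u <= alpha * (x_max - x)."""
    return min(u_nominal, alpha * (x_max - x))

def simulate(x0=0.0, x_max=1.0, alpha=2.0, dt=0.01, n_steps=1000):
    x, trajectory, interventions = x0, [x0], 0
    for _ in range(n_steps):
        u_nominal = 1.0  # hypothetical nominal controller: always push forward
        u = cbf_filter(x, u_nominal, x_max, alpha)
        if u < u_nominal:
            interventions += 1  # the filter overrode the nominal command
        x += dt * u  # Euler step; stays safe as long as alpha * dt <= 1
        trajectory.append(x)
    return trajectory, interventions

trajectory, interventions = simulate()
```

Far from the boundary the nominal command passes through unchanged. Once the state comes close enough to `x_max`, the filter caps the command so the state approaches the boundary without crossing it.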


Current convolutional neural network algorithms for video object tracking spend the same amount of computation on each object and video frame. However, it is harder to track an object in some frames than in others, due to the varying amount of clutter, scene complexity, amount of motion, and the object's distinctiveness against its background. We propose a depth-adaptive convolutional Siamese network that performs video tracking adaptively at multiple neural network depths. Parametric gating functions are trained to control the depth of the convolutional feature extractor by minimizing a joint loss of computational cost and tracking error. Our network achieves accuracy comparable to the state-of-the-art on the VOT2016 benchmark. Furthermore, our adaptive depth computation achieves higher accuracy for a given computational cost than traditional fixed-structure neural networks. The presented framework extends to other tasks that use convolutional neural networks and enables trading speed for accuracy at runtime.

Related papers
Hang Yu, Songwei Wu, Justin Dauwels
0+ reads · March 15
Reda El Amri, Rodolphe Le Riche, Céline Helbert, Christophette Blanchet-Scalliet, Sébastien Da Veiga
0+ reads · March 15
Guiyang Xin, Songyan Xin, Oguzhan Cebe, Mathew Jose Pollayil, Franco Angelini, Manolo Garabini, Sethu Vijayakumar, Michael Mistry
0+ reads · March 13
Shi Dong, Benjamin Van Roy, Zhengyuan Zhou
0+ reads · March 13
Gabriel Barsi Haberfeld, Aditya Gahlawat, Naira Hovakimyan
0+ reads · March 12
Brennan Shacklett, Erik Wijmans, Aleksei Petrenko, Manolis Savva, Dhruv Batra, Vladlen Koltun, Kayvon Fatahalian
0+ reads · March 12
Alex J. Chan, Mihaela van der Schaar
0+ reads · March 11
Efficient and Effective $L_0$ Feature Selection
Ana Kenney, Francesca Chiaromonte, Giovanni Felici
5+ reads · August 7, 2018
Motoya Ohnishi, Li Wang, Gennaro Notomista, Magnus Egerstedt
4+ reads · January 29, 2018
Chris Ying, Katerina Fragkiadaki
8+ reads · January 1, 2018