We introduce a new neural network-based continual learning algorithm, dubbed Uncertainty-regularized Continual Learning (UCL), which builds on the traditional Bayesian online learning framework with variational inference. We focus on two significant drawbacks of recently proposed regularization-based methods: a) considerable additional memory cost for determining the per-weight regularization strengths and b) the absence of a graceful forgetting scheme, which can prevent performance degradation when learning new tasks. In this paper, we show that UCL can solve these two problems by introducing a fresh interpretation of the Kullback-Leibler (KL) divergence term of the variational lower bound for the Gaussian mean-field approximation. Based on this interpretation, we propose the notion of node-wise uncertainty, which drastically reduces the number of additional parameters needed to implement per-weight regularization. Moreover, we devise two additional regularization terms that enforce stability by freezing parameters important for past tasks and allow plasticity by controlling the actively learning parameters for a new task. Through extensive experiments, we show that UCL convincingly outperforms most recent state-of-the-art baselines not only on popular supervised learning benchmarks, but also on challenging lifelong reinforcement learning tasks. The source code of our algorithm is available at https://github.com/csm9493/UCL.
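The KL divergence term referenced above has a closed form for the Gaussian mean-field approximation, which is what makes per-weight (or, in UCL, node-wise) regularization strengths tractable to interpret. The sketch below shows only this standard closed-form KL between diagonal Gaussians; it is illustrative background, not the paper's implementation, and the function and variable names are our own.

```python
import numpy as np

def diag_gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    """Closed-form KL(q || p) between diagonal Gaussians,
    summed over all weight dimensions. In Bayesian online
    learning, p is the posterior from previous tasks and q is
    the posterior being learned for the current task."""
    return np.sum(
        np.log(sigma_p / sigma_q)
        + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
        - 0.5
    )

# The (mu_q - mu_p)^2 / (2 sigma_p^2) term is where per-weight
# regularization strength appears: weights whose prior standard
# deviation sigma_p is small (i.e., low uncertainty after past
# tasks) are penalized more heavily for moving.
kl = diag_gaussian_kl(
    mu_q=np.array([1.0, 0.0]), sigma_q=np.array([1.0, 1.0]),
    mu_p=np.array([0.0, 0.0]), sigma_p=np.array([1.0, 1.0]),
)
```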