We formulate sequential maximum a posteriori inference as a recursion of loss functions and reduce the problem of continual learning to approximating the previous loss function. We then propose two coreset-free methods: autodiff quadratic consolidation, which uses an accurate and full quadratic approximation, and neural consolidation, which uses a neural network approximation. Because these methods do not scale with the size of the neural network, we study them on classification tasks in combination with a fixed pre-trained feature extractor. We also introduce simple but challenging classical task sequences based on the Iris and Wine datasets. We find that neural consolidation performs well on the classical task sequences, where the input dimension is small, while autodiff quadratic consolidation performs consistently well on image task sequences with a fixed pre-trained feature extractor, achieving performance comparable to joint maximum a posteriori training in many cases.
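
As a rough illustration of the recursion described above, the following sketch writes it out under assumptions that are ours rather than taken from the paper: the symbols $\theta$, $\mathcal{D}_t$, $\ell_t$, $\mathcal{L}_t$, $\theta^*_{t-1}$ and $H_{t-1}$ are hypothetical notation, and the task datasets are assumed conditionally independent given the parameters. The quadratic form shown is a generic second-order Taylor expansion around a minimizer and may differ in detail from the approximation used by autodiff quadratic consolidation.

```latex
% Sequential Bayesian update for task t, assuming the datasets D_1, ..., D_t
% are conditionally independent given the parameters theta:
\begin{align}
  p(\theta \mid \mathcal{D}_{1:t})
    &\propto p(\mathcal{D}_t \mid \theta)\, p(\theta \mid \mathcal{D}_{1:t-1}).
\intertext{Taking negative logarithms gives a recursion of loss functions
(equalities hold up to additive constants that do not depend on $\theta$):}
  \mathcal{L}_t(\theta)
    &= \ell_t(\theta) + \mathcal{L}_{t-1}(\theta),
  \qquad
  \ell_t(\theta) = -\log p(\mathcal{D}_t \mid \theta),
  \qquad
  \mathcal{L}_0(\theta) = -\log p(\theta).
\intertext{Continual learning replaces the previous loss, which would require
past data, with an approximation $\hat{\mathcal{L}}_{t-1}$. One generic choice
is a full quadratic expansion around the previous minimizer
$\theta^*_{t-1}$, where the gradient term vanishes:}
  \hat{\mathcal{L}}_{t-1}(\theta)
    &= \mathcal{L}_{t-1}(\theta^*_{t-1})
     + \tfrac{1}{2}\,(\theta - \theta^*_{t-1})^{\top} H_{t-1}\,
       (\theta - \theta^*_{t-1}),
  \qquad
  H_{t-1} = \nabla^2_{\theta}\, \mathcal{L}_{t-1}(\theta)\big|_{\theta = \theta^*_{t-1}}.
\end{align}
```

Under this reading, the full Hessian $H_{t-1}$ has as many rows and columns as there are parameters, which is consistent with the abstract's remark that the methods do not scale with network size.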