Architecture search is the process of automatically learning the neural model or cell structure that best suits the given task. Recently, this approach has shown promising performance improvements (on language modeling and image classification) with reasonable training speed, using a weight-sharing strategy called Efficient Neural Architecture Search (ENAS). In our work, we first introduce a novel continual architecture search (CAS) approach, which continually evolves the model parameters during the sequential training of several tasks without losing performance on previously learned tasks (via block-sparsity and orthogonality constraints), thus enabling lifelong learning. Next, we explore a multi-task architecture search (MAS) approach over ENAS for finding a unified, single cell structure that performs well across multiple tasks (via joint controller rewards), and hence allows more generalizable transfer of the cell-structure knowledge to an unseen new task. We empirically show the effectiveness of our sequential continual-learning and parallel multi-task-learning based architecture search approaches on diverse sentence-pair classification tasks (GLUE) and multimodal-generation based video captioning tasks. Further, we present several ablations and analyses on the learned cell structures.