AutoWeka4MCPS-AVATAR: Accelerating Automated Machine Learning Pipeline Composition and Optimisation

Automated machine learning (ML) pipeline composition and optimisation aim to automate the search for the most promising ML pipelines within allocated resources (i.e., time, CPU and memory). Existing methods, such as the Bayesian-based and genetic-based optimisation implemented in Auto-Weka, Auto-sklearn and TPOT, evaluate pipelines by executing them. As a result, pipeline composition and optimisation with these methods frequently require so much time that they cannot explore complex pipelines to find better predictive models. To investigate this research challenge, we conducted experiments showing that many of the generated pipelines are invalid in the first place, so attempting to execute them wastes time and resources. To address this issue, we propose AVATAR, a novel method that evaluates the validity of ML pipelines without executing them by using a surrogate model. AVATAR builds a knowledge base by automatically learning the capabilities of ML algorithms and their effects on dataset characteristics. This knowledge base is used for a simplified mapping from an original ML pipeline to a surrogate model, which is a Petri net based pipeline. Instead of executing the original ML pipeline to evaluate its validity, AVATAR evaluates the surrogate model, which is constructed from the capabilities and effects of the pipeline's components and from simplified input/output mappings. Evaluating this surrogate model is less resource-intensive than executing the original pipeline. Consequently, AVATAR enables pipeline composition and optimisation methods to evaluate more pipelines by quickly rejecting invalid ones. We integrate AVATAR into sequential model-based algorithm configuration (SMAC). Our experiments show that SMAC finds better solutions when it employs AVATAR than it does on its own.
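The idea of checking validity from capabilities and effects, without running the pipeline, can be illustrated with a minimal Python sketch. This is not the AVATAR implementation: the abstract describes a Petri net based surrogate, whereas the sketch below reduces it to propagating a set of dataset characteristics through the pipeline. The characteristic names, the `Component` class, the `validate` function and the three knowledge-base entries are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical dataset characteristics (the "tokens" of the surrogate).
MISSING = "missing_values"
NOMINAL = "nominal_attributes"
NUMERIC = "numeric_attributes"


@dataclass(frozen=True)
class Component:
    """Surrogate of one pipeline step (hypothetical, not AVATAR's API)."""
    name: str
    # Characteristics the component can accept as input (capabilities).
    capabilities: frozenset
    # Characteristics the component removes from or adds to the data (effects).
    removes: frozenset = frozenset()
    adds: frozenset = frozenset()


def validate(pipeline, characteristics):
    """Return (is_valid, name of the first failing component or None)."""
    state = set(characteristics)
    for step in pipeline:
        # A step is invalid if the data has a characteristic it cannot handle.
        if not state <= step.capabilities:
            return False, step.name
        # Apply the step's effects to obtain the characteristics of its output.
        state = (state - step.removes) | step.adds
    return True, None


# Assumed knowledge-base entries, written by hand for this example.
imputer = Component("imputer", frozenset({MISSING, NOMINAL, NUMERIC}),
                    removes=frozenset({MISSING}))
one_hot = Component("one_hot", frozenset({NOMINAL, NUMERIC}),
                    removes=frozenset({NOMINAL}), adds=frozenset({NUMERIC}))
classifier = Component("numeric_only_classifier", frozenset({NUMERIC}))

data = {MISSING, NOMINAL, NUMERIC}
ok_result = validate([imputer, one_hot, classifier], data)
bad_result = validate([one_hot, classifier], data)
```

Under these assumptions, the first pipeline is accepted, while the second is rejected at its first step because the encoder is declared unable to handle missing values. Each check costs a few set operations, which is the kind of saving over full execution that the abstract attributes to the surrogate.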

