We consider the problem of training machine learning models over multi-relational data. The mainstream approach is to first construct the training dataset using a feature extraction query over the input database and then train the model with a statistical software package of choice. In this paper we introduce Iterative Functional Aggregate Queries (IFAQ), a framework that realizes an alternative approach. IFAQ treats the feature extraction query and the learning task as a single program given in IFAQ's domain-specific language, which captures a subset of Python commonly used in Jupyter notebooks for rapid prototyping of machine learning applications. The program is subject to several layers of IFAQ optimizations, such as algebraic transformations, loop transformations, schema specialization, and data layout optimizations, and is finally compiled into efficient low-level C++ code specialized for the given workload and data. We show that a Scala implementation of IFAQ can outperform mlpack, Scikit-learn, and TensorFlow by several orders of magnitude for linear regression and regression tree models over several relational datasets.
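To make the setting concrete, the following is a minimal, purely illustrative sketch of the kind of Jupyter-style program IFAQ targets: a feature extraction join over two toy relations followed by a learning task (linear regression via batch gradient descent), written as one plain-Python program. The relation names, features, and hyperparameters are hypothetical and not taken from the paper.

```python
# Two toy relations: sales(store, item, units) and items(item, price).
sales = [(1, "a", 3), (1, "b", 5), (2, "a", 2)]
items = {"a": 2.0, "b": 4.0}

# Feature extraction query: join sales with items on the item key.
X, y = [], []
for store, item, units in sales:
    price = items[item]
    X.append((1.0, price))   # bias feature plus item price
    y.append(float(units))   # target: units sold

# Learning task: linear regression trained by batch gradient descent.
w = [0.0, 0.0]
lr = 0.1
for _ in range(2000):
    grad = [0.0, 0.0]
    for xi, yi in zip(X, y):
        err = w[0] * xi[0] + w[1] * xi[1] - yi
        grad[0] += err * xi[0]
        grad[1] += err * xi[1]
    n = len(X)
    w = [w[j] - lr * grad[j] / n for j in range(2)]
```

In the mainstream pipeline, the join materializes the training matrix before learning begins; IFAQ instead sees both phases as one program and can optimize across the boundary, e.g. pushing aggregates past the join, before emitting specialized C++.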