使用近近内内内内内内嵌嵌入式分发课程的线性时间学习 (Linear-time Learning on Distributions with Approximate Kernel Embeddings) - 专知论文

会员服务 ·

0

核化 · Gram 矩阵 · 估计/估计量 · 近似 · 学成 ·

2021 年 1 月 14 日

Linear-time Learning on Distributions with Approximate Kernel Embeddings

翻译：使用近近内内内内内内嵌嵌入式分发课程的线性时间学习

Danica J. Sutherland,Junier B. Oliva,Barnabás Póczos,Jeff Schneider

Many interesting machine learning problems are best posed by considering instances that are distributions, or sample sets drawn from distributions. Previous work devoted to machine learning tasks with distributional inputs has done so through pairwise kernel evaluations between pdfs (or sample sets). While such an approach is fine for smaller datasets, the computation of an $N \times N$ Gram matrix is prohibitive in large datasets. Recent scalable estimators that work over pdfs have done so only with kernels that use Euclidean metrics, like the $L_2$ distance. However, there are a myriad of other useful metrics available, such as total variation, Hellinger distance, and the Jensen-Shannon divergence. This work develops the first random features for pdfs whose dot product approximates kernels using these non-Euclidean metrics, allowing estimators using such kernels to scale to large datasets by working in a primal space, without computing large Gram matrices. We provide an analysis of the approximation error in using our proposed random features and show empirically the quality of our approximation both in estimating a Gram matrix and in solving learning tasks in real-world and synthetic data.

翻译：许多有趣的机器学习问题最好通过考虑分布或分布中抽取的样本集的事例来提出。以前专门从事分配投入的机器学习工作的工作是通过对开式(或抽样组)之间的对称内核评估完成的。虽然这种方法对较小的数据集来说是好的,但在大型数据集中计算一个美元=乘以美元=乘以美元=Gram 矩阵是令人望而却步的。最近的可缩放估量器只在使用使用Euclidean 度量器的内核(如:$L_2美元=距离)的情况下才这样做。然而,还有其他许多有用的指标,如全变、Hellinger 距离和Jensen-hannon差异。这项工作为那些其点产品接近非欧元度量度的内核的pdf开发了第一个随机特性。在使用这些非欧元度量度仪表时,使用这种内核的测器的测算器通过在原始空间工作,不计算大格模矩阵等,来测量大型数据集的大小。我们用真实的随机特性和模拟模型分析在模拟中进行真实的模拟和模拟分析时的模拟数据质量。

0

相关内容

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

252+阅读 · 2020年4月19日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【文献综述】分布式机器学习综述论文，33页pdf，A Survey on Distributed Machine Learning

【文献综述】分布式机器学习综述论文，33页pdf，A Survey on Distributed Machine Learning

专知会员服务

124+阅读 · 2019年12月23日

【变分推断课件】Lectures on Variational Inference： Approximate Bayesian Inference in Machine Learning（附带pdf）

【变分推断课件】Lectures on Variational Inference： Approximate Bayesian Inference in Machine Learning（附带pdf）

专知会员服务

35+阅读 · 2019年11月30日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

244+阅读 · 2019年10月21日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

已删除

将门创投

8+阅读 · 2019年6月13日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

A prior-based approximate latent Riemannian metric

Arxiv

0+阅读 · 2021年3月9日

Approximate Bayesian inference and forecasting in huge-dimensional multi-country VARs

Approximate Bayesian inference and forecasting in huge-dimensional multi-country VARs

Arxiv

0+阅读 · 2021年3月8日

Suboptimality of Constrained Least Squares and Improvements via Non-Linear Predictors

Arxiv

0+阅读 · 2021年3月7日

Robust covariance estimation for distributed principal component analysis

Arxiv

0+阅读 · 2021年3月7日

Robust Mean Estimation on Highly Incomplete Data with Arbitrary Outliers

Arxiv

0+阅读 · 2021年3月6日

$γ$-ABC: Outlier-Robust Approximate Bayesian Computation Based on a Robust Divergence Estimator

Arxiv

0+阅读 · 2021年3月5日

Approximation Ratios of Graph Neural Networks for Combinatorial Problems

Arxiv

7+阅读 · 2019年5月24日

Feasibility Based Large Margin Nearest Neighbor Metric Learning

Arxiv

3+阅读 · 2018年5月2日

Learning Role-based Graph Embeddings

Arxiv

3+阅读 · 2018年2月7日

Variance-based regularization with convex objectives

Arxiv

5+阅读 · 2017年12月14日

VIP会员

文章信息

相关主题

估计/估计量

相关VIP内容

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

因果图，Causal Graphs，52页ppt

因果图，Causal Graphs，52页ppt

专知会员服务

252+阅读 · 2020年4月19日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【文献综述】分布式机器学习综述论文，33页pdf，A Survey on Distributed Machine Learning

【文献综述】分布式机器学习综述论文，33页pdf，A Survey on Distributed Machine Learning

专知会员服务

124+阅读 · 2019年12月23日

【变分推断课件】Lectures on Variational Inference： Approximate Bayesian Inference in Machine Learning（附带pdf）

【变分推断课件】Lectures on Variational Inference： Approximate Bayesian Inference in Machine Learning（附带pdf）

专知会员服务

35+阅读 · 2019年11月30日

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

【机器学习基础最新版】（Mathematics for Machine Learning），417页pdf

专知会员服务

244+阅读 · 2019年10月21日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

Stabilizing Transformers for Reinforcement Learning

Stabilizing Transformers for Reinforcement Learning

专知会员服务

60+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

俄乌战争启示：坦克战与不断演变的战斗形态

《大规模作战行动中与无人机集成的C5ISR系统》

《主观概率约束下寻找可行系统及其军事应用》69页

《美政府问责局：多种挑战影响地面战车任务出勤率》2025最新130页

相关资讯

局部学习的特征选择：Local-Learning-Based Feature Selection

局部学习的特征选择：Local-Learning-Based Feature Selection

我爱读PAMI

14+阅读 · 2019年9月20日

已删除

将门创投

8+阅读 · 2019年6月13日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

RL 真经

CreateAMind

5+阅读 · 2018年12月28日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

Hierarchical Disentangled Representations

Hierarchical Disentangled Representations

CreateAMind

4+阅读 · 2018年4月15日

分布式TensorFlow入门指南

分布式TensorFlow入门指南

机器学习研究会

4+阅读 · 2017年11月28日

【论文】变分推断（Variational inference)的总结

【论文】变分推断（Variational inference)的总结

机器学习研究会

39+阅读 · 2017年11月16日

相关论文

A prior-based approximate latent Riemannian metric

Arxiv

0+阅读 · 2021年3月9日

Approximate Bayesian inference and forecasting in huge-dimensional multi-country VARs

Approximate Bayesian inference and forecasting in huge-dimensional multi-country VARs

Arxiv

0+阅读 · 2021年3月8日

Suboptimality of Constrained Least Squares and Improvements via Non-Linear Predictors

Arxiv

0+阅读 · 2021年3月7日

Robust covariance estimation for distributed principal component analysis

Arxiv

0+阅读 · 2021年3月7日

Robust Mean Estimation on Highly Incomplete Data with Arbitrary Outliers

Arxiv

0+阅读 · 2021年3月6日

$γ$-ABC: Outlier-Robust Approximate Bayesian Computation Based on a Robust Divergence Estimator

Arxiv

0+阅读 · 2021年3月5日

Approximation Ratios of Graph Neural Networks for Combinatorial Problems

Arxiv

7+阅读 · 2019年5月24日

Feasibility Based Large Margin Nearest Neighbor Metric Learning

Arxiv

3+阅读 · 2018年5月2日

Learning Role-based Graph Embeddings

Arxiv

3+阅读 · 2018年2月7日

Variance-based regularization with convex objectives

Arxiv

5+阅读 · 2017年12月14日

微信扫码咨询专知VIP会员