用随机散布的噪音从大数据中提取信息 (Extract the information from the big data with randomly distributed noise) - 专知论文

会员服务 ·

0

INFORMS · 正则化项 · 噪声 · 估计/估计量 · 大数据 ·

2021 年 2 月 18 日

Extract the information from the big data with randomly distributed noise

翻译：用随机散布的噪音从大数据中提取信息

Jin Cheng,Jiantang Zhang,Min Zhong

In this manuscript, a purely data driven statistical regularization method is proposed for extracting the information from big data with randomly distributed noise. Since the variance of the noise maybe large, the method can be regarded as a general data preprocessing method in ill-posed problems, which is able to overcome the difficulty that the traditional regularization method unable to solve, and has superior advantage in computing efficiency. The unique solvability of the method is proved and a number of conditions are given to characterize the solution. The regularization parameter strategy is discussed and the rigorous upper bound estimation of confidence interval of the error in $L^2$ norm is established. Some numerical examples are provided to illustrate the appropriateness and effectiveness of the method.

翻译：在本手稿中,建议采用一种纯粹以数据驱动的统计正规化方法,以随机散布噪音的方式从大数据中提取信息;由于噪音的差异可能很大,该方法可被视为是处理不良问题的一般数据预处理方法,能够克服传统正规化方法无法解决的困难,在计算效率方面具有优势;该方法的独特可溶性得到证明,并规定了一些条件来说明解决办法的特点;讨论了正规化参数战略,并确定了以$L$2美元标准计算错误的严格上限估计信任度;提供了一些数字例子,以说明该方法是否适当和有效。

0

相关内容

INFORMS

《计算机信息》杂志发表高质量的论文，扩大了运筹学和计算的范围，寻求有关理论、方法、实验、系统和应用方面的原创研究论文、新颖的调查和教程论文，以及描述新的和有用的软件工具的论文。官网链接：https://pubsonline.informs.org/journal/ijoc

【AAAI2021】信息瓶颈和有监督表征解耦

【AAAI2021】信息瓶颈和有监督表征解耦

专知会员服务

21+阅读 · 2021年1月27日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

数据科学导论，54页ppt，Introduction to Data Science

数据科学导论，54页ppt，Introduction to Data Science

专知会员服务

42+阅读 · 2020年7月27日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

经济学中的数据科学，Data Science in Economics，附22页pdf

经济学中的数据科学，Data Science in Economics，附22页pdf

专知会员服务

36+阅读 · 2020年4月1日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【ECML-PKDD 2019】基于种子样本的Web数据抽取（Web Data Extraction with Seed Samples）

【ECML-PKDD 2019】基于种子样本的Web数据抽取（Web Data Extraction with Seed Samples）

专知会员服务

8+阅读 · 2019年12月3日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

已删除

将门创投

5+阅读 · 2019年10月29日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

人工智能 | SCI期刊专刊信息3条

人工智能 | SCI期刊专刊信息3条

Call4Papers

5+阅读 · 2019年1月10日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

大数据 | 顶级SCI期刊专刊/国际会议信息7条

大数据 | 顶级SCI期刊专刊/国际会议信息7条

Call4Papers

10+阅读 · 2018年12月29日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【推荐】(Python)多种模型(Naive Bayes, SVM, CNN, LSTM, etc)实现推文情感分析

【推荐】(Python)多种模型(Naive Bayes, SVM, CNN, LSTM, etc)实现推文情感分析

机器学习研究会

13+阅读 · 2017年12月25日

【推荐】用Tensorflow理解LSTM

【推荐】用Tensorflow理解LSTM

机器学习研究会

36+阅读 · 2017年9月11日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

Multilingual Chart-based Constituency Parse Extraction from Pre-trained Language Models

Arxiv

0+阅读 · 2021年4月12日

Creating Robust Deep Neural Networks With Coded Distributed Computing for IoT Systems

Arxiv

0+阅读 · 2021年4月9日

Localizing differences in smooths with simultaneous confidence bounds on the true discovery proportion

Arxiv

0+阅读 · 2021年4月9日

Ensemble Inference Methods for Models With Noisy and Expensive Likelihoods

Arxiv

0+阅读 · 2021年4月7日

Distributed Graph Convolutional Networks

Arxiv

19+阅读 · 2020年7月13日

Products of Euclidean metrics and applications to proximity questions among curves

Arxiv

3+阅读 · 2020年4月13日

Attention Forcing for Sequence-to-sequence Model Training

Attention Forcing for Sequence-to-sequence Model Training

Arxiv

7+阅读 · 2019年9月26日

Testing Matrix Rank, Optimally

Arxiv

3+阅读 · 2018年10月18日

On the loss of Fisher information in some multi-object tracking observation models

Arxiv

3+阅读 · 2018年3月26日

Active Learning from Positive and Unlabeled Data

Arxiv

3+阅读 · 2016年2月24日

VIP会员

文章信息

相关主题

估计/估计量

相关VIP内容

【AAAI2021】信息瓶颈和有监督表征解耦

【AAAI2021】信息瓶颈和有监督表征解耦

专知会员服务

21+阅读 · 2021年1月27日

INRIA 最新《机器学习理论》课程笔记，176页pdf

专知会员服务

51+阅读 · 2020年12月14日

数据科学导论，54页ppt，Introduction to Data Science

数据科学导论，54页ppt，Introduction to Data Science

专知会员服务

42+阅读 · 2020年7月27日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

Python分布式计算，171页pdf，Distributed Computing with Python

Python分布式计算，171页pdf，Distributed Computing with Python

专知会员服务

108+阅读 · 2020年5月3日

经济学中的数据科学，Data Science in Economics，附22页pdf

经济学中的数据科学，Data Science in Economics，附22页pdf

专知会员服务

36+阅读 · 2020年4月1日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【ECML-PKDD 2019】基于种子样本的Web数据抽取（Web Data Extraction with Seed Samples）

【ECML-PKDD 2019】基于种子样本的Web数据抽取（Web Data Extraction with Seed Samples）

专知会员服务

8+阅读 · 2019年12月3日

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知会员服务

36+阅读 · 2019年10月17日

机器学习入门的经验与建议

机器学习入门的经验与建议

专知会员服务

94+阅读 · 2019年10月10日

热门VIP内容

开通专知VIP会员享更多权益服务

【博士论文】面向可扩展深度神经网络的预测编码：理论与实践

如何快速获取数百万架无人机？

EMNLP 2025 | RTQA：递归思想求解复杂的时间知识图谱问答

组合式零样本学习综述

相关资讯

已删除

将门创投

5+阅读 · 2019年10月29日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

人工智能 | SCI期刊专刊信息3条

人工智能 | SCI期刊专刊信息3条

Call4Papers

5+阅读 · 2019年1月10日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

大数据 | 顶级SCI期刊专刊/国际会议信息7条

大数据 | 顶级SCI期刊专刊/国际会议信息7条

Call4Papers

10+阅读 · 2018年12月29日

Disentangled的假设的探讨

Disentangled的假设的探讨

CreateAMind

9+阅读 · 2018年12月10日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【推荐】(Python)多种模型(Naive Bayes, SVM, CNN, LSTM, etc)实现推文情感分析

【推荐】(Python)多种模型(Naive Bayes, SVM, CNN, LSTM, etc)实现推文情感分析

机器学习研究会

13+阅读 · 2017年12月25日

【推荐】用Tensorflow理解LSTM

【推荐】用Tensorflow理解LSTM

机器学习研究会

36+阅读 · 2017年9月11日

Auto-Encoding GAN

Auto-Encoding GAN

CreateAMind

7+阅读 · 2017年8月4日

相关论文

Multilingual Chart-based Constituency Parse Extraction from Pre-trained Language Models

Arxiv

0+阅读 · 2021年4月12日

Creating Robust Deep Neural Networks With Coded Distributed Computing for IoT Systems

Arxiv

0+阅读 · 2021年4月9日

Localizing differences in smooths with simultaneous confidence bounds on the true discovery proportion

Arxiv

0+阅读 · 2021年4月9日

Ensemble Inference Methods for Models With Noisy and Expensive Likelihoods

Arxiv

0+阅读 · 2021年4月7日

Distributed Graph Convolutional Networks

Arxiv

19+阅读 · 2020年7月13日

Products of Euclidean metrics and applications to proximity questions among curves

Arxiv

3+阅读 · 2020年4月13日

Attention Forcing for Sequence-to-sequence Model Training

Attention Forcing for Sequence-to-sequence Model Training

Arxiv

7+阅读 · 2019年9月26日

Testing Matrix Rank, Optimally

Arxiv

3+阅读 · 2018年10月18日

On the loss of Fisher information in some multi-object tracking observation models

Arxiv

3+阅读 · 2018年3月26日

Active Learning from Positive and Unlabeled Data

Arxiv

3+阅读 · 2016年2月24日

微信扫码咨询专知VIP会员