A sparse negative binomial mixture model for clustering RNA-seq count data - 专知 papers

Cluster · Gaussian mixture (model) · MoDELS · Gaussian mixture model · Sparse

December 5, 2019

A sparse negative binomial mixture model for clustering RNA-seq count data

Tanbin Rahman,Yujia Li,Tianzhou Ma,Lu Tang,George Tseng

Clustering with variable selection is a challenging but critical task for modern small-n-large-p data. Existing methods based on Gaussian mixture models or sparse K-means provide solutions for continuous data. Given the prevalence of RNA-seq technology and the lack of count data modeling for clustering, the current practice is to normalize count expression data into continuous measures and apply existing models with a Gaussian assumption. In this paper, we develop a negative binomial mixture model with gene regularization to cluster samples (small $n$) with high-dimensional gene features (large $p$). The EM algorithm and the Bayesian information criterion are used for inference and for determining tuning parameters. The method is compared with the sparse Gaussian mixture model and sparse K-means using extensive simulations and two real transcriptomic applications in breast cancer and rat brain studies. The results show superior performance of the proposed count data model in clustering accuracy, feature selection and biological interpretation by pathway enrichment analysis.
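To make the modeling idea concrete, the following is a minimal sketch of a negative binomial mixture fitted by EM, with BIC computed at the end. It is not the authors' implementation: it assumes a fixed, shared size parameter `r`, skips library-size normalization, and omits the gene-level regularization described in the abstract. The function names (`nb_logpmf`, `farthest_first`, `e_step`, `fit_nb_mixture`), the toy count matrix and the farthest-first initialization are all illustrative assumptions.

```python
import math


def nb_logpmf(y, mu, r):
    """Log pmf of a negative binomial with mean mu and size r (variance mu + mu^2 / r)."""
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))


def farthest_first(counts, k):
    """Deterministic initialization (an assumption of this sketch): start at
    sample 0, then repeatedly add the sample farthest, in log1p space, from
    the centers chosen so far."""
    logs = [[math.log1p(v) for v in row] for row in counts]
    chosen = [0]
    while len(chosen) < k:
        def dist_to_chosen(i):
            return min(sum((a - b) ** 2 for a, b in zip(logs[i], logs[c]))
                       for c in chosen)
        chosen.append(max(range(len(counts)), key=dist_to_chosen))
    return chosen


def e_step(counts, pi, mu, r):
    """Return per-sample responsibilities and the observed-data log-likelihood."""
    resp, loglik = [], 0.0
    for row in counts:
        logp = [math.log(pi[c]) + sum(nb_logpmf(y, m, r) for y, m in zip(row, mu[c]))
                for c in range(len(pi))]
        top = max(logp)  # log-sum-exp trick for numerical stability
        norm = top + math.log(sum(math.exp(v - top) for v in logp))
        loglik += norm
        resp.append([math.exp(v - norm) for v in logp])
    return resp, loglik


def fit_nb_mixture(counts, k, r=10.0, n_iter=50):
    """EM for a k-component NB mixture; rows are samples, columns are genes.
    Simplified: r is fixed and shared, and there is no sparsity penalty."""
    n, p = len(counts), len(counts[0])
    mu = [[max(counts[i][g], 0.5) for g in range(p)] for i in farthest_first(counts, k)]
    pi = [1.0 / k] * k
    for _iteration in range(n_iter):
        resp, _ = e_step(counts, pi, mu, r)
        for c in range(k):
            w = max(sum(resp[i][c] for i in range(n)), 1e-12)
            pi[c] = w / n
            # With r fixed, the M-step for each mean is the weighted sample mean.
            mu[c] = [max(sum(resp[i][c] * counts[i][g] for i in range(n)) / w, 1e-3)
                     for g in range(p)]
    resp, loglik = e_step(counts, pi, mu, r)
    labels = [max(range(k), key=lambda c: row[c]) for row in resp]
    n_params = k * p + (k - 1)  # cluster means plus free mixing proportions
    bic = -2.0 * loglik + n_params * math.log(n)
    return labels, bic


# Hypothetical toy data: 6 samples x 4 genes, two visibly different groups.
toy_counts = [
    [2, 3, 50, 48],
    [1, 4, 55, 60],
    [3, 2, 47, 52],
    [40, 45, 3, 2],
    [52, 38, 1, 4],
    [47, 50, 2, 3],
]
labels_k2, bic_k2 = fit_nb_mixture(toy_counts, k=2)
labels_k1, bic_k1 = fit_nb_mixture(toy_counts, k=1)
print(labels_k2, bic_k2, bic_k1)
```

In this sketch BIC is used only to compare one cluster against two on the toy data, whereas the abstract states that BIC is used to determine tuning parameters. A faithful version would also estimate dispersion and add the regularization term on gene-specific parameters.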

Related content

(Princeton lecture notes) High-dimensional probability, 326-page PDF, "Probability in High Dimension"

专知 member service

119+ reads · May 30, 2020

[Classic book] Data mining: theories, algorithms, and examples, 347-page PDF, Nong Ye, Arizona State University

专知 member service

80+ reads · February 27, 2020

[Classic book] Algorithms Design and Analysis, 727-page PDF, Oxford University Press

专知 member service

130+ reads · February 25, 2020

[New book] Digital image (imaging) processing handbook, second edition, 2176-page PDF, Mathematical Methods in Imaging

专知 member service

86+ reads · February 12, 2020

[A large collection of cross-lingual BERT models] Transfer learning is increasingly going multilingual with language-specific BERT models

专知 member service

52+ reads · January 30, 2020

UC Berkeley CS189 lecture notes: "A Comprehensive Guide to Machine Learning", 185-page PDF

专知 member service

158+ reads · January 16, 2020

Connections between Support Vector Machines, Wasserstein distance and gradient-penalty GANs

专知 member service

31+ reads · October 17, 2019

Experience and advice for getting started with machine learning

专知 member service

90+ reads · October 10, 2019

[Harvard Business School course, Fall 2019] Machine learning interpretability

专知 member service

98+ reads · October 9, 2019

[IJCAI 2019 Tutorials] Medical decision analysis with probabilistic graphical models

专知 member service

43+ reads · August 10, 2019

Hierarchically Structured Meta-learning

CreateAMind

23+ reads · May 22, 2019

Transferring Knowledge across Learning Processes

CreateAMind

25+ reads · May 18, 2019

Deep Self-Evolution Clustering

我爱读PAMI

14+ reads · April 13, 2019

Unsupervised meta-learning for representation learning

CreateAMind

25+ reads · January 4, 2019

Unsupervised Learning via Meta-Learning

CreateAMind

41+ reads · January 3, 2019

A discussion of the disentangled assumption

CreateAMind

9+ reads · December 10, 2018

[SIGIR 2018] Five papers on adversarial training

专知

12+ reads · July 9, 2018

[Recommended] (Python) Tweet sentiment analysis implemented with multiple models (Naive Bayes, SVM, CNN, LSTM, etc.)

机器学习研究会

12+ reads · December 25, 2017

Adversarial Variational Bayes: Unifying VAE and GAN (code)

CreateAMind

7+ reads · October 4, 2017

Auto-Encoding GAN

CreateAMind

7+ reads · August 4, 2017

On Feature Normalization and Data Augmentation

Arxiv

14+ reads · February 25, 2020

Efficient Parameter-free Clustering Using First Neighbor Relations

Arxiv

7+ reads · February 28, 2019

ClusterGAN: Latent Space Clustering in Generative Adversarial Networks

Arxiv

9+ reads · September 10, 2018

Sparse and Constrained Attention for Neural Machine Translation

Arxiv

4+ reads · May 21, 2018

Generative Model for Heterogeneous Inference

Arxiv

4+ reads · April 26, 2018

The Search Problem in Mixture Models

Arxiv

3+ reads · February 24, 2018

Negative Binomial Matrix Factorization for Recommender Systems

Arxiv

8+ reads · January 5, 2018

Multilingual Topic Models

Arxiv

3+ reads · December 18, 2017

SAR: Semantic Analysis for Recommendation

Arxiv

6+ reads · December 2, 2017

Deep CTR Prediction in Display Advertising

Arxiv

4+ reads · September 20, 2016
