Batch Optimization for DNA Synthesis

November 30, 2020

Konstantin Makarychev, Miklos Z. Racz, Cyrus Rashtchian, Sergey Yekhanin

Large pools of synthetic DNA molecules have recently been used to reliably store significant volumes of digital data. While DNA as a storage medium has enormous potential because of its high storage density, its practical use is currently severely limited by the high cost and low throughput of available DNA synthesis technologies. We study the role of batch optimization in reducing the cost of large-scale DNA synthesis, which translates to the following algorithmic task. Given a large pool $\mathcal{S}$ of random quaternary strings of fixed length, partition $\mathcal{S}$ into batches in a way that minimizes the sum of the lengths of the shortest common supersequences across batches. We introduce two ideas for batch optimization that both improve (in different ways) upon a naive baseline: (1) using both $(ACGT)^{*}$ and its reverse $(TGCA)^{*}$ as reference strands, and batching appropriately, and (2) batching via the quantiles of an appropriate ordering of the strands. We also prove asymptotically matching lower bounds on the cost of DNA synthesis, showing that one cannot improve upon these two ideas. Our results uncover a surprising separation between two cases that naturally arise in the context of DNA data storage: the asymptotic cost savings of batch optimization are significantly greater in the case where strings in $\mathcal{S}$ do not contain repeats of the same character (homopolymers), as compared to the case where strings in $\mathcal{S}$ are unconstrained.
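
The task in the abstract can be illustrated with a small Python sketch. It is not the authors' algorithm, and it rests on a simplified cost model that the abstract does not spell out: each batch is synthesized against a periodic reference strand, and its cost is taken to be the length of the shortest prefix of that reference containing every strand in the batch as a subsequence. That prefix is a common supersequence of the batch, so this cost is only an upper bound on the shortest common supersequence length. The function names (`cycles_needed`, `two_reference_batches`, `quantile_batches`) and the choice of ordering strands by their own cost are hypothetical, chosen for illustration.

```python
from typing import List, Sequence, Tuple


def cycles_needed(strand: str, period: str = "ACGT") -> int:
    """Length of the shortest prefix of period* that contains `strand`
    as a subsequence (greedy left-to-right embedding)."""
    k = len(period)
    pos = 0  # number of reference characters consumed so far
    for ch in strand:
        if ch not in period:
            raise ValueError(f"character {ch!r} not in reference period")
        # Skip reference characters until the next match for ch.
        while period[pos % k] != ch:
            pos += 1
        pos += 1  # consume the matching character
    return pos


def batch_cost(batch: Sequence[str], period: str = "ACGT") -> int:
    """Assumed cost model: the slowest strand determines the batch cost."""
    return max((cycles_needed(s, period) for s in batch), default=0)


def two_reference_batches(strands: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Idea (1), sketched: send each strand to whichever of the two
    references, (ACGT)* or (TGCA)*, embeds it in fewer cycles."""
    forward, reverse = [], []
    for s in strands:
        if cycles_needed(s, "ACGT") <= cycles_needed(s, "TGCA"):
            forward.append(s)
        else:
            reverse.append(s)
    return forward, reverse


def quantile_batches(strands: Sequence[str], k: int,
                     period: str = "ACGT") -> List[List[str]]:
    """Idea (2), sketched: order strands by their individual cost and
    cut the ordering into k contiguous, near-equal groups."""
    ordered = sorted(strands, key=lambda s: cycles_needed(s, period))
    size = -(-len(ordered) // k)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


if __name__ == "__main__":
    pool = ["ACGT", "TGCA"]
    # One batch against (ACGT)*: "TGCA" needs 13 cycles.
    print(batch_cost(pool, "ACGT"))
    # Two batches, one per reference: 4 + 4 cycles.
    fwd, rev = two_reference_batches(pool)
    print(batch_cost(fwd, "ACGT") + batch_cost(rev, "TGCA"))
```

On this toy pool the single-batch cost is 13, while splitting by reference direction gives a total of 8. The paper's actual batching rules and asymptotic guarantees are more refined than this sketch.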
