Stage-parallel fully implicit Runge-Kutta implementations with optimal multilevel preconditioners at the scaling limit - Zhuanzhi Papers

September 14, 2022

Stage-parallel fully implicit Runge-Kutta implementations with optimal multilevel preconditioners at the scaling limit

Peter Munch, Ivo Dravins, Martin Kronbichler, Maya Neytcheva

We present an implementation of a fully stage-parallel preconditioner for Radau IIA type fully implicit Runge--Kutta methods, which approximates the inverse of $A_Q$ from the Butcher tableau by the lower triangular matrix resulting from an LU decomposition and diagonalizes the system with as many blocks as stages. For the transformed system, we employ a block preconditioner where each block is distributed and solved by a subgroup of processes in parallel. To combine partial results, we use either a communication pattern resembling Cannon's algorithm or shared memory. A performance model and a large set of performance studies (including strong-scaling runs with up to 150k processes on 3k compute nodes) conducted for a time-dependent heat problem, using matrix-free finite element methods, indicate that the stage-parallel implementation can reach higher throughputs when the block solvers operate at lower parallel efficiencies, which occurs near the scaling limit. The achievable speedup increases linearly with the number of stages and is bounded by the number of stages. Furthermore, we show that the presented stage-parallel concepts are also applicable to the case in which $A_Q$ is directly diagonalized, which requires complex arithmetic or the solution of two-by-two blocks and sequentializes parts of the algorithm. As an alternative to distributing stages and assigning them to distinct processes, we discuss the possibility of batching operations from different stages together.
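The algebra behind the first sentence of the abstract can be illustrated on the smallest case. The sketch below is not the authors' implementation. It takes the standard two-stage Radau IIA Butcher matrix, inverts it, factors the inverse as $A_Q^{-1} = LU$, and diagonalizes the lower triangular factor $L$. Two things are assumptions made for illustration: the Crout convention (unit upper triangular $U$, so that $L$ carries distinct diagonal entries), which the abstract does not specify, and all helper names (`crout_lu`, `lower_triangular_eigenvectors`, and so on). Exact rational arithmetic is used so that the identities can be checked without rounding.

```python
from fractions import Fraction as F

# Butcher matrix A_Q of the two-stage Radau IIA method (standard tableau).
A = [[F(5, 12), F(-1, 12)],
     [F(3, 4), F(1, 4)]]


def matmul(X, Y):
    """Dense product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]


def crout_lu(M):
    """Crout factorization M = L U without pivoting.

    L is lower triangular, U is unit upper triangular. This convention is
    an assumption of this sketch, not something stated in the abstract.
    """
    n = len(M)
    L = [[F(0)] * n for _ in range(n)]
    U = [[F(int(i == j)) for j in range(n)] for i in range(n)]
    for j in range(n):
        for i in range(j, n):
            L[i][j] = M[i][j] - sum(L[i][k] * U[k][j] for k in range(j))
        for i in range(j + 1, n):
            U[j][i] = (M[j][i] - sum(L[j][k] * U[k][i] for k in range(j))) / L[j][j]
    return L, U


def lower_triangular_eigenvectors(L):
    """Eigenvectors (columns of S) of a lower triangular matrix.

    Requires distinct diagonal entries, which are the eigenvalues of L.
    """
    n = len(L)
    S = [[F(0)] * n for _ in range(n)]
    for j in range(n):
        S[j][j] = F(1)
        for i in range(j + 1, n):
            S[i][j] = (sum(L[i][k] * S[k][j] for k in range(j, i))
                       / (L[j][j] - L[i][i]))
    return S


A_inv = inverse_2x2(A)
L, U = crout_lu(A_inv)
S = lower_triangular_eigenvectors(L)
n = len(L)
Lam = [[L[i][i] if i == j else F(0) for j in range(n)] for i in range(n)]

assert matmul(L, U) == A_inv            # A_Q^{-1} = L U
assert matmul(L, S) == matmul(S, Lam)   # L S = S Lambda, i.e. L is diagonalized

print("eigenvalues of L:", [str(L[i][i]) for i in range(n)])
```

Under this convention the eigenvalues of $L$ are its diagonal entries, which are real and distinct here, so the transformation needs only real arithmetic and yields one decoupled block per stage. That is the property that allows each block to be handed to its own subgroup of processes. Diagonalizing $A_Q$ directly, as the abstract notes, would instead involve complex arithmetic or two-by-two blocks.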

