Dynamic Memory Access Optimization on Multi-core Platforms - Zhuanzhi (专知) Funds

Memory access optimization · Dynamic optimization · Debugging

December 31, 2013

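Dynamic Memory Access Optimization on Multi-core Platforms

National Natural Science Fund

National Natural Science Foundation of China (NSFC)

Project title: Dynamic Memory Access Optimization on Multi-core Platforms

Project number: No.61303051

Project type: Young Scientists Fund project

Approval year: 2014

Discipline: Automation technology, computer technology

Project author: 王振江 (Wang Zhenjiang)

Affiliation: Institute of Computing Technology, Chinese Academy of Sciences

Funding: CNY 270,000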

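Chinese abstract (translated): The speed gap between processor and memory has long been the main performance bottleneck of computer systems. Multi-core platforms increase total compute capability without a proportional increase in memory bandwidth, which makes bandwidth a second bottleneck. Analyzing a program's memory access behavior at run time and using dynamic optimization to relieve the "slow" and "crowded" aspects of memory access is a topic of significant research value. The project seeks breakthroughs in three areas. (1) Multi-threaded memory pool optimization. By collecting runtime information and controlling data layout, it improves data locality while adapting to the different memory access patterns of multi-threaded programs, and it detects, predicts, and handles false sharing. (2) Dynamic data re-layout. It studies safe, low-overhead methods for moving data at run time, to adapt to continually changing hot memory access sequences. (3) Memory contention model and task scheduling algorithm. To compensate for the bias and limitations of any single metric, it studies a memory contention model that combines multiple metrics and gives better guidance to task scheduling and other optimizations. The results will relieve the memory wall problem from several angles, including data layout at allocation time, data layout after allocation, and memory contention, and will improve the overall performance of computer systems.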

Chinese keywords (translated): multi-core; memory access optimization; dynamic optimization; profiling; debugging

English abstract: The speed gap between processor and memory has long been the main performance bottleneck of computer systems. Multi-core architectures improve processing capability but not memory bandwidth, making the latter another bottleneck. Analyzing runtime memory behavior and relieving memory problems through dynamic optimization is of great research value. We will study the following issues. (1) Multi-threaded memory pool allocation. It profiles runtime information and manages data layout at allocation time. It also adapts to the various memory access characteristics of multi-threaded programs, and can detect, predict, and handle false sharing. (2) Dynamic data layout rearrangement. It safely moves data at run time with little overhead when the hot memory access sequence changes. (3) Memory contention model and scheduling algorithm. Instead of a single, limited metric, the model combines several metrics to describe the degree of memory contention. It can give better guidance to task scheduling as well as to other optimizations. The research targets the memory wall problem from several aspects, including allocation time, post-allocation time, and memory contention, and aims to improve the overall performance of computer systems.

English keywords: multi-core; memory optimization; dynamic optimization; profiling; debugging
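
Item (1) of the abstract mentions detecting and handling false sharing in a multi-threaded memory pool, without giving details. The sketch below is not the project's method; it only illustrates the general idea under stated assumptions: a 64-byte cache line, a hypothetical bump allocator (`bump_allocate`) that places blocks back to back from address 0, and a hypothetical checker (`find_shared_lines`) that flags cache lines holding data owned by more than one thread. Such lines are only candidates for false sharing, which actually occurs when different threads write to them.

```python
CACHE_LINE = 64  # assumed cache-line size in bytes; common, but not universal


def padded_size(size, line_size=CACHE_LINE):
    """Round `size` up to a whole number of cache lines."""
    return -(-size // line_size) * line_size


def bump_allocate(requests, pad, line_size=CACHE_LINE):
    """Lay out (thread_id, size) requests back to back, starting at address 0.

    With pad=True every block is rounded up to a cache-line multiple, so no
    two blocks share a line. Returns a list of (thread_id, addr, size).
    """
    blocks = []
    addr = 0
    for tid, size in requests:
        blocks.append((tid, addr, size))
        addr += padded_size(size, line_size) if pad else size
    return blocks


def find_shared_lines(blocks, line_size=CACHE_LINE):
    """Return the sorted cache-line indices touched by blocks of 2+ threads."""
    owners = {}
    for tid, addr, size in blocks:
        first = addr // line_size
        last = (addr + size - 1) // line_size
        for line in range(first, last + 1):
            owners.setdefault(line, set()).add(tid)
    return sorted(line for line, tids in owners.items() if len(tids) > 1)


if __name__ == "__main__":
    requests = [(0, 24), (1, 24), (0, 40)]  # (thread_id, size in bytes)
    print(find_shared_lines(bump_allocate(requests, pad=False)))
    print(find_shared_lines(bump_allocate(requests, pad=True)))
```

Padding trades memory for isolation: each small block occupies a full line, so a real allocator would more likely group blocks by owning thread than pad every block individually.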
