To conduct real-time analytics computations, big data stream processing engines are required to process unbounded data streams at millions of events per second. However, current streaming engines exhibit low throughput and high tuple processing latency. Performance engineering is complicated by the fact that streaming engines constitute complex distributed systems of multiple nodes in the cloud. A profiling technique is therefore required that can measure time durations across nodes with high accuracy. Standard clock synchronization techniques such as the network time protocol (NTP) are limited to millisecond accuracy and hence cannot be used. We propose a profiling technique that relates the time-stamp counters (TSCs) of nodes to measure the duration of events in a streaming framework. The precision of the TSC relation determines the accuracy of the measured durations. The TSC relation is established in quiescent periods of the network to achieve an accuracy in the tens of microseconds. We further propose a throughput-controlled data generator to reliably determine the sustainable throughput of a streaming engine. To facilitate high-throughput data ingestion, we propose a concurrent object factory that moves the deserialization overhead of incoming data tuples off the critical path of the streaming framework. The evaluation of the proposed techniques within the Apache Storm streaming framework on the Google Compute Engine public cloud shows that data ingestion increases from $700$ $\text{k}$ to $4.68$ $\text{M}$ tuples per second, and that time durations can be profiled at a measurement accuracy of $92$ $\mu\text{s}$, which is three orders of magnitude better than the accuracy of NTP and one order of magnitude better than prior work.
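To illustrate the cross-node timing idea summarized above, the following C sketch relates a remote node's TSC to the local one from request/response probes. This is a minimal sketch under stated assumptions, not the paper's implementation: the names (`read_tsc`, `relate_tsc`, `probe_t`) and the synthetic probe values are illustrative, the network probe exchange and the detection of quiescent periods are omitted, and both TSCs are assumed to tick at (nearly) the same rate over the measurement window.

```c
/*
 * Illustrative sketch: relate a remote node's TSC to the local TSC from
 * request/response probes taken during a quiescent network period.
 * All names and values are hypothetical; the probe exchange is omitted
 * and represented only by pre-recorded samples.
 */
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>          /* __rdtsc() on x86 with GCC/Clang */

static inline uint64_t read_tsc(void) {
    return __rdtsc();           /* read the local time-stamp counter */
}

/* One probe: local TSC at send, remote TSC echoed back, local TSC at receive. */
typedef struct {
    uint64_t t_send;
    uint64_t t_remote;
    uint64_t t_recv;
} probe_t;

/*
 * Pick the probe with the smallest round trip (least network interference),
 * assume the remote sample was taken at the midpoint of that round trip,
 * and report half the round trip as the uncertainty of the offset.
 */
static int64_t relate_tsc(const probe_t *p, int n, uint64_t *uncertainty) {
    int best = 0;
    for (int i = 1; i < n; i++)
        if (p[i].t_recv - p[i].t_send < p[best].t_recv - p[best].t_send)
            best = i;
    uint64_t rtt = p[best].t_recv - p[best].t_send;
    if (uncertainty)
        *uncertainty = rtt / 2;
    return (int64_t)(p[best].t_remote - (p[best].t_send + rtt / 2));
}

int main(void) {
    /* Synthetic probe samples for illustration only (values in TSC ticks). */
    probe_t probes[] = {
        { 1000, 51200, 1400 },
        { 2000, 52050, 2300 },   /* smallest round trip: used for the relation */
        { 3000, 53500, 3600 },
    };
    uint64_t err;
    int64_t offset = relate_tsc(probes, 3, &err);
    printf("remote - local TSC offset: %lld ticks (+/- %llu ticks)\n",
           (long long)offset, (unsigned long long)err);
    printf("local TSC now: %llu\n", (unsigned long long)read_tsc());
    return 0;
}
```

The uncertainty reported by `relate_tsc` shrinks as the round trip shrinks, which is why the relation is performed in quiescent periods of the network: with little competing traffic, the smallest observed round trip bounds the offset error in the tens of microseconds.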