Few efforts have been made toward a unified memory orchestration framework that efficiently manages both host and remote idle memory. We present Valet, an efficient approach to orchestrating host and remote shared memory to improve the performance of memory-intensive workloads. The paper makes three original contributions. First, we redesign the data flow in the critical path by introducing a host-coordinated memory pool that works as a local cache, reducing the latency in the critical path of host and remote memory orchestration. Second, Valet utilizes unused local memory across containers by managing local memory through the Valet host-coordinated memory pool, which allows containers to dynamically expand and shrink their memory allocations according to workload demands. Third, Valet provides an efficient remote memory reclaiming technique on remote peers, based on two optimizations: (1) an activity-based victim selection scheme that selects the least-active chunk of data to serve eviction requests, and (2) a migration protocol that moves the least-active chunk of data to a remote node under less memory pressure. As a result, Valet can effectively reduce the performance impact and migration overhead on local nodes. Our extensive experiments on both NoSQL systems and machine learning (ML) workloads show that, for big data and ML workloads, Valet improves throughput by up to 226X and reduces latency by up to 98% over the conventional OS swap facility, and improves throughput by up to 5.5X and reduces latency by up to 78.4% over state-of-the-art remote paging systems. Valet is open sourced at https://github.com/git-disl/Valet.
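The third contribution, remote memory reclaiming, can be illustrated with a minimal sketch. This is not Valet's implementation: the `Chunk` and `Peer` types, the use of an access counter as the activity metric, and the use of free bytes as the memory-pressure signal are all assumptions made for illustration. The sketch only shows the two decisions the abstract names, namely picking the least-active chunk as the eviction victim and picking a less memory-pressured peer as the migration destination.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chunk:
    chunk_id: int
    access_count: int  # hypothetical activity metric (assumption)


@dataclass
class Peer:
    name: str
    free_bytes: int  # hypothetical memory-pressure signal (assumption)


def select_victim(chunks: List[Chunk]) -> Chunk:
    """Pick the least-active chunk to serve an eviction request."""
    return min(chunks, key=lambda c: c.access_count)


def select_destination(peers: List[Peer], exclude: str) -> Optional[Peer]:
    """Pick the peer with the most free memory, skipping the evicting node."""
    candidates = [p for p in peers if p.name != exclude]
    if not candidates:
        return None  # nowhere to migrate; the caller must handle this case
    return max(candidates, key=lambda p: p.free_bytes)


# Illustrative data only; the values are invented for the example.
chunks = [Chunk(0, 120), Chunk(1, 3), Chunk(2, 47)]
peers = [
    Peer("node-a", 1 << 30),
    Peer("node-b", 8 << 30),
    Peer("node-c", 2 << 30),
]

victim = select_victim(chunks)
destination = select_destination(peers, exclude="node-a")
```

Here `node-a` plays the role of the remote peer under memory pressure: it evicts its least-active chunk and hands it to the peer with the most free memory rather than pushing the data back to the local node.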