Memory resources in data centers generally suffer from low utilization and a lack of dynamic scalability. Memory disaggregation addresses these problems by decoupling CPU and memory; current approaches are based either on RDMA or on interconnect protocols such as Compute Express Link (CXL). However, the RDMA-based approach requires code refactoring and incurs higher latency. The CXL-based approach supports native memory semantics and overcomes the shortcomings of RDMA, but is limited to the rack level. In addition, CXL-based memory pooling and sharing products are still in early exploration and will take time to become available. In this paper, we propose a CXL-over-Ethernet approach in which the host processor accesses remote memory with native memory semantics over Ethernet. By taking advantage of both CXL and RDMA technologies, our approach supports native load/store memory access and extends the physical reach across server and rack levels. We prototype our approach with one server and two FPGA boards connected by a 100 Gbps network and measure the memory access latency. Furthermore, we optimize the memory access path by placing a data cache and a congestion control algorithm on the critical path to further lower access latency. The evaluation results show that the average latency for the server to access remote memory is 1.97 μs, which is about 37% lower than the baseline latency in the industry. The latency can be further reduced to 415 ns when an access hits a cache block on the FPGA.
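The role of the data cache on the critical path can be illustrated with a minimal sketch (this is not the paper's FPGA implementation; the cache geometry, the `remote_fetch` stand-in, and all names here are assumptions for illustration): a direct-mapped cache serves hits locally, corresponding to the fast ~415 ns path, while misses fall through to a fetch over the network, corresponding to the slower ~1.97 μs average path.

```c
#include <stdint.h>
#include <assert.h>

/* Illustrative sketch only: a tiny direct-mapped cache on the
 * remote-memory access path. Sizes and names are assumptions. */
#define LINE_SIZE 64            /* bytes per cache line   */
#define NUM_LINES 1024          /* cache capacity: 64 KiB */

typedef struct {
    uint64_t tag;               /* which remote line is cached */
    int      valid;
    uint8_t  data[LINE_SIZE];
} cache_line_t;

static cache_line_t cache[NUM_LINES];

/* Stand-in for the fetch over Ethernet (hypothetical): here we
 * just synthesize deterministic bytes instead of real I/O. */
static void remote_fetch(uint64_t line_addr, uint8_t *buf) {
    for (int i = 0; i < LINE_SIZE; i++)
        buf[i] = (uint8_t)(line_addr + i);
}

/* Load one byte of remote memory through the cache; *hit reports
 * whether the access was served locally (fast path) or required
 * a remote fetch (slow path). */
uint8_t cached_load(uint64_t addr, int *hit) {
    uint64_t line_addr = addr / LINE_SIZE;
    cache_line_t *line = &cache[line_addr % NUM_LINES];

    if (line->valid && line->tag == line_addr) {
        *hit = 1;                        /* local fast path */
    } else {
        *hit = 0;                        /* network slow path */
        remote_fetch(line_addr, line->data);
        line->tag   = line_addr;
        line->valid = 1;
    }
    return line->data[addr % LINE_SIZE];
}
```

The first access to a line misses and fills the cache; subsequent accesses to the same line hit and avoid the network round trip, which is the effect the evaluation measures.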