Aurora is Argonne National Laboratory's pioneering exascale supercomputer, designed to accelerate scientific discovery with cutting-edge architectural innovations. Key new technologies include the Intel(TM) Xeon(TM) CPU Max Series (code-named Sapphire Rapids) with support for High Bandwidth Memory (HBM), alongside the Intel(TM) Data Center GPU Max Series (code-named Ponte Vecchio) on each compute node. Aurora also integrates Distributed Asynchronous Object Storage (DAOS), a novel exascale storage solution, and leverages Intel's oneAPI programming environment. This paper presents an in-depth exploration of Aurora's node architecture, the HPE Slingshot interconnect, the supporting software ecosystem, and DAOS. We provide insights into standard benchmark performance and application readiness efforts via Aurora's Early Science Program and the Exascale Computing Project.