Parallel programming remains one of the most challenging aspects of high-performance computing (HPC), requiring deep knowledge of synchronization, communication, and memory models. Although modern C++ standards and frameworks such as OpenMP and MPI have simplified parallelism, mastering these paradigms is still difficult. Large language models (LLMs) have recently shown promise in automating code generation, but their effectiveness in producing correct and efficient HPC code is not well understood. In this work, we systematically evaluate leading LLMs, including ChatGPT-4, ChatGPT-5, Claude, and LLaMA, on the task of generating C++ implementations of the Mandelbrot set using shared-memory, directive-based, and distributed-memory paradigms. Each generated program is compiled with GCC 11.5.0 and executed to assess its correctness, robustness, and scalability. Results show that ChatGPT-4 and ChatGPT-5 achieve strong syntactic precision and scalable performance.
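To make the benchmark task concrete, the following is a minimal sketch of what a directive-based (OpenMP) Mandelbrot kernel in C++ might look like. It is not one of the evaluated LLM outputs, and it is not taken from the study's prompts. The function names `mandelbrot_iterations` and `render_mandelbrot`, the sampled region [-2, 1] x [-1.5, 1.5], and the escape radius of 2 are all assumptions chosen for illustration.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper: escape-time iteration count for a single point c.
// Returns max_iter when the orbit stays within the escape radius (assumed 2).
int mandelbrot_iterations(std::complex<double> c, int max_iter) {
    std::complex<double> z{0.0, 0.0};
    int n = 0;
    // std::norm returns |z|^2, so compare against 2^2 = 4.
    while (n < max_iter && std::norm(z) <= 4.0) {
        z = z * z + c;
        ++n;
    }
    return n;
}

// Hypothetical driver: fills a row-major width x height grid of iteration
// counts over the assumed region [-2, 1] x [-1.5, 1.5].
std::vector<int> render_mandelbrot(int width, int height, int max_iter) {
    std::vector<int> image(static_cast<std::size_t>(width) * height);

    // Rows are independent, so the outer loop can be shared among threads.
    // Dynamic scheduling is used because rows crossing the set cost more.
    // Without OpenMP enabled the pragma is ignored and the loop runs serially.
    #pragma omp parallel for schedule(dynamic)
    for (int y = 0; y < height; ++y) {
        const double im = -1.5 + 3.0 * y / height;
        for (int x = 0; x < width; ++x) {
            const double re = -2.0 + 3.0 * x / width;
            image[static_cast<std::size_t>(y) * width + x] =
                mandelbrot_iterations(std::complex<double>(re, im), max_iter);
        }
    }
    return image;
}
```

With GCC, a sketch like this would typically be built with `-fopenmp` to enable the directive. Each thread writes to distinct elements of `image`, so no explicit synchronization is needed, which is one reason the Mandelbrot set is a common first test for parallel code.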