Adjoint algorithmic differentiation by operator and function overloading is based on the interpretation of directed acyclic graphs resulting from evaluations of numerical simulation programs. The memory required to store the graph grows in proportion to the number of floating-point operations executed by the underlying program and quickly exceeds the available memory resources. Naive adjoint algorithmic differentiation therefore often becomes infeasible except for relatively simple numerical simulations. Access to the data associated with the graph can be classified as either sequential or random. The latter refers to memory access patterns defined by the adjacency relationship between vertices within the graph. Sequentially accessed data can be decomposed into blocks, which can be streamed across the system memory hierarchy, thus extending the amount of available memory, for example, to hard disks. Asynchronous I/O can help to mitigate the increased cost due to accesses to slower memory. Much larger problem instances can thus be solved without resorting to technically challenging user intervention such as checkpointing. Randomly accessed data should not have to be decomposed, as its block-wise streaming is likely to incur substantial overhead in computational cost due to data accesses across blocks. Consequently, the size of the randomly accessed memory required by an adjoint should be kept minimal in order to eliminate the need for decomposition. We propose a combination of dedicated memory for adjoint $L$-values with the exploitation of remainder bandwidth as a possible solution. Test results indicate significant savings in random access memory size while preserving overall computational efficiency.
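The distinction between sequentially and randomly accessed data can be illustrated with a minimal sketch of tape-based adjoint differentiation by operator overloading. It is not the implementation described here: the names `Tape`, `Var`, `blocks_in_reverse`, and `reverse_sweep` are hypothetical, the blocks are plain in-memory list slices standing in for blocks streamed from slower storage, and neither asynchronous I/O nor the proposed treatment of adjoint $L$-values is modeled.

```python
import math


class Tape:
    """Linearized graph: one entry per elementary operation."""

    def __init__(self):
        # Each entry is (result index, [(argument index, local partial), ...]).
        # Entries are written and later read strictly in order: sequential data.
        self.entries = []
        self.num_vars = 0

    def new_index(self):
        idx = self.num_vars
        self.num_vars += 1
        return idx


class Var:
    """Active variable whose overloaded operators record onto the tape."""

    def __init__(self, tape, value):
        self.tape = tape
        self.value = value
        self.idx = tape.new_index()

    def _record(self, value, args):
        out = Var(self.tape, value)
        self.tape.entries.append((out.idx, args))
        return out

    def __add__(self, other):
        return self._record(self.value + other.value,
                            [(self.idx, 1.0), (other.idx, 1.0)])

    def __mul__(self, other):
        return self._record(self.value * other.value,
                            [(self.idx, other.value), (other.idx, self.value)])


def sin(x):
    return x._record(math.sin(x.value), [(x.idx, math.cos(x.value))])


def blocks_in_reverse(entries, block_size):
    """Yield tape blocks from last to first (stand-in for streaming from disk)."""
    for end in range(len(entries), 0, -block_size):
        yield entries[max(0, end - block_size):end]


def reverse_sweep(tape, output, block_size=2):
    # The adjoint vector is indexed by graph adjacency: randomly accessed data.
    adjoints = [0.0] * tape.num_vars
    adjoints[output.idx] = 1.0
    for block in blocks_in_reverse(tape.entries, block_size):
        for out_idx, args in reversed(block):
            a = adjoints[out_idx]
            for arg_idx, partial in args:
                adjoints[arg_idx] += partial * a
    return adjoints


tape = Tape()
x = Var(tape, 2.0)
y = Var(tape, 3.0)
z = x * y + sin(x)

adj = reverse_sweep(tape, z)
dz_dx, dz_dy = adj[x.idx], adj[y.idx]  # analytically: y + cos(x) and x
```

In this sketch only `tape.entries` is consumed block by block, whereas `adjoints` must remain addressable as a whole throughout the reverse sweep, which is why its size is the quantity worth minimizing.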