Loss functions are fundamental to learning accurate 3D point cloud models, yet common choices trade geometric fidelity for computational cost. Chamfer Distance is efficient but permits many-to-one correspondences, while Earth Mover's Distance better reflects one-to-one transport but is computationally expensive. APML approximates transport with differentiable Sinkhorn iterations and an analytically derived temperature, but its dense formulation scales quadratically in memory. We present CUDA-APML, a sparse GPU implementation that thresholds negligible assignments and runs adaptive softmax, bidirectional symmetrization, and Sinkhorn normalization directly in COO form. This yields near-linear memory scaling and preserves gradients on the stored support, while pairwise distance evaluation remains quadratic in the current implementation. On ShapeNet and MM-Fi, CUDA-APML matches dense APML within a small tolerance while reducing peak GPU memory by 99.9%. Code is available at: https://github.com/Multimodal-Sensing-Lab/apml
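The sparse pipeline described above (temperature-scaled softmax over pairwise distances, thresholding of negligible assignments, and Sinkhorn normalization restricted to the stored COO support) can be sketched in NumPy. This is a minimal illustrative sketch, not the paper's implementation: the function name, default threshold, temperature, and normalization schedule are assumptions, and the dense distance matrix mirrors the quadratic distance step noted in the abstract.

```python
import numpy as np

def sparse_sinkhorn_coo(P, Q, temperature=0.1, threshold=1e-3, iters=20):
    """Illustrative sparse Sinkhorn on a thresholded COO support.

    P: (n, 3) predicted points, Q: (m, 3) target points.
    Returns the COO support (rows, cols, vals) and a transport-style loss.
    Hypothetical simplification of CUDA-APML, not its actual API.
    """
    n, m = len(P), len(Q)
    # Dense pairwise squared distances (this step remains quadratic
    # in the current implementation, per the abstract).
    D = ((P[:, None, :] - Q[None, :, :]) ** 2).sum(-1)
    # Temperature-scaled, numerically stabilized softmax over each row.
    K = np.exp(-(D - D.min(axis=1, keepdims=True)) / temperature)
    K /= K.sum(axis=1, keepdims=True)
    # Threshold negligible assignments; keep only the sparse COO support.
    rows, cols = np.nonzero(K > threshold)
    vals = K[rows, cols]
    # Sinkhorn iterations restricted to the stored support:
    # alternate row marginals -> 1/n, column marginals -> 1/m.
    eps = 1e-12
    for _ in range(iters):
        r = np.bincount(rows, weights=vals, minlength=n)
        vals *= (1.0 / n) / (r[rows] + eps)
        c = np.bincount(cols, weights=vals, minlength=m)
        vals *= (1.0 / m) / (c[cols] + eps)
    # Transport-style loss over the sparse support only.
    loss = float((vals * D[rows, cols]).sum())
    return rows, cols, vals, loss
```

In this sketch, memory after thresholding scales with the number of retained assignments rather than n·m, which is the mechanism behind the near-linear memory scaling claimed above.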