As chip-to-chip silicon photonics gain traction for their bandwidth and energy efficiency, collective communication has emerged as a critical bottleneck in scale-up systems. Programmable photonic interconnects offer a promising path forward: by dynamically reconfiguring the fabric, they can establish direct, high-bandwidth optical paths between communicating endpoints -- \emph{synchronously and guided by the structure of collective operations} (e.g., AllReduce). However, realizing this vision -- \emph{when light bends to the collective will} -- requires navigating a fundamental trade-off between reconfiguration delay and the performance gains of adaptive topologies. In this paper, we present a simple theoretical framework for adaptive photonic scale-up domains that makes this trade-off explicit and clarifies when reconfiguration is worthwhile. Along the way, we highlight a connection -- not surprising but still powerful -- between the Birkhoff--von Neumann (BvN) decomposition, maximum concurrent flow (a classic measure of network throughput), and the well-known $\alpha$--$\beta$ cost model for collectives. Finally, we outline a research agenda in algorithm design and systems integration that can build on this foundation.
翻译:暂无翻译