To facilitate emerging applications in 5G networks and beyond, mobile network operators will provide many powerful control functionalities such as RAN slicing and resource scheduling. These control functionalities generally comprise a series of prediction tasks, such as channel state information prediction, cellular traffic prediction, and user mobility prediction, which will be enabled by machine learning (ML) techniques. However, training ML models offline is inefficient due to the excessive overhead of forwarding the huge volume of data samples from cellular networks to remote ML training clouds. Thanks to the promising edge computing paradigm, we advocate cooperative online in-network ML training across edge clouds. To alleviate the data skew issue caused by the capacity heterogeneity and dynamics of edge clouds while avoiding excessive overhead, we propose Cocktail, a cost-efficient and data skew-aware online in-network distributed machine learning framework. We build a comprehensive model and formulate an online data scheduling problem to optimize the framework cost while reconciling the data skew from both short-term and long-term perspectives. We exploit stochastic gradient descent to devise an online asymptotically optimal algorithm. As its core building blocks, we propose optimal policies based on novel graph constructions to solve two subproblems, respectively. We also improve the proposed online algorithm with online learning for fast convergence of in-network ML training. A small-scale testbed and large-scale simulations validate the superior performance of our framework.