With the rapid development of information technologies, centralized data processing faces many limitations, such as computational overhead, communication delays, and data privacy leakage. Decentralized data processing over networked terminal nodes has therefore become an important technology in the era of big data. Dictionary learning is a powerful representation learning method that exploits the low-dimensional structure of high-dimensional data, which effectively reduces the storage and processing overhead of that data. In this paper, we propose a novel decentralized complete dictionary learning algorithm based on $\ell^{4}$-norm maximization. Comprehensive numerical experiments show that, compared with existing decentralized dictionary learning algorithms, the proposed algorithm has significant advantages in per-iteration computational complexity, communication cost, and convergence rate in many scenarios. Moreover, a rigorous theoretical analysis shows that, under certain conditions, the dictionaries learned by the proposed algorithm converge at a linear rate, with high probability, to the one learned by a centralized dictionary learning algorithm.
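To give some intuition for why maximizing the $\ell^{4}$-norm can recover a complete dictionary, the following is a minimal toy sketch and not the decentralized algorithm proposed in the paper. It assumes a 2x2 orthogonal dictionary (a plane rotation), Bernoulli-Gaussian sparse codes, and a plain grid search over the rotation angle. All function names and parameter values are hypothetical and chosen for illustration only.

```python
import math
import random


def rotate(angle, x):
    """Apply the 2x2 rotation R(angle) to the point x = (x1, x2)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])


def make_samples(true_angle, p, theta, seed=0):
    """Draw p observations y = R(true_angle) x with Bernoulli-Gaussian x.

    Each entry of x is nonzero with probability theta (assumed sparsity
    level) and standard normal when nonzero.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(p):
        x = tuple(rng.gauss(0.0, 1.0) if rng.random() < theta else 0.0
                  for _ in range(2))
        samples.append(rotate(true_angle, x))
    return samples


def l4_objective(angle, samples):
    """Sum of fourth powers of the entries of R(angle)^T y over all samples."""
    c, s = math.cos(angle), math.sin(angle)
    total = 0.0
    for y1, y2 in samples:
        z1 = c * y1 + s * y2       # first row of R(angle)^T times y
        z2 = -s * y1 + c * y2      # second row of R(angle)^T times y
        total += z1 ** 4 + z2 ** 4
    return total


def estimate_angle(samples, grid=180):
    """Grid search for the angle in [0, pi/2) that maximizes the l4 objective.

    The objective is unchanged by signed permutations of the rows, so it
    is periodic in the angle with period pi/2 and one quarter turn suffices.
    """
    step = (math.pi / 2) / grid
    return max((k * step for k in range(grid)),
               key=lambda a: l4_objective(a, samples))


if __name__ == "__main__":
    true_angle = 0.4                       # hypothetical ground truth
    data = make_samples(true_angle, p=4000, theta=0.3)
    print("estimated angle:", estimate_angle(data))
```

In this setting the sparse codes have a larger $\ell^{4}$-norm than any rotated, denser version of them, so the maximizing angle should land near the true one up to a multiple of $\pi/2$ (a signed permutation of the dictionary columns). Practical algorithms replace the grid search with an iterative update over orthogonal matrices.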