PageRank 算法的连续限制 (A continuum limit for the PageRank algorithm)

Semi-supervised and unsupervised machine learning methods often rely on graphs to model data, prompting research on how theoretical properties of operators on graphs are leveraged in learning problems. While most of the existing literature focuses on undirected graphs, directed graphs are very important in practice, giving models for physical, biological, or transportation networks, among many other applications. In this paper, we propose a new framework for rigorously studying continuum limits of learning algorithms on directed graphs. We use the new framework to study the PageRank algorithm, and show how it can be interpreted as a numerical scheme on a directed graph involving a type of normalized graph Laplacian. We show that the corresponding continuum limit problem, which is taken as the number of webpages grows to infinity, is a second-order, possibly degenerate, elliptic equation that contains reaction, diffusion, and advection terms. We prove that the numerical scheme is consistent and stable and compute explicit rates of convergence of the discrete solution to the solution of the continuum limit PDE. We give applications to proving stability and asymptotic regularity of the PageRank vector. Finally, we illustrate our results with numerical experiments and explore an application to data depth.

翻译：半监督和未经监督的机器学习方法往往依靠图表来模拟数据,促使研究如何在学习问题中利用图表操作者的理论特性。虽然大多数现有文献侧重于非定向图表,但定向图表在实践中非常重要,为物理、生物或运输网络提供了模型等许多其他应用。在本文件中,我们提议了一个新的框架,以严格研究定向图形上学习算法的连续性限制。我们使用新的框架来研究PageRank算法,并展示如何将其解读为在涉及一种普通图形 Laplacian类型的定向图形上的数字方案。我们展示了相应的连续性限制问题,即当网页数量增长到无限时,它是一个第二顺序,可能是退化的,椭圆方程式,包含反应、扩散和适应术语。我们证明数字方案是一致和稳定的,并且对离散溶方法与连续限制PageRank矢量的解决方案的一致率进行了计算。我们用应用来证明稳定的,并且对PageRank矢量的深度数据进行了精确的实验。最后,我们用数字实验和数字矢量来说明我们的结果。

相关内容

PageRank

关注 210

PageRank，网页排名，又称网页级别、Google左侧排名或佩奇排名，是一种由[1] 根据网页之间相互的超链接计算的技术，而作为网页排名的要素之一，以Google公司创办人拉里·佩奇（Larry Page）之姓来命名。Google用它来体现网页的相关性和重要性，在搜索引擎优化操作中是经常被用来评估网页优化的成效因素之一。Google的创始人拉里·佩奇和谢尔盖·布林于1998年在斯坦福大学发明了这项技术。

ICLR2021放榜了！ 687篇入选34篇得满分！ 48篇orals，108篇spotlights，531篇poster

专知会员服务

24+阅读 · 2021年1月13日

近期必读的 NeurIPS2020 80多篇【图机器学习】相关论文

专知会员服务

54+阅读 · 2020年11月3日

一份简单《图神经网络》教程，28页ppt

专知会员服务

126+阅读 · 2020年8月2日

【ICML2020】用于图结构化数据的卷积核网络，Convolutional Kernel Networks for Graph-Structured Data

专知会员服务

44+阅读 · 2020年6月29日