The inference of novel knowledge, the discovery of hidden patterns, and the uncovering of insights from large amounts of data drawn from a multitude of sources make Data Science (DS) an art rather than a mere scientific discipline. The study and design of mathematical models able to analyze information is a central research topic in DS. In this work, we introduce and investigate a novel model for influence maximization (IM) on graphs using ideas from kernel-based approximation, Gaussian process regression, and the minimization of a corresponding variance term. Data-driven approaches can be applied to determine suitable kernels for this IM model, and machine learning methodologies are adopted to tune the model parameters. Compared with stochastic models in this field that rely on costly Monte Carlo simulations, our model allows for a simple and cost-efficient update strategy to compute optimal influencing nodes on a graph. In several numerical experiments, we demonstrate the properties and benefits of this new model.
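To make the idea of variance minimization with a cheap update concrete, the following is a minimal sketch and not the model's actual formulation. It assumes a graph diffusion kernel K = exp(-tL) built from the graph Laplacian L, and a greedy rule that picks, at each step, the node whose observation most reduces the total posterior variance of a Gaussian process on the nodes. The function names `diffusion_kernel` and `greedy_variance_minimization`, the parameters `t` and `noise`, and the choice of kernel and selection criterion are all illustrative assumptions.

```python
import numpy as np


def diffusion_kernel(adjacency, t=1.0):
    """Assumed kernel: graph diffusion kernel K = exp(-t L), L = D - A."""
    laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    # Apply exp(-t * lambda) to the spectrum of the symmetric Laplacian.
    return (eigvecs * np.exp(-t * eigvals)) @ eigvecs.T


def greedy_variance_minimization(kernel, budget, noise=1e-8):
    """Greedily select `budget` nodes that reduce total GP posterior variance.

    Returns the selected node indices and the total variance (trace of the
    posterior covariance) before any selection and after each selection.
    """
    n = kernel.shape[0]
    cov = kernel.copy()                 # posterior covariance, initially the prior
    available = np.ones(n, dtype=bool)  # nodes not yet selected
    selected = []
    total_variance = [float(np.trace(cov))]

    for _ in range(budget):
        # Observing node j lowers the trace by ||cov[:, j]||^2 / (cov[j, j] + noise).
        gains = (cov ** 2).sum(axis=0) / (np.diag(cov) + noise)
        gains = np.where(available, gains, -np.inf)
        j = int(np.argmax(gains))

        # Rank-one covariance update: no simulation and no full re-solve needed.
        column = cov[:, j].copy()
        denom = cov[j, j] + noise
        cov -= np.outer(column, column) / denom

        available[j] = False
        selected.append(j)
        total_variance.append(float(np.trace(cov)))

    return selected, total_variance
```

In this sketch, each selection step costs one rank-one update of the covariance matrix, which illustrates why a variance-based criterion can be cheaper than repeated Monte Carlo simulation of a stochastic diffusion process. Which nodes are selected depends strongly on the kernel and on `t`, so the output should be read as a property of these assumptions rather than of the model described above.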