Dynamic Information Sharing and Punishment Strategies

In this paper we study information sharing among rational, self-interested agents as a dynamic game of asymmetric information. The agents imperfectly observe a Markov chain, and at each time instant each agent decides whether or not to share its noisy observation. We use conditional mutual information to quantify the information shared among the agents. We exhibit the challenges that arise from the interdependence between the agents' information structure and their decision-making. For the finite-horizon game, we prove that agents have no incentive to share information. In contrast, we show that cooperation can be sustained in the infinite-horizon game by devising appropriate punishment strategies, which are defined over the agents' beliefs about the system state. We show that these strategies are closed under the best-response mapping and that cooperation can be the optimal choice in some subsets of the state belief simplex. We characterize these equilibrium regions, prove the uniqueness of a maximal equilibrium region, and devise an algorithm for its approximate computation.
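To make the role of conditional mutual information concrete, the following is a minimal sketch and not the paper's own formulation, which the abstract does not spell out. It assumes a finite state space, a belief vector on the state simplex, and two agents whose observations are conditionally independent given the state. The names `predict`, `update`, `conditional_mutual_information`, `P`, `Q1`, and `Q2` are hypothetical and introduced only for illustration. The sketch computes I(X; Y2 | Y1): how much agent 2's observation would tell agent 1 about the state beyond what agent 1 already observes.

```python
import math


def predict(belief, P):
    """One-step Markov prediction: next[j] = sum_i belief[i] * P[i][j]."""
    n = len(belief)
    return [sum(belief[i] * P[i][j] for i in range(n)) for j in range(n)]


def update(belief, likelihoods):
    """Bayes update of a belief given per-state observation likelihoods."""
    unnorm = [b * l for b, l in zip(belief, likelihoods)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def conditional_mutual_information(belief, Q1, Q2):
    """I(X; Y2 | Y1) in bits, for X ~ belief.

    Assumption (not from the source): Y1 and Y2 are independent given X,
    with Qk[x][y] = P(Yk = y | X = x).
    """
    n = len(belief)
    cmi = 0.0
    for y1 in range(len(Q1[0])):
        p_y1 = sum(belief[x] * Q1[x][y1] for x in range(n))
        if p_y1 == 0.0:
            continue  # this observation never occurs under the belief
        # Agent 1's posterior over the state after seeing only y1.
        post = update(belief, [Q1[x][y1] for x in range(n)])
        for y2 in range(len(Q2[0])):
            p_y2_given_y1 = sum(post[x] * Q2[x][y2] for x in range(n))
            for x in range(n):
                p_joint = post[x] * Q2[x][y2]  # P(x, y2 | y1)
                if p_joint > 0.0:
                    # Under conditional independence, P(y2 | x, y1) = Q2[x][y2].
                    cmi += p_y1 * p_joint * math.log2(Q2[x][y2] / p_y2_given_y1)
    return cmi


if __name__ == "__main__":
    P = [[0.9, 0.1], [0.2, 0.8]]    # hypothetical transition matrix
    Q1 = [[0.7, 0.3], [0.3, 0.7]]   # agent 1's noisy observation channel
    Q2 = [[0.8, 0.2], [0.2, 0.8]]   # agent 2's noisy observation channel
    prior = predict([0.5, 0.5], P)  # belief before the new observations
    print(conditional_mutual_information(prior, Q1, Q2))
```

In this sketch the value of sharing depends on the current belief: at a vertex of the simplex, where the state is already known, the quantity is zero, which is consistent with the abstract's statement that the strategies are defined over regions of the belief simplex.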
