BanditMF: Multi-Armed Bandit Based Matrix Factorization Recommender System

Multi-armed bandits (MAB) provide a principled online learning approach to balancing exploration and exploitation. Owing to their strong performance and their ability to learn from limited feedback without having to learn to act in multiple situations, multi-armed bandits have drawn widespread attention in applications such as recommender systems. Within recommender systems, collaborative filtering (CF) is arguably the earliest and most influential method. New users and an ever-changing pool of recommended items are challenges that every recommender system must address. The classical approach to collaborative filtering is to train the model offline and then test it online, but this approach cannot keep up with new users or with dynamic changes in user preferences, a situation known as \textit{cold start}. How, then, can items be recommended effectively when little useful information is available? To address these problems, a multi-armed bandit based collaborative filtering recommender system, named BanditMF, has been proposed. BanditMF is designed to address two challenges in multi-armed bandit algorithms and collaborative filtering: (1) how to solve the cold start problem for collaborative filtering when valid information is scarce, and (2) how to solve the sub-optimality of bandit algorithms in domains with strong social relations, which arises from estimating the unknown parameters of each user independently and ignoring correlations between users.
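
As a minimal illustration of the exploration-exploitation balance mentioned in the abstract, the sketch below runs the standard UCB1 rule on simulated Bernoulli arms, where each arm can be read as an item and each reward as a click. It is not the BanditMF algorithm: the function name `ucb1`, the arm means, the seed, and the round count are all assumptions chosen for illustration.

```python
import math
import random


def ucb1(true_means, rounds, seed=0):
    """Run UCB1 on simulated Bernoulli arms and return pull counts per arm.

    true_means is a hypothetical list of click probabilities, one per item.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms    # how many times each arm (item) was recommended
    totals = [0.0] * n_arms  # summed reward (e.g. clicks) observed per arm

    for t in range(1, rounds + 1):
        if t <= n_arms:
            # Pull every arm once so each empirical mean is defined.
            arm = t - 1
        else:
            # Empirical mean (exploitation) plus a confidence bonus
            # (exploration) that shrinks as an arm is pulled more often.
            scores = [
                totals[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a])
                for a in range(n_arms)
            ]
            arm = max(range(n_arms), key=scores.__getitem__)

        # Simulated Bernoulli feedback for the chosen arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward

    return counts


# Assumed toy setting: three items with click rates 0.2, 0.5 and 0.8.
counts = ucb1(true_means=[0.2, 0.5, 0.8], rounds=2000)
```

Note that this sketch treats a single user in isolation. Estimating each user separately in this way is exactly the independent, per-user estimation that the abstract identifies as sub-optimal when users are correlated.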
