The Elo rating system has been used world wide for individual sports and team sports, as exemplified by the European Go Federation (EGF), International Chess Federation (FIDE), International Federation of Association Football (FIFA), and many others. To evaluate the performance of artificial intelligence agents, it is natural to evaluate them on the same Elo scale as humans, such as the rating of 5185 attributed to AlphaGo Zero. There are several fundamental differences between humans and AI that suggest modifications to the system, which in turn require revisiting Elo's fundamental rationale. AI is typically trained on many more games than humans play, and we have little a-priori information on newly created AI agents. Further, AI is being extended into games which are asymmetric between the players, and which could even have large complex boards with different setup in every game, such as commercial paper strategy games. We present a revised rating system, and guidelines for tournaments, to reflect these differences.

### 相关内容

Machine-generated artworks are now part of the contemporary art scene: they are attracting significant investments and they are presented in exhibitions together with those created by human artists. These artworks are mainly based on generative deep learning techniques. Also given their success, several legal problems arise when working with these techniques. In this article we consider a set of key questions in the area of generative deep learning for the arts. Is it possible to use copyrighted works as training set for generative models? How do we legally store their copies in order to perform the training process? And then, who (if someone) will own the copyright on the generated data? We try to answer these questions considering the law in force in both US and EU and the future alternatives, trying to define a set of guidelines for artists and developers working on deep learning generated art.

This paper provides a novel approach to stitching surface images of rotationally symmetric parts. It presents a process pipeline that uses a feature-based stitching approach to create a distortion-free and true-to-life image from a video file. The developed process thus enables, for example, condition monitoring without having to view many individual images. For validation purposes, this will be demonstrated in the paper using the concrete example of a worn ball screw drive spindle. The developed algorithm aims at reproducing the functional principle of a line scan camera system, whereby the physical measuring systems are replaced by a feature-based approach. For evaluation of the stitching algorithms, metrics are used, some of which have only been developed in this work or have been supplemented by test procedures already in use. The applicability of the developed algorithm is not only limited to machine tool spindles. Instead, the developed method allows a general approach to the surface inspection of various rotationally symmetric components and can therefore be used in a variety of industrial applications. Deep-learning-based detection Algorithms can easily be implemented to generate a complete pipeline for failure detection and condition monitoring on rotationally symmetric parts.

Online gaming is growing faster than ever before, with increasing challenges of providing better user experience. Recommender systems (RS) for online games face unique challenges since they must fulfill players' distinct desires, at different user levels, based on their action sequences of various action types. Although many sequential RS already exist, they are mainly single-sequence, single-task, and single-user-level. In this paper, we introduce a new sequential recommendation model for multiple sequences, multiple tasks, and multiple user levels (abbreviated as M$^3$Rec) in Tencent Games platform, which can fully utilize complex data in online games. We leverage Graph Neural Network and multi-task learning to design M$^3$Rec in order to model the complex information in the heterogeneous sequential recommendation scenario of Tencent Games. We verify the effectiveness of M$^3$Rec on three online games of Tencent Games platform, in both offline and online evaluations. The results show that M$^3$Rec successfully addresses the challenges of recommendation in online games, and it generates superior recommendations compared with state-of-the-art sequential recommendation approaches.

This paper reviews the current state of the art in Artificial Intelligence (AI) technologies and applications in the context of the creative industries. A brief background of AI, and specifically Machine Learning (ML) algorithms, is provided including Convolutional Neural Network (CNNs), Generative Adversarial Networks (GANs), Recurrent Neural Networks (RNNs) and Deep Reinforcement Learning (DRL). We categorise creative applications into five groups related to how AI technologies are used: i) content creation, ii) information analysis, iii) content enhancement and post production workflows, iv) information extraction and enhancement, and v) data compression. We critically examine the successes and limitations of this rapidly advancing technology in each of these areas. We further differentiate between the use of AI as a creative tool and its potential as a creator in its own right. We foresee that, in the near future, machine learning-based AI will be adopted widely as a tool or collaborative assistant for creativity. In contrast, we observe that the successes of machine learning in domains with fewer constraints, where AI is the `creator', remain modest. The potential of AI (or its developers) to win awards for its original creations in competition with human creatives is also limited, based on contemporary technologies. We therefore conclude that, in the context of creative industries, maximum benefit from AI will be derived where its focus is human centric -- where it is designed to augment, rather than replace, human creativity.

In this paper, we introduce a novel Generic distributEd clustEring frameworK (GEEK) beyond $k$-means clustering to process massive amounts of data. To deal with different data types, GEEK first converts data in the original feature space into a unified format of buckets; then, we design a new Seeding method based on simILar bucKets (SILK) to determine initial seeds. Compared with state-of-the-art seeding methods such as $k$-means++ and its variants, SILK can automatically identify the number of initial seeds based on the closeness of shared data objects in similar buckets instead of pre-specifying $k$. Thus, its time complexity is independent of $k$. With these well-selected initial seeds, GEEK only needs a one-pass data assignment to get the final clusters. We implement GEEK on a distributed CPU-GPU platform for large-scale clustering. We evaluate the performance of GEEK over five large-scale real-life datasets and show that GEEK can deal with massive data of different types and is comparable to (or even better than) many state-of-the-art customized GPU-based methods, especially in large $k$ values.

Promoting behavioural diversity is critical for solving games with non-transitive dynamics where strategic cycles exist, and there is no consistent winner (e.g., Rock-Paper-Scissors). Yet, there is a lack of rigorous treatment for defining diversity and constructing diversity-aware learning dynamics. In this work, we offer a geometric interpretation of behavioural diversity in games and introduce a novel diversity metric based on \emph{determinantal point processes} (DPP). By incorporating the diversity metric into best-response dynamics, we develop \emph{diverse fictitious play} and \emph{diverse policy-space response oracle} for solving normal-form games and open-ended games. We prove the uniqueness of the diverse best response and the convergence of our algorithms on two-player games. Importantly, we show that maximising the DPP-based diversity metric guarantees to enlarge the \emph{gamescape} -- convex polytopes spanned by agents' mixtures of strategies. To validate our diversity-aware solvers, we test on tens of games that show strong non-transitivity. Results suggest that our methods achieve much lower exploitability than state-of-the-art solvers by finding effective and diverse strategies.

Pictures of everyday life are inherently multi-label in nature. Hence, multi-label classification is commonly used to analyze their content. In typical multi-label datasets, each picture contains only a few positive labels, and many negative ones. This positive-negative imbalance can result in under-emphasizing gradients from positive labels during training, leading to poor accuracy. In this paper, we introduce a novel asymmetric loss ("ASL"), that operates differently on positive and negative samples. The loss dynamically down-weights the importance of easy negative samples, causing the optimization process to focus more on the positive samples, and also enables to discard mislabeled negative samples. We demonstrate how ASL leads to a more "balanced" network, with increased average probabilities for positive samples, and show how this balanced network is translated to better mAP scores, compared to commonly used losses. Furthermore, we offer a method that can dynamically adjust the level of asymmetry throughout the training. With ASL, we reach new state-of-the-art results on three common multi-label datasets, including achieving 86.6% on MS-COCO. We also demonstrate ASL applicability for other tasks such as fine-grain single-label classification and object detection. ASL is effective, easy to implement, and does not increase the training time or complexity. Implementation is available at: https://github.com/Alibaba-MIIL/ASL.

User engagement is a critical metric for evaluating the quality of open-domain dialogue systems. Prior work has focused on conversation-level engagement by using heuristically constructed features such as the number of turns and the total time of the conversation. In this paper, we investigate the possibility and efficacy of estimating utterance-level engagement and define a novel metric, {\em predictive engagement}, for automatic evaluation of open-domain dialogue systems. Our experiments demonstrate that (1) human annotators have high agreement on assessing utterance-level engagement scores; (2) conversation-level engagement scores can be predicted from properly aggregated utterance-level engagement scores. Furthermore, we show that the utterance-level engagement scores can be learned from data. These scores can improve automatic evaluation metrics for open-domain dialogue systems, as shown by correlation with human judgements. This suggests that predictive engagement can be used as a real-time feedback for training better dialogue models.

BERT-based architectures currently give state-of-the-art performance on many NLP tasks, but little is known about the exact mechanisms that contribute to its success. In the current work, we focus on the interpretation of self-attention, which is one of the fundamental underlying components of BERT. Using a subset of GLUE tasks and a set of handcrafted features-of-interest, we propose the methodology and carry out a qualitative and quantitative analysis of the information encoded by the individual BERT's heads. Our findings suggest that there is a limited set of attention patterns that are repeated across different heads, indicating the overall model overparametrization. While different heads consistently use the same attention patterns, they have varying impact on performance across different tasks. We show that manually disabling attention in certain heads leads to a performance improvement over the regular fine-tuned BERT models.

Recommender systems play a crucial role in mitigating the problem of information overload by suggesting users' personalized items or services. The vast majority of traditional recommender systems consider the recommendation procedure as a static process and make recommendations following a fixed strategy. In this paper, we propose a novel recommender system with the capability of continuously improving its strategies during the interactions with users. We model the sequential interactions between users and a recommender system as a Markov Decision Process (MDP) and leverage Reinforcement Learning (RL) to automatically learn the optimal strategies via recommending trial-and-error items and receiving reinforcements of these items from users' feedbacks. In particular, we introduce an online user-agent interacting environment simulator, which can pre-train and evaluate model parameters offline before applying the model online. Moreover, we validate the importance of list-wise recommendations during the interactions between users and agent, and develop a novel approach to incorporate them into the proposed framework LIRD for list-wide recommendations. The experimental results based on a real-world e-commerce dataset demonstrate the effectiveness of the proposed framework.

Giorgio Franceschelli,Mirco Musolesi
0+阅读 · 6月21日
Tobias Schlagenhauf,Tim Brander,Juergen Fleischer
0+阅读 · 6月21日
Nantheera Anantrasirichai,David Bull
1+阅读 · 6月19日
Pingyi Luo,Qiang Huang,Anthony K. H. Tung
0+阅读 · 6月19日
Nicolas Perez Nieves,Yaodong Yang,Oliver Slumbers,David Henry Mguni,Jun Wang
5+阅读 · 3月14日
Emanuel Ben-Baruch,Tal Ridnik,Nadav Zamir,Asaf Noy,Itamar Friedman,Matan Protter,Lihi Zelnik-Manor
6+阅读 · 2020年9月29日
Sarik Ghazarian,Ralph Weischedel,Aram Galstyan,Nanyun Peng
11+阅读 · 2019年11月4日
Olga Kovaleva,Alexey Romanov,Anna Rogers,Anna Rumshisky
4+阅读 · 2019年9月11日
Xiangyu Zhao,Liang Zhang,Zhuoye Ding,Dawei Yin,Yihong Zhao,Jiliang Tang
12+阅读 · 2018年1月5日

78+阅读 · 2020年8月7日

38+阅读 · 2020年7月26日

56+阅读 · 2020年5月19日

59+阅读 · 2020年5月15日

24+阅读 · 2019年10月17日

62+阅读 · 2019年10月11日

5+阅读 · 2019年3月22日
CreateAMind
28+阅读 · 2019年1月3日

10+阅读 · 2018年12月24日
LibRec智能推荐
11+阅读 · 2018年11月29日
LibRec智能推荐
3+阅读 · 2018年8月9日
Call4Papers
4+阅读 · 2018年3月13日

4+阅读 · 2018年1月30日

4+阅读 · 2017年11月28日
CreateAMind
5+阅读 · 2017年8月4日
Top