Artificial Intelligence (AI) is a new technical science that studies and develops theories, methods, technologies, and application systems for simulating, extending, and expanding human intelligence. Artificial intelligence is a branch of computer science.

    State-of-the-art deep learning methods have shown a remarkable capacity to model complex data domains, but struggle with geospatial data. In this paper, we introduce SpaceGAN, a novel generative model for geospatial domains that learns neighbourhood structures through spatial conditioning. We propose to enhance spatial representation beyond mere spatial coordinates by conditioning each data point on the feature vectors of its spatial neighbours, thus allowing for a more flexible representation of the spatial structure. To overcome issues of training convergence, we employ a metric capturing the loss in local spatial autocorrelation between real and generated data as a stopping criterion for SpaceGAN parametrization. This way, we ensure that the generator produces synthetic samples faithful to the spatial patterns observed in the input. We successfully apply SpaceGAN to data augmentation, where it outperforms other methods of synthetic spatial data generation. Finally, we propose an ensemble learning framework for the geospatial domain, taking augmented SpaceGAN samples as training data for a set of ensemble learners. We empirically show the superiority of this approach over conventional ensemble learning approaches and rival spatial data augmentation methods, using synthetic and real-world prediction tasks. Our findings suggest that SpaceGAN can be used as a tool for (1) artificially inflating sparse geospatial data and (2) improving the generalization of geospatial models.
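
    The abstract does not name the autocorrelation statistic used for stopping. A minimal sketch of such a criterion, assuming local Moran's I as the measure of local spatial autocorrelation and a precomputed spatial weights matrix `W` (both assumptions, not confirmed by the abstract), might look as follows:

```python
import numpy as np

def local_morans_i(x, W):
    """Local Moran's I for each observation in x under spatial weights W.
    W is assumed to be a row-standardized (n x n) neighbourhood matrix."""
    z = x - x.mean()
    m2 = (z ** 2).mean()
    return z * (W @ z) / m2

def spatial_autocorrelation_loss(x_real, x_fake, W):
    """Mean absolute gap in local autocorrelation between real and generated
    data: a candidate stopping metric in the spirit of the abstract."""
    return np.abs(local_morans_i(x_real, W) - local_morans_i(x_fake, W)).mean()

# During training one would checkpoint the generator and keep the checkpoint
# that minimizes spatial_autocorrelation_loss on held-out real data.
```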

    The paper studies how to release data about a critical infrastructure network (e.g., the power network or a transportation network) without disclosing sensitive information that can be exploited by malevolent agents, while preserving the realism of the network. It proposes a novel obfuscation mechanism that combines several privacy-preserving building blocks with a bi-level optimization model to significantly improve accuracy. The obfuscation is evaluated for both realism and privacy properties on real energy and transportation networks. Experimental results show the obfuscation mechanism substantially reduces the potential damage of an attack exploiting the released data to harm the real network.

    Over the past decade, artificial intelligence has demonstrated its efficiency in many different applications, and a huge number of algorithms have become central and ubiquitous in our lives. The growing interest in them is essentially based on their capability to synthesize and process large amounts of data, and to help humans make decisions in a world of increasing complexity. Yet, the effectiveness of algorithms in bringing ever more relevant recommendations to humans may start to compete with human-alone decisions based on values other than pure efficacy. Here, we examine this tension in light of the emergence of several forms of digital normativity, and analyze how this normative role of AI may influence the ability of humans to remain the subjects of their own lives. The advent of AI technology imposes a need to achieve a balance between concrete material progress and progress of the mind in order to avoid any form of servitude. It has become essential that an ethical reflection accompany the current developments of intelligent algorithms, beyond the sole question of their social acceptability. Such reflection should be anchored where AI technologies are being developed, as well as in educational programs where their implications can be explained.

    Recent progress in machine learning techniques has revived interest in building artificial general intelligence using these particular tools. There has been tremendous success in applying them to narrow intellectual tasks such as pattern recognition, natural language processing and playing Go; in the latter case, they have come to vastly outperform the strongest human players. However, these tasks are formalized by people in such ways that it has become "easy" for automated recipes to find better solutions than humans do. In the sense of John Searle's Chinese Room Argument, the computer playing Go does not actually understand anything about the game. Thinking like a human mind requires going beyond the curve-fitting paradigm of current systems. There is a fundamental limit to what such systems can currently achieve, as only very specific problem formalizations can increase their performance on particular tasks. In this paper, we argue that one of the most important aspects of the human mind is its capacity for logical thinking, which gives rise to many intellectual expressions that differentiate us from animal brains. We propose to model the emergence of logical thinking based on Piaget's theory of cognitive development.

    Deep neural networks have obtained astounding successes for important pattern recognition tasks, but they suffer from high computational complexity and the lack of interpretability. The recent Tsetlin Machine (TM) attempts to address this lack by using easy-to-interpret conjunctive clauses in propositional logic to solve complex pattern recognition problems. The TM provides competitive accuracy in several benchmarks, while keeping the important property of interpretability. It further facilitates hardware-near implementation since inputs, patterns, and outputs are expressed as bits, while recognition and learning rely on straightforward bit manipulation. In this paper, we exploit the TM paradigm by introducing the Convolutional Tsetlin Machine (CTM) as an interpretable alternative to convolutional neural networks (CNNs). Whereas the TM categorizes an image by applying each clause once to the whole image, the CTM uses each clause as a convolution filter. That is, a clause is evaluated multiple times, once per image patch taking part in the convolution. To make the clauses location-aware, each patch is further augmented with its coordinates within the image. The output of a convolution clause is obtained simply by ORing the outcomes of evaluating the clause on each patch. In the learning phase of the TM, clauses that evaluate to 1 are contrasted against the input. For the CTM, we instead contrast against one of the patches, randomly selected among the patches that made the clause evaluate to 1. Accordingly, the standard Type I and Type II feedback of the classic TM can be employed directly, without further modification. The CTM obtains a peak test accuracy of 99.51% on MNIST, 96.21% on Kuzushiji-MNIST, 89.56% on Fashion-MNIST, and 100.0% on the 2D Noisy XOR Problem, which is competitive with results reported for simple 4-layer CNNs, BinaryConnect, and a recent FPGA-accelerated Binary CNN.
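
    As a rough illustration of the convolution described above, the following sketch evaluates one conjunctive clause over every patch of a binary image, augments each patch with its coordinates, and ORs the outcomes. The clause encoding (index lists of positive and negated literals) and the plain binary coordinate encoding are illustrative assumptions:

```python
import numpy as np

def eval_clause(literals_pos, literals_neg, x):
    """Conjunction over the binary vector x: every included positive literal
    must be 1 and every included negated literal must be 0."""
    return bool(np.all(x[literals_pos] == 1) and np.all(x[literals_neg] == 0))

def conv_clause_output(literals_pos, literals_neg, image, patch_size):
    """OR of the clause over all patches, each augmented with its coordinates."""
    h, w = image.shape
    ph, pw = patch_size
    for r in range(h - ph + 1):
        for c in range(w - pw + 1):
            patch = image[r:r + ph, c:c + pw].ravel()
            # Append binary-coded patch coordinates to make the clause
            # location-aware; the exact coordinate encoding is an assumption.
            coords = np.array([int(b) for b in f"{r:08b}{c:08b}"])
            x = np.concatenate([patch, coords])
            if eval_clause(literals_pos, literals_neg, x):
                return 1  # ORing: one matching patch suffices
    return 0
```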

    Reinforcement learning is an appropriate and successful method to robustly perform low-level robot control under noisy conditions. Symbolic action planning is useful to resolve causal dependencies and to break a causally complex problem down into a sequence of simpler high-level actions. A problem with the integration of both approaches is that action planning is based on discrete high-level action and state spaces, whereas reinforcement learning is usually driven by a continuous reward function. However, recent advances in reinforcement learning, specifically universal value function approximators and hindsight experience replay, have focused on goal-independent methods based on sparse rewards. In this article, we build on these novel methods to facilitate the integration of action planning with reinforcement learning by exploiting reward sparsity as a bridge between the high-level and low-level state and control spaces. As a result, we demonstrate that the integrated neuro-symbolic method is able to solve object manipulation problems that involve tool use and non-trivial causal dependencies under noisy conditions, exploiting both data and knowledge.
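
    A minimal sketch of the sparse-reward bridge, assuming a hypothetical goal predicate `holds` that checks a symbolic subgoal against the low-level state (the predicate and plan format are illustrative, not the paper's interface):

```python
def sparse_reward(state, symbolic_goal, holds):
    """Reward is 1 only when the high-level goal predicate holds in the
    low-level state; the sparsity is what links the two levels."""
    return 1.0 if holds(symbolic_goal, state) else 0.0

# e.g. plan = ["grasp(tool)", "move(tool, lever)", "push(lever)"]
# For each subgoal in the plan, a goal-conditioned policy pi(a | s, subgoal)
# is trained with sparse_reward, using hindsight experience replay to cope
# with the sparsity.
```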

    We focus on the important problem of emergency evacuation, which could clearly benefit from reinforcement learning but has been largely unaddressed by it. Emergency evacuation is a complex task that is difficult to solve with reinforcement learning, since an emergency situation is highly dynamic, with many changing variables and complex constraints that make it difficult to train on. In this paper, we propose the first fire evacuation environment for training reinforcement learning agents for evacuation planning. The environment is modelled as a graph capturing the building structure. It includes realistic features like fire spread, uncertainty and bottlenecks. We have implemented the environment in the OpenAI gym format, to facilitate future research. We also propose a new reinforcement learning approach that entails pretraining the network weights of a DQN-based agent to incorporate information on the shortest path to the exit. We achieve this by using tabular Q-learning to learn the shortest path on the building model's graph. This information is transferred to the network by deliberately overfitting it on the Q-matrix. Then, the pretrained DQN model is trained on the fire evacuation environment to generate the optimal evacuation path under time-varying conditions. We compare the proposed approach with state-of-the-art reinforcement learning algorithms like PPO, VPG, SARSA, A2C and ACKTR. The results show that our method outperforms state-of-the-art models, including the original DQN-based models, by a huge margin. Finally, we test our model on a large and complex real building consisting of 91 rooms, with the possibility to move from any room to any other, giving 8281 actions. We use an attention-based mechanism to deal with the large action space. Our model achieves near-optimal performance on this real-world emergency environment.
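
    A minimal sketch of the pretraining idea: a tabular pass over the building graph produces a Q-matrix, and a small network is deliberately overfitted on it by regression. The value-iteration-style sweeps (in place of episodic tabular Q-learning), the rewards, the network size, and the hyperparameters are all illustrative guesses, not the paper's settings:

```python
import numpy as np
import torch
import torch.nn as nn

def tabular_q_shortest_path(adj, exit_node, gamma=0.95, sweeps=500):
    """Q-matrix encoding shortest paths to the exit; adj[s] lists the rooms
    reachable from room s, and an action means moving to that room."""
    n = len(adj)
    Q = np.zeros((n, n))
    for _ in range(sweeps):
        for s in range(n):
            for a in adj[s]:
                r = 1.0 if a == exit_node else -0.1  # illustrative rewards
                Q[s, a] = r + gamma * max(Q[a, a2] for a2 in adj[a])
    return Q

def pretrain_dqn(Q, epochs=2000):
    """Overfit a small MLP on the Q-matrix (one-hot room -> row of Q)."""
    n = Q.shape[0]
    net = nn.Sequential(nn.Linear(n, 128), nn.ReLU(), nn.Linear(128, n))
    opt = torch.optim.Adam(net.parameters(), lr=1e-3)
    X = torch.eye(n)                                  # one-hot state encoding
    y = torch.tensor(Q, dtype=torch.float32)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(net(X), y)
        loss.backward()
        opt.step()
    return net   # then fine-tune this network on the evacuation environment
```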

    A common strategy for dealing with the expensive reinforcement learning (RL) of complex tasks is to decompose them into a collection of subtasks that are usually simpler to learn as well as reusable for new problems. However, when a robot learns the policies for these subtasks, common approaches treat every policy learning process separately. Therefore, all these individual (composable) policies need to be learned before tackling the learning process of the complex task through policy composition. Such composition of individual policies is usually performed sequentially, which is not suitable for tasks that require performing the subtasks concurrently. In this paper, we propose to combine a set of composable Gaussian policies corresponding to these subtasks using a set of activation vectors, resulting in a complex Gaussian policy that is a function of the means and covariance matrices of the composable policies. Moreover, we propose an algorithm for learning both compound and composable policies within the same learning process by exploiting the off-policy data generated from the compound policy. The algorithm is built on a maximum entropy RL approach to favor exploration during the learning process. The experimental results show that the experience collected with the compound policy permits not only solving the complex task but also obtaining useful composable policies that successfully perform their respective subtasks. Supplementary videos and code are available at https://sites.google.com/view/hrl-concurrent-discovery .
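
    As a sketch of how Gaussian policies can be combined with activation weights, the following uses a precision-weighted product of Gaussians, a standard fusion rule; the paper's exact composition may differ:

```python
import numpy as np

def compose_gaussian_policies(mus, covs, activations):
    """Compound Gaussian from per-subtask (mu_k, Sigma_k) and activations a_k.
    Precision-weighted fusion is an illustrative choice, not the paper's rule."""
    precisions = [a * np.linalg.inv(S) for a, S in zip(activations, covs)]
    cov = np.linalg.inv(np.sum(precisions, axis=0))
    mu = cov @ np.sum([P @ m for P, m in zip(precisions, mus)], axis=0)
    return mu, cov

# e.g. two subtask policies over a 2-D action space:
# mu, cov = compose_gaussian_policies(
#     [np.zeros(2), np.ones(2)],
#     [np.eye(2), 0.5 * np.eye(2)],
#     activations=[0.3, 0.7])
```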

    This paper demonstrates a novel method for separately estimating aleatoric risk and epistemic uncertainty in deep reinforcement learning. Aleatoric risk, which arises from inherently stochastic environments or agents, must be accounted for in the design of risk-sensitive algorithms. Epistemic uncertainty, which stems from limited data, is important both for risk-sensitivity and for efficiently exploring an environment. We first present a Bayesian framework for learning the return distribution in reinforcement learning, which provides theoretical foundations for quantifying both types of uncertainty. Based on this framework, we show that the disagreement between only two neural networks is sufficient to produce a low-variance estimate of the epistemic uncertainty on the return distribution, thus providing a simple and computationally cheap uncertainty metric. We present experiments that illustrate our method and some of its applications.
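
    A minimal sketch of the two-network disagreement estimate, assuming each network predicts the mean and variance of the return distribution; the architecture and the squared-difference disagreement measure are illustrative assumptions:

```python
import torch
import torch.nn as nn

class ReturnNet(nn.Module):
    """Predicts a return distribution for a state (mean and log-variance)."""
    def __init__(self, state_dim):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                  nn.Linear(64, 2))

    def forward(self, s):
        mean, log_var = self.body(s).unbind(-1)
        return mean, log_var.exp()   # predicted variance ~ aleatoric risk

def epistemic_uncertainty(net_a, net_b, s):
    """Disagreement between the two networks' predicted return means, used
    here as a cheap epistemic-uncertainty metric (illustrative choice)."""
    (m_a, _), (m_b, _) = net_a(s), net_b(s)
    return (m_a - m_b).pow(2)
```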

    Automatic generation of level maps is a popular form of automatic content generation. In this study, a recently developed technique employing the "do what's possible" representation is used to create open-ended level maps. Generation of a map can continue indefinitely, yielding a highly scalable representation. A parameter study is performed to find good parameters for the evolutionary algorithm used to locate high-quality map generators. Variations on the technique are presented, demonstrating its versatility, and an algorithmic variant is given that both improves performance and changes the character of the maps located. The ability of the generator to adapt maps to different regions that the map is permitted to occupy is also tested.

    Continuous queries over data streams may suffer from blocking operations and/or unbounded waits, which can delay answers until some relevant input arrives through the data stream. These delays may make answers obsolete by the time they arrive, for users who sometimes have to make decisions with no help whatsoever. Therefore, it can be useful to provide hypothetical answers - "given the current information, it is possible that X will become true at time t" - instead of no information at all. In this paper we present a semantics for queries and corresponding answers that covers such hypothetical answers, together with an online algorithm for updating the set of facts that are consistent with the currently available information.
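
    As a rough illustration only (the paper's formal semantics is richer), a hypothetical answer can be kept open together with the facts still required to confirm it, and updated as stream facts arrive; the representation below is an assumption, not the paper's algorithm:

```python
class HypotheticalAnswers:
    """Maintains hypothetical answers consistent with the facts seen so far."""
    def __init__(self):
        self.open = []   # (answer, time t, set of facts still required)

    def add(self, answer, t, required_facts):
        self.open.append((answer, t, set(required_facts)))

    def on_fact(self, fact, now):
        """Update the hypothesis set when a new fact arrives from the stream."""
        confirmed, still_open = [], []
        for answer, t, pending in self.open:
            pending.discard(fact)
            if not pending:
                confirmed.append((answer, t))       # hypothesis became certain
            elif t >= now:                          # still possible at time t
                still_open.append((answer, t, pending))
            # hypotheses whose time has passed without support are dropped
        self.open = still_open
        return confirmed
```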

    Auctions are a common paradigm for resource allocation, a fundamental problem in human society. Existing research indicates that the two primary objectives, the seller's revenue and the allocation efficiency, are generally conflicting in auction design. For the first time, we expand the domain of the classic auction to a social graph and formally identify a new class of auction mechanisms on graphs. All mechanisms in this class are incentive-compatible and also encourage all buyers to diffuse the auction information to others, whereby both the seller's revenue and the allocation efficiency are significantly improved compared with the Vickrey auction. We find that the recently proposed information diffusion mechanism is an extreme case of this new class, with the lowest revenue. Our work could potentially inspire a new perspective on efficient and optimal auction design and could be applied to prevalent online social and economic networks.

    We propose a new objective, the counterfactual objective, unifying existing objectives for off-policy policy gradient algorithms in the continuing reinforcement learning (RL) setting. Compared to the commonly used excursion objective, which can be misleading about the performance of the target policy when deployed, our new objective better predicts such performance. We prove the Generalized Off-Policy Policy Gradient Theorem to compute the policy gradient of the counterfactual objective and use an emphatic approach to get an unbiased sample from this policy gradient, yielding the Generalized Off-Policy Actor-Critic (Geoff-PAC) algorithm. We demonstrate the merits of Geoff-PAC over existing algorithms in Mujoco robot simulation tasks, the first empirical success of emphatic algorithms in prevailing deep RL benchmarks.
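
    For orientation, a hedged sketch of the objectives involved in the continuing setting, where $d_\mu$ and $d_\pi$ denote the stationary state distributions under the behaviour policy $\mu$ and the target policy $\pi$, and $\bar r_\pi(s) = \sum_a \pi(a|s)\, r(s,a)$; the exact form of the counterfactual objective is in the paper and only summarized here:

```latex
\begin{align*}
  J_{\text{excursion}}(\pi) &= \sum_{s} d_{\mu}(s)\, v_{\pi}(s)
    && \text{(evaluates $\pi$ from states visited by $\mu$)} \\
  J_{\text{alternative}}(\pi) &= \sum_{s} d_{\pi}(s)\, \bar r_{\pi}(s)
    && \text{(performance of $\pi$ when actually deployed)}
\end{align*}
% The counterfactual objective introduces a parameter \hat\gamma \in [0,1]
% that interpolates between these extremes, recovering the excursion
% objective at \hat\gamma = 0 and better predicting deployed performance
% as \hat\gamma approaches 1.
```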

    Predictive process monitoring is a family of techniques for analyzing events produced during the execution of a business process in order to predict the future state or the final outcome of running process instances. Existing techniques in this field are able to predict, at each step of a process instance, the likelihood that it will lead to an undesired outcome. These techniques, however, focus on generating predictions and do not prescribe when and how process workers should intervene to decrease the cost of undesired outcomes. This paper proposes a framework for prescriptive process monitoring, which extends predictive monitoring with the ability to generate alarms that trigger interventions to prevent an undesired outcome or mitigate its effect. The framework incorporates a parameterized cost model to assess the cost-benefit trade-off of generating alarms. We show how to optimize the generation of alarms given an event log of past process executions and a set of cost model parameters. The proposed approaches are empirically evaluated using a range of real-life event logs. The experimental results show that the net cost of undesired outcomes can be minimized by adjusting the threshold for generating alarms as the process instance progresses. Moreover, introducing delays for triggering alarms, instead of triggering them as soon as the probability of an undesired outcome exceeds a threshold, leads to lower net costs.
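
    A minimal sketch of the cost trade-off, assuming a simple cost model with an intervention cost, an undesired-outcome cost, and a mitigation factor; the parameter names and the empirical threshold search are illustrative, not the paper's exact model:

```python
def net_cost(p_undesired, alarm, c_out, c_in, mitigation):
    """Expected net cost of a case given whether an alarm is raised.
    An alarm always incurs c_in and mitigates a fraction of c_out."""
    if alarm:
        return c_in + p_undesired * (1 - mitigation) * c_out
    return p_undesired * c_out

def best_threshold(thresholds, cases, c_out, c_in, mitigation):
    """Pick the alarm threshold minimizing empirical net cost on an event log.
    `cases` is a list of predicted undesired-outcome probabilities."""
    def total(tau):
        return sum(net_cost(p, p >= tau, c_out, c_in, mitigation)
                   for p in cases)
    return min(thresholds, key=total)

# e.g. best_threshold([i / 100 for i in range(101)],
#                     cases=[0.1, 0.4, 0.8],
#                     c_out=100.0, c_in=10.0, mitigation=0.7)
```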

    In this work we describe a new learning-based proof guidance -- ENIGMAWatch -- for saturation-style first-order theorem provers. ENIGMAWatch combines two guiding approaches for given-clause selection implemented for the E ATP system: ProofWatch and ENIGMA. ProofWatch is motivated by the watchlist (hints) method and based on symbolic matching of multiple related proofs, while ENIGMA is based on statistical machine learning. The two methods are combined by using the evolving information about symbolic proof matching as additional information that characterizes the saturation-style proof search for the statistical learning methods. The new system is experimentally evaluated on a large set of problems from the Mizar Library. We show that the added proof-matching information is considered important by the statistical machine learners, and that it leads to improvements in E's performance over ProofWatch and ENIGMA.
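
    A minimal sketch of the feature combination, assuming ENIGMA-style statistical clause features are simply concatenated with a ProofWatch-style vector of watchlist completion ratios; the names and shapes are illustrative, not the systems' actual interfaces:

```python
import numpy as np

def clause_features(enigma_features, watchlist_matched, watchlist_sizes):
    """Extend statistical clause features with proof-matching progress:
    one completion ratio per watched proof, reflecting how much of that
    proof the current saturation search has already matched."""
    progress = np.array(watchlist_matched) / np.array(watchlist_sizes)
    return np.concatenate([enigma_features, progress])

# The combined vector is what a learned given-clause classifier would score:
# score = model.predict(clause_features(f, matched, sizes))
```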
