In Reinforcement Learning we assume that a model of the world exists, and furthermore that this model is perfect (i.e., it describes the world completely and unambiguously). This article demonstrates that searching for the perfect model makes no sense, because such a model is too complicated and practically impossible to find. We show that we should abandon the pursuit of perfection and pursue Event-Driven (ED) models instead. These models are a generalization of Markov Decision Process (MDP) models. This generalization is essential because without it nothing can be found. Rather than a single MDP, we aim to find a raft of neat, simple ED models, each one describing a simple dependency or property. In other words, we replace the search for a single, complex, perfect model with a search for a large number of simple models.