Chapter 2: Linear Algebra
Chapter 3: Probability Theory
Chapter 4: Probability Distributions
Chapter 5: Convex Optimization

Part Two: Bayesian Estimation

Chapter 6: Learning from Data
Chapter 7: Markov Chain Monte Carlo

Part Three: Supervised Learning

Chapter 8: Regression
Chapter 9: Classification

Part Four: Unsupervised Learning

Chapter 10: Clustering
Chapter 11: Bayesian Networks
Chapter 12: State-Space Models
Chapter 13: Model Calibration

Part Five: Reinforcement Learning

Chapter 14: Decisions in Uncertain Contexts
Chapter 15: Sequential Decisions

Related content

https://www.springer.com/gp/book/9783030410674

http://mason.gmu.edu/~jgentle/books/MathStat.pdf

• Introduction
• Probability
• Generative models for discrete data
• Gaussian Models
• Bayesian statistics
• Frequentist statistics
• Linear Regression
• Logistic Regression
• Generalized linear models and the exponential family
• Directed graphical models (Bayes nets)
• Mixture models and the EM algorithm
• Latent linear models
• Sparse linear models
• Kernels
• Gaussian processes
• Hidden Markov models
• State space models
• Undirected graphical models (Markov random fields)
• Exact inference for graphical models
• Variational inference
• More variational inference
• Monte Carlo inference
• Markov chain Monte Carlo (MCMC) inference
• Clustering
• Graphical model structure learning
• Latent variable models for discrete data
• Deep learning

Introduction to Data Science and Machine Learning was created to give beginners seeking to understand data science, data enthusiasts, and experienced data professionals an end-to-end, in-depth understanding of developing data science applications with open-source programming. The book is divided into four parts: the first part introduces the book; the second covers data science, software development, and fields based on open-source embedded hardware; the third covers algorithms, the decision engines of data science applications; and the final part brings together the concepts shared in the first three parts and presents several example data science applications.


1. Introductory Chapter: Clustering with Nature-Inspired Optimization Algorithms. In this chapter, the reader will learn how to apply optimization algorithms to clustering problems.

By Pakize Erdogmus and Fatih Kayaalp

2. Best Practices in Accelerating the Data Science Process in Python

By Deanne Larson

3. Software Design for Success By Laura M. Castro

4. Embedded Systems Based on Open Source Platforms By Zlatko Bundalo and Dusanka Bundalo

5. The K-Means Algorithm Evolution By Joaquín Pérez-Ortega, Nelva Nely Almanza-Ortega, Andrea Vega-Villalobos, Rodolfo Pazos-Rangel, Crispín Zavala-Díaz and Alicia Martínez-Rebollar

6. “Set of Strings” Framework for Big Data Modeling By Igor Sheremet

7. Investigation of Fuzzy Inductive Modeling Method in Forecasting Problems By Yu. Zaychenko and Helen Zaychenko

8. Segmenting Images Using Hybridization of K-Means and Fuzzy C-Means Algorithms By Raja Kishor Duggirala

9. The Software to the Soft Target Assessment By Lucia Mrazkova Duricova, Martin Hromada and Jan Mrazek

10. The Methodological Standard to the Assessment of the Traffic Simulation in Real Time By Jan Mrazek, Martin Hromada and Lucia Duricova Mrazkova

11. Augmented Post Systems: Syntax, Semantics, and Applications By Igor Sheremet

12. Serialization in Object-Oriented Programming Languages By Konrad Grochowski, Michał Breiter and Robert Nowak

https://mp.weixin.qq.com/s/xrUw_4IPI4BhYwHvjSuwzA

• Part One: Tabular solution methods. Chapter 1 describes solutions to a concrete instance of the reinforcement learning problem, and Chapter 2 describes the general problem formulation used throughout the book: finite Markov decision processes, whose main ideas include the Bellman equation and value functions. Chapters 3, 4, and 5 introduce three fundamental classes of methods for solving finite Markov decision problems: dynamic programming, Monte Carlo methods, and temporal-difference learning, each with its own strengths and weaknesses. Chapters 6 and 7 show how these three classes of methods can be combined to get the best of each: Chapter 6 shows how eligibility traces can unify the advantages of Monte Carlo methods and temporal-difference learning, and Chapter 7 shows how temporal-difference learning can be combined with model learning and planning methods (such as dynamic programming) to obtain a complete and unified solution to the tabular reinforcement learning problem.
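The Bellman equation and value functions mentioned above can be made concrete with a small sketch. The following value-iteration loop solves a tiny two-state, two-action MDP; the transition probabilities and rewards are made up purely for illustration and do not come from the book.

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP (values invented for illustration):
# P[s, a, s'] is the transition probability, R[s, a] the expected reward.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.9  # discount factor

# Value iteration: repeatedly apply the Bellman optimality operator
#   V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s,a,s') V(s') ]
V = np.zeros(2)
for _ in range(1000):
    Q = R + gamma * P @ V          # Q[s, a]; P @ V sums over s'
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        break                      # reached the Bellman fixed point
    V = V_new

policy = Q.argmax(axis=1)          # greedy policy w.r.t. converged values
print(V, policy)
```

Dynamic programming methods like this require the full model (P and R); the Monte Carlo and temporal-difference methods of Chapters 5 and 6 instead learn from sampled experience.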

• Part Two: Approximate solution methods. To some extent these only require combining reinforcement learning methods with existing generalization methods, usually called function approximation. In principle, any method studied in those fields can be used as a function approximator in a reinforcement learning algorithm, although in practice some are better suited to reinforcement learning than others. Using function approximation in reinforcement learning raises new issues that rarely arise in conventional supervised learning, such as nonstationarity, bootstrapping, and delayed targets. The five chapters of this part introduce these and other issues in turn. The focus is first on on-policy training: in the prediction case of Chapter 9 the policy is given and only its value function is approximated, while in the control case of Chapter 10 an approximation to the optimal policy is found. Chapter 11 discusses the difficulties of off-policy learning with function approximation. Chapter 12 introduces and analyzes the algorithmic mechanism of eligibility traces, which can significantly improve the computational properties of multi-step reinforcement learning methods in many cases. The final chapter of this part explores a different approach to control, policy-gradient methods, which directly approximate the optimal policy and never need to form an approximate value function (although they can be much more efficient if one is used).
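On-policy prediction with function approximation and bootstrapping, as discussed above, can be sketched with semi-gradient TD(0) on a linear approximator. The 5-state random-walk environment below is a standard toy problem; the constants and the one-hot feature map are choices made here for illustration, not prescribed by the book.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 5-state random walk: from each state move left or right uniformly;
# falling off the right end yields reward 1, the left end reward 0.
N_STATES, GAMMA, ALPHA = 5, 1.0, 0.05

def features(s):
    # One-hot features for simplicity; any feature map could be used.
    x = np.zeros(N_STATES)
    x[s] = 1.0
    return x

w = np.zeros(N_STATES)  # weights of the linear value approximator

for _ in range(5000):              # episodes
    s = N_STATES // 2              # start in the middle
    while True:
        s2 = s + rng.choice([-1, 1])
        if s2 < 0:                 # left terminal
            r, done = 0.0, True
        elif s2 >= N_STATES:       # right terminal
            r, done = 1.0, True
        else:
            r, done = 0.0, False
        # Semi-gradient TD(0): the bootstrapped target r + gamma*v(s2, w)
        # is treated as a constant (hence "semi"-gradient).
        target = r + (0.0 if done else GAMMA * w @ features(s2))
        w += ALPHA * (target - w @ features(s)) * features(s)
        if done:
            break
        s = s2

print(np.round(w, 2))  # true state values are 1/6, 2/6, ..., 5/6
```

With one-hot features this reduces to tabular TD(0); the same update works unchanged with richer feature maps, which is where the nonstationarity and off-policy difficulties of Chapters 11 and 12 arise.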

• Part Three: Looking deeper. This part looks beyond the standard reinforcement learning ideas presented in the first two parts, briefly surveying their relationship to psychology and neuroscience, discussing a sampling of reinforcement learning applications, and outlining some of the active frontiers of future reinforcement learning research.

• Chapter 1: Introductory Chapter: Timeliness of Advantages of Bayesian Networks By Douglas S. McNair
• Chapter 2: An Economic Growth Model Using Hierarchical Bayesian Method By Nur Iriawan and Septia Devi Prihastuti Yasmirullah
• Chapter 3: Bayesian Networks for Decision-Making and Causal Analysis under Uncertainty in Aviation
• Chapter 4: Using Bayesian Networks for Risk Assessment in Healthcare System
• Chapter 5: Continuous Learning of the Structure of Bayesian Networks: A Mapping Study
• Chapter 6: Multimodal Bayesian Network for Artificial Perception
• Chapter 7: Quantitative Structure-Activity Relationship Modeling and Bayesian Networks: Optimality of Naive Bayes Model
• Chapter 8: Bayesian Graphical Model Application for Monetary Policy and Macroeconomic Performance in Nigeria

Contents

Part I: Mathematical Foundations

• Introduction and Motivation
• Linear Algebra
• Analytic Geometry
• Matrix Decompositions
• Vector Calculus
• Probability and Distributions
• Continuous Optimization

Part II: Central Machine Learning Problems

• When Models Meet Data
• Linear Regression
• Dimensionality Reduction with Principal Component Analysis
• Density Estimation with Gaussian Mixture Models
• Classification with Support Vector Machines

The tutorial is written for those who would like an introduction to reinforcement learning (RL). The aim is to provide an intuitive presentation of the ideas rather than concentrate on the deeper mathematics underlying the topic. RL is generally used to solve the so-called Markov decision problem (MDP). In other words, the problem that you are attempting to solve with RL should be an MDP or its variant. The theory of RL relies on dynamic programming (DP) and artificial intelligence (AI). We will begin with a quick description of MDPs. We will discuss what we mean by “complex” and “large-scale” MDPs. Then we will explain why RL is needed to solve complex and large-scale MDPs. The semi-Markov decision problem (SMDP) will also be covered.

The tutorial is meant to serve as an introduction to these topics and is based mostly on the book: “Simulation-Based Optimization: Parametric Optimization Techniques and Reinforcement Learning” [4]. The book discusses this topic in greater detail in the context of simulators. There are at least two other textbooks that I would recommend you read: (i) Neuro-Dynamic Programming [2] (lots of details on convergence analysis) and (ii) Reinforcement Learning: An Introduction [11] (lots of details on underlying AI concepts). A more recent tutorial on this topic is [8]. This tutorial has two sections:

• Section 2 discusses MDPs and SMDPs.
• Section 3 discusses RL.

By the end of this tutorial, you should be able to:

• Identify problem structures that can be set up as MDPs / SMDPs.
• Use some RL algorithms.
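As a taste of the kind of RL algorithm the tutorial covers, here is a minimal tabular Q-learning sketch. The 4-state corridor MDP, the constants, and the `step` function are all invented for this example; they do not come from the tutorial itself.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 4-state corridor MDP: actions 0 (left) and 1 (right);
# reaching state 3 gives reward 1 and ends the episode.
N_S, N_A, GAMMA, ALPHA, EPS = 4, 2, 0.95, 0.1, 0.2

def step(s, a):
    s2 = max(0, s - 1) if a == 0 else min(N_S - 1, s + 1)
    done = (s2 == N_S - 1)
    return s2, (1.0 if done else 0.0), done

Q = np.zeros((N_S, N_A))
for _ in range(2000):                      # episodes
    s = int(rng.integers(N_S - 1))         # random non-terminal start state
    while True:
        # epsilon-greedy action selection
        a = int(rng.integers(N_A)) if rng.random() < EPS else int(Q[s].argmax())
        s2, r, done = step(s, a)
        # Q-learning update: off-policy, bootstraps from max_a' Q(s2, a')
        Q[s, a] += ALPHA * (r + (0.0 if done else GAMMA * Q[s2].max()) - Q[s, a])
        if done:
            break
        s = s2

print(Q.argmax(axis=1))  # greedy policy: states 0-2 should move right
```

Because the update bootstraps from the greedy value max_a' Q(s2, a') rather than the action actually taken, Q-learning estimates the optimal action values even while the behavior policy keeps exploring.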

New book from MIT: Reinforcement Learning and Optimal Control. https://web.mit.edu/dimitrib/www/Slides_Lecture13_RLOC.pdf https://web.mit.edu/dimitrib/www/RLbook.html
