A Roundup of Deep Learning Resources

Machine learning, the branch of computer science that best embodies artificial intelligence, has advanced rapidly in recent years. At the same time, AI is reshaping the world in disruptive ways, and the force driving this progress is deep learning. A few days ago, AlphaGo's 3-0 victory over Ke Jie once again pushed the discussion of deep learning to new heights. This article brings together learning resources for deep learning, offered to everyone making their way along that road.

Deep learning is a hands-on discipline: progress comes only from running experiments. BigQuant integrates many open-source deep learning and machine learning frameworks, including TensorFlow, Keras, XGBoost, Theano, and Caffe, so you can start your deep learning journey directly on BigQuant!

Introduction

Deep learning is a branch of machine learning and generally refers to deep neural networks. Historically, artificial neural networks have gone through three waves of development: from the 1940s to the 1960s the field was known as "cybernetics"; from the 1980s to the 1990s it took the form of "connectionism"; and since 2006 it has been revived under the name "deep learning". Thanks to ever-growing volumes of data and computing power (GPUs, TPUs), deep learning has been successfully applied to computer vision, speech recognition, natural language processing, recommender systems, and other fields.

Frameworks (in no particular order)


BigQuant

BigQuant appears in the frameworks section because it bundles several open-source deep learning frameworks, including TensorFlow, Theano, and XGBoost; on the BigQuant research platform you can use them directly without installing anything.

TensorFlow

TensorFlow is Google's open-source machine learning framework. It offers stable Python and C++ APIs, with Go and Java APIs still under development.
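
For a quick taste of the programming model, here is a minimal sketch using the 1.x-era graph API (current as of this writing): you first build a static computation graph, then execute it inside a session. The numbers are made up for illustration.

    import tensorflow as tf

    # Build a static computation graph; nothing is computed yet.
    a = tf.constant(2.0)
    b = tf.constant(3.0)
    c = a * b + 1.0

    # Launch a session to actually execute the graph.
    with tf.Session() as sess:
        print(sess.run(c))  # 7.0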

Theano

Theano is a veteran deep learning framework with a Python interface and a clean computation-graph abstraction.
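
A small sketch of that computation-graph abstraction, assuming a standard Theano install: declare symbolic variables, build an expression, and let Theano differentiate and compile it.

    import theano
    import theano.tensor as T

    # Declare a symbolic scalar and build an expression graph.
    x = T.dscalar('x')
    y = x ** 2

    # Theano differentiates the graph symbolically, then compiles it.
    dy_dx = T.grad(y, x)
    f = theano.function([x], [y, dy_dx])

    print(f(3.0))  # [array(9.0), array(6.0)]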

Torch

Torch is a computing framework written in Lua. It ships with many predefined models and makes it easy to write your own layer types and run them on the GPU, though it is less convenient for RNNs.

Caffe

Caffe is a widely used computer vision library. It is best suited to image processing and less suited to deep learning on text, audio, and other data types.

CNTK

CNTK is Microsoft's open-source deep learning framework, covering feedforward DNNs, convolutional networks, and recurrent networks.

MXNet

MXNet is a deep learning framework developed by Tianqi Chen and others, and has since been adopted by Amazon as its deep learning platform of choice. It is known for fast execution and efficient memory management.

Keras

Keras is a high-level neural network API written in Python that runs on top of TensorFlow or Theano. It was developed with a focus on enabling fast experimentation; being able to go from idea to result with as little delay as possible is key to doing good research.
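
To illustrate the "fast experimentation" point, here is a minimal sketch (the layer sizes and optimizer are arbitrary placeholders, not recommendations): a small classifier is defined, compiled, and ready to train in a few lines.

    from keras.models import Sequential
    from keras.layers import Dense

    # Define a small feedforward classifier, layer by layer
    # (sizes are arbitrary placeholders).
    model = Sequential()
    model.add(Dense(32, activation='relu', input_dim=100))
    model.add(Dense(1, activation='sigmoid'))

    # Compile with an optimizer and a loss; the model is now trainable.
    model.compile(optimizer='rmsprop', loss='binary_crossentropy',
                  metrics=['accuracy'])
    # model.fit(x_train, y_train, epochs=10, batch_size=32)  # given data arrays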

XGBoost

XGBoost is an open-source library that provides a gradient boosting framework for C++, Java, Python, R, and Julia. It runs on Linux, Windows, and macOS.
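
A minimal sketch of the Python interface, training a gradient-boosted classifier on synthetic data (the data and parameters are illustrative only):

    import numpy as np
    import xgboost as xgb

    # Synthetic binary-classification data, just for illustration.
    X = np.random.rand(200, 5)
    y = (X[:, 0] > 0.5).astype(int)

    # XGBoost's native data structure and a basic parameter set.
    dtrain = xgb.DMatrix(X, label=y)
    params = {'max_depth': 3, 'eta': 0.1, 'objective': 'binary:logistic'}

    bst = xgb.train(params, dtrain, num_boost_round=50)
    preds = bst.predict(dtrain)  # predicted probabilities in [0, 1]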

Pylearn2

Pylearn2 is a machine learning library. Most of its functionality is built on top of Theano, which means you can write Pylearn2 plugins (new models, algorithms, and so on) as mathematical expressions, and Theano will optimize and stabilize those expressions and compile them for the backend of your choice (CPU or GPU).

Chainer

Chainer is a standalone, Python-based open-source framework for deep learning models. It offers a flexible, intuitive, and high-performance way to implement the full range of deep learning models, including state-of-the-art architectures such as recurrent neural networks and variational autoencoders.
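
A sketch of Chainer's define-by-run style (written against the older v1-style constructor; layer sizes are placeholders): the forward pass is plain Python, so control flow can depend on the data.

    import numpy as np
    import chainer.functions as F
    import chainer.links as L
    from chainer import Chain, Variable

    class MLP(Chain):
        def __init__(self):
            # Register the trainable layers (links) on the chain;
            # the sizes here are placeholders.
            super(MLP, self).__init__(
                l1=L.Linear(100, 50),
                l2=L.Linear(50, 10),
            )

        def __call__(self, x):
            # The graph is built on the fly as this Python code runs.
            return self.l2(F.relu(self.l1(x)))

    model = MLP()
    x = Variable(np.random.rand(8, 100).astype(np.float32))
    y = model(x)  # output shape: (8, 10)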

Neon

Neon is Nervana's Python-based deep learning library. Nervana provides financial institutions with complete solutions for deploying deep learning as a core technology.

neuraltalk

NeuralTalk is a Python + NumPy project for training multimodal recurrent neural networks that describe images in natural language.


Online Books

Deep Learning by Yoshua Bengio, Ian Goodfellow and Aaron Courville

About: A deep learning textbook, comprehensive in scope and balancing breadth with depth; it serves as the reference text for many systematic courses of study in deep learning.

Neural networks and deep learning by Michael Nielsen

About: A free online book covering the core concepts behind neural networks and deep learning.

Deep Learning: Methods and Applications - Microsoft Research (2013)

About: This book provides an overview of general deep learning methods and their application to a variety of signal and information processing tasks, covering concrete uses of deep learning in speech, text processing, information retrieval, and computer vision.

Deep Learning Tutorial LISA lab, University of Montreal (Jan 6 2015)

About: The deep learning tutorial from the LISA lab at the University of Montreal, with concrete introductions to convolutional networks, LSTMs, RNNs, and other architectures; it is rich in examples and very hands-on.

An introduction to genetic algorithms

About: An introduction to genetic algorithms.

Artificial Intelligence: A Modern Approach

About: Now in its third edition, it is used as a textbook at more than 1,300 universities in over 110 countries, and free online AI courses accompany it. It approaches artificial intelligence, problem solving, and knowledge and reasoning primarily from a mathematical perspective.

Deep Learning in Neural Networks: An Overview

About: A detailed review of supervised learning (including backpropagation), unsupervised learning, reinforcement learning, and evolutionary computation.

Courses

Deep Learning Summer School, Montreal 2015

About: The Montreal 2015 deep learning summer school, aimed at graduate students, engineers, and researchers who already have some background in machine learning and want to learn more about the field of deep learning research.

Deep Learning Summer School, Montreal 2016

About: The Montreal 2016 deep learning summer school.


Deep Learning

About: By Ruslan Salakhutdinov of the Department of Statistics at the University of Toronto, presenting a concrete application of deep learning models to power systems: how do you build a smart grid?

Stanford Open Course: Machine Learning

About: Andrew Ng's famous machine learning course. Especially suitable for beginners; the course is systematic and comprehensive. Hosted on NetEase Open Courses.

Stanford Deep Learning and Natural Language Processing Course: UFLDL Tutorial - Ufldl

About: This tutorial presents the main ideas of unsupervised feature learning and deep learning. By following it, you will implement several feature learning / deep learning algorithms, see them work for you, and learn how to apply and adapt these ideas to new problems. Some background is assumed: you should already know the basics of machine learning (in particular supervised learning, logistic regression, and gradient descent).

Hung-yi Lee (李宏毅): Deep Learning Course

About: Lecture notes from Hung-yi Lee's course at National Taiwan University, using Theano as the programming language; suitable for students at the sophomore level and above.

Morvan Zhou (周莫煩): Python Introduction to CNNs (Convolutional Neural Networks)

About: Convolutional neural networks are an artificial neural network architecture that has risen to prominence in recent years; because they deliver better predictions in image and speech recognition, the technique has spread widely. CNNs are most often applied to computer image recognition, but thanks to continuing innovation they are also used in video analysis, natural language processing, drug discovery, and more. The recently famous AlphaGo, which lets a computer read a Go board, also relies on this technique. This video gives a concise, professional introduction to convolutional neural networks in just five minutes.
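
For readers who want to see the idea in code, here is a minimal Keras sketch of the convolve-then-pool pattern the video describes (the 28x28 grayscale input and layer sizes are illustrative assumptions):

    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

    model = Sequential()
    # Convolution: slide 16 learnable 3x3 filters over the image.
    model.add(Conv2D(16, (3, 3), activation='relu', input_shape=(28, 28, 1)))
    # Pooling: downsample, keeping the strongest local responses.
    model.add(MaxPooling2D(pool_size=(2, 2)))
    # Flatten the feature maps and classify into 10 categories.
    model.add(Flatten())
    model.add(Dense(10, activation='softmax'))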

CS224d: Deep Learning for Natural Language Processing

About: Stanford's online course on applications of deep learning to natural language processing, including web search, advertising, email, customer service, machine translation, radiology reports, and more.

Caltech Machine Learning Video Library

About: Caltech's machine learning video course; the diagrams in the lectures are especially well made.

Underactuated Robotics

About: MIT's Underactuated Robotics opened on October 1, 2014. It is a graduate-level MIT course; if you are interested in robotics and nonlinear dynamical systems, consider taking on the challenge!

Baidu Machine Learning Videos by Yu Kai & Zhang Tong

About: A machine learning course from Baidu. If you work in web search, online advertising, user behavior analysis, image recognition, natural language understanding, bioinformatics, intelligent robotics, or financial forecasting, this core course is one you should study in depth.

Machine Learning - Stanford by Andrew Ng in Coursera (2010-2014)

About: Stanford's machine learning course, a broad introduction to machine learning, data mining, and statistical pattern recognition. Topics include: (i) supervised learning (parametric and non-parametric algorithms, support vector machines, kernels, and neural networks); (ii) unsupervised learning (clustering, dimensionality reduction, recommender systems, and deep learning); (iii) best practices in machine learning (bias/variance theory; innovation in machine learning and AI).

Machine Learning - Caltech by Yaser Abu-Mostafa (2012-2014)

About: Taught in person by Feynman Prize winner Professor Yaser Abu-Mostafa. A detailed introduction to the basic concepts and techniques of machine learning; the lectures emphasize real understanding, not just "knowing".

Machine Learning - Carnegie Mellon by Tom Mitchell (Spring 2011)

About: Carnegie Mellon University's Spring 2011 machine learning course, taught by Tom Mitchell.

Neural Networks for Machine Learning - Geoffrey Hinton in Coursera (2012)

About: A public course by deep learning pioneer Geoffrey Hinton of the University of Toronto. It covers artificial neural networks and how to use them in machine learning, with applications to speech recognition, image segmentation, human pose recognition, and more. The course requires some basic calculus and Python programming.

Neural networks class - Hugo Larochelle from Université de Sherbrooke (2013)

About: Videos of Hugo Larochelle's neural networks course at the Université de Sherbrooke.


Deep Learning Course - CILVR lab @ NYU (2014)

About: The deep learning course from the CILVR lab (Computational Intelligence, Learning, Vision, and Robotics) at NYU. The lab has many faculty members, PhD students, and scholars working on machine perception, natural language understanding, robotics, and healthcare.

Lecture Videos | Artificial Intelligence | Electrical Engineering and Computer Science | MIT OpenCourseWare - MIT by Patrick Henry Winston (2010)

About: MIT's artificial intelligence course.

Vision and learning - computers and brains by Shimon Ullman, Tomaso Poggio, Ethan Meyers @ MIT (2013)

About: An MIT course examining learning methods from both computational and biological perspectives. Topics include recent progress and limits of learning in computer vision, face processing in computers and brains, synaptic learning, and Markov decision processes in computers and brains.

Convolutional Neural Networks for Visual Recognition - Stanford by Fei-Fei Li, Andrej Karpathy (2016)

About: Focuses on convolutional neural networks for visual recognition; the linked offering is the Spring 2017 edition, which is fairly recent.

Deep Learning for Natural Language Processing - Stanford

About: Stanford's deep learning course for natural language processing.

Neural Networks - usherbrooke

About: Professor Hugo Larochelle's online neural networks course, aimed at graduate students and covering advanced topics such as autoencoders, sparse coding, convolutional networks, and restricted Boltzmann machines.

Machine Learning - Oxford (2014-2015)

About: Oxford's 2014-2015 machine learning / deep learning course, taught by Nando de Freitas.

NVIDIA Deep Learning and AI Classes, Education, and Workshops - Nvidia (2015)

About: Courses from the NVIDIA Deep Learning Institute (DLI) for developers; students explore widely used open-source frameworks along with NVIDIA's latest GPU-accelerated deep learning platforms.

Graduate Summer School: Deep Learning, Feature Learning - by Geoffrey Hinton, Yoshua Bengio, Yann LeCun, Andrew Ng, Nando de Freitas and several others @ IPAM, UCLA (2012)

About: The Institute for Pure and Applied Mathematics (IPAM) is a National Science Foundation mathematics institute dedicated to fostering interaction between mathematics and a broad range of science and technology. This is an online course taught jointly by several leading figures in machine learning and deep learning.

Deep Learning (Chinese/English) - Udacity/Google by Vincent Vanhoucke and Arpan Chakraborty (2016)

About: Google's deep learning course; many of its case studies come from Kaggle competitions.

Deep Learning - UWaterloo by Prof. Ali Ghodsi at University of Waterloo (2015)

About: Ali Ghodsi's deep learning course on YouTube.

Statistical Machine Learning - CMU by Prof. Larry Wasserman

About: Larry Wasserman's statistical machine learning course on YouTube.


Deep Learning Course - Yann LeCun (2016)

About: Yann LeCun's 2015-2016 deep learning course.

Bay area DL school - Andrew Ng, Yoshua Bengio, Samy Bengio, Andrej Karpathy, Richard Socher, Hugo Larochelle and many others @ Stanford, CA (2016)

About: The September 2016 deep learning school at Stanford, featuring lectures from many deep learning professors.

Designing, Visualizing and Understanding Deep Neural Networks - UC Berkeley

About: UC Berkeley's course on designing, visualizing, and understanding deep neural networks.

UVA Deep Learning Course - MSc in Artificial Intelligence for the University of Amsterdam.

About: This course is taught in the MSc in Artificial Intelligence at the University of Amsterdam. It studies the theory of deep learning, i.e., modern multi-layer neural networks trained on big data, with particular attention to computer vision and language modeling, arguably the two most recognizable and impressive applications of deep learning theory. The course is taught by assistant professor Efstratios Gavves, together with Kirill Gavrilyuk, Berkay Kicanaoglu, and Patrick Putzky.

Deep Learning for Self-Driving Cars / Introduction to Deep Learning

About: A one-week MIT course on deep learning methods, applied to machine translation, image recognition, games, image generation, and more.

Deep Reinforcement Learning

About: UC Berkeley's deep reinforcement learning course, aimed at undergraduates (Fall 2017 offering).


Videos and Talks

How To Create A Mind - Ray Kurzweil

About: A TED talk by Ray Kurzweil; how the human mind forms may offer some inspiration for deep learning research.

Deep Learning, Self-Taught Learning and Unsupervised Feature Learning - Andrew Ng

About: A talk by Andrew Ng on deep learning, self-taught learning, and unsupervised feature learning.

Recent Developments in Deep Learning - Geoff Hinton

About: A video in which deep learning pioneer Geoff Hinton surveys recent developments in the field, delivered as part of the UBC Computer Science Distinguished Lecture Series.

The Unreasonable Effectiveness of Deep Learning - Yann LeCun

About: Yann LeCun, Facebook's Director of AI Research, discusses deep convolutional neural networks and their applications to machine learning and computer vision (Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, 11/18/2014).

Deep Learning of Representations - Yoshua Bengio

About: A talk from the GoogleTechTalks series.

Principles of Hierarchical Temporal Memory - Jeff Hawkins

About: Jeff Hawkins's talk "Principles of Hierarchical Temporal Memory (HTM): Foundations of Machine Intelligence".

Machine Learning Discussion Group - Deep Learning w/ Stanford AI Lab by Adam Coates

About: Adam Coates gives an overview of some recent research projects at the Stanford AI Lab, followed by an in-depth discussion of deep learning.

Making Sense of the World with Deep Learning - Adam Coates

About: As the title suggests, this talk covers applications of deep learning in the real world.

Demystifying Unsupervised Feature Learning - Adam Coates

About: A talk by Adam Coates at UC Berkeley on December 7, 2012, titled "Demystifying Unsupervised Feature Learning".

Visual Perception with Deep Learning - Yann LeCun

About: A talk from the GoogleTechTalks series, titled "Visual Perception with Deep Learning".

The Next Generation of Neural Networks - Geoffrey Hinton at GoogleTechTalks

About: A GoogleTechTalks lecture by deep learning pioneer Hinton, titled "The Next Generation of Neural Networks".

The wonderful and terrifying implications of computers that can learn - Jeremy Howard at TEDxBrussels

About: A TED talk by Jeremy Howard on the wonderful and terrifying implications of computers that can learn on their own.

CS294A/CS294W - Unsupervised Deep Learning - Stanford by Andrew Ng in Stanford (2011)

About: Andrew Ng's 2011 Stanford lectures on unsupervised deep learning.

A beginners Guide to Deep Neural Networks - Natalie Hammel and Lorraine Yurshansky

About: A beginner's guide to deep neural networks that starts from a small example: how does your phone recognize a dog? Worth watching; it is particularly fun.


Deep Learning: Intelligence from Big Data - Steve Jurvetson (and panel) at VLAB in Stanford.

About: A 2014 talk at the Stanford Graduate School of Business titled "Deep Learning: Intelligence from Big Data".

Introduction to Artificial Neural Networks and Deep Learning - Leo Isikdogan at Motorola Mobility HQ

About: A summer 2016 discussion of deep learning and artificial neural networks at Motorola Mobility HQ in Chicago.

NIPS 2016 lecture and workshop videos - NIPS 2016

About: Homepages collecting NIPS lecture and workshop videos for each year's conference from 2009 through 2017.


Review Papers

Representation Learning: A Review and New Perspectives, Yoshua Bengio, Aaron Courville, Pascal Vincent, arXiv, 2012.

The monograph or review paper Learning Deep Architectures for AI by Yoshua Bengio (Foundations & Trends in Machine Learning, 2009).

Deep Machine Learning – A New Frontier in Artificial Intelligence Research – a survey paper by Itamar Arel, Derek C. Rose, and Thomas P. Karnowski.

Graves, A. (2012). Supervised sequence labelling with recurrent neural networks(Vol. 385). Springer.

Schmidhuber, J. (2014). Deep Learning in Neural Networks: An Overview. 75 pages, 850+ references

LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. “Deep learning.” Nature 521, no. 7553 (2015): 436-444.

Reinforcement Learning

Mnih, Volodymyr, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. “Playing Atari with deep reinforcement learning.” arXiv preprint arXiv:1312.5602 (2013).

Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu. “Recurrent Models of Visual Attention” ArXiv e-print, 2014.

Computer Vision

ImageNet Classification with Deep Convolutional Neural Networks, Alex Krizhevsky, Ilya Sutskever, Geoffrey E Hinton, NIPS 2012.

Going Deeper with Convolutions, Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich, 19-Sept-2014.

Learning Hierarchical Features for Scene Labeling, Clement Farabet, Camille Couprie, Laurent Najman and Yann LeCun, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013.

Learning Convolutional Feature Hierarchies for Visual Recognition, Koray Kavukcuoglu, Pierre Sermanet, Y-Lan Boureau, Karol Gregor, Michaël Mathieu and Yann LeCun, Advances in Neural Information Processing Systems (NIPS 2010), 23, 2010.

Graves, Alex, et al. “A novel connectionist system for unconstrained handwriting recognition.” Pattern Analysis and Machine Intelligence, IEEE Transactions on 31.5 (2009): 855-868.

Cireşan, D. C., Meier, U., Gambardella, L. M., & Schmidhuber, J. (2010). Deep, big, simple neural nets for handwritten digit recognition. Neural computation, 22(12), 3207-3220.

Ciresan, Dan, Ueli Meier, and Jürgen Schmidhuber. “Multi-column deep neural networks for image classification.” Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012.

Ciresan, D., Meier, U., Masci, J., & Schmidhuber, J. (2011, July). A committee of neural networks for traffic sign classification. In Neural Networks (IJCNN), The 2011 International Joint Conference on (pp. 1918-1921). IEEE.

NLP and Language Processing

Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing, Antoine Bordes, Xavier Glorot, Jason Weston and Yoshua Bengio (2012), in: Proceedings of the 15th International Conference on Artificial Intelligence and Statistics (AISTATS)

Dynamic pooling and unfolding recursive autoencoders for paraphrase detection. Socher, R., Huang, E. H., Pennington, J., Ng, A. Y., and Manning, C. D. (2011a). In NIPS’2011.

Semi-supervised recursive autoencoders for predicting sentiment distributions. Socher, R., Pennington, J., Huang, E. H., Ng, A. Y., and Manning, C. D. (2011b). In EMNLP’2011.

Mikolov Tomáš: Statistical Language Models based on Neural Networks. PhD thesis, Brno University of Technology, 2012.

Graves, Alex, and Jürgen Schmidhuber. “Framewise phoneme classification with bidirectional LSTM and other neural network architectures.” Neural Networks 18.5 (2005): 602-610.

Mikolov, Tomas, Ilya Sutskever, Kai Chen, Greg S. Corrado, and Jeff Dean. “Distributed representations of words and phrases and their compositionality.” In Advances in Neural Information Processing Systems, pp. 3111-3119. 2013.

K. Cho, B. van Merrienboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, Y. Bengio. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. EMNLP 2014.

Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. “Sequence to sequence learning with neural networks.” Advances in Neural Information Processing Systems. 2014.

Transfer Learning and Domain Adaptation

Raina, Rajat, et al. “Self-taught learning: transfer learning from unlabeled data.” Proceedings of the 24th international conference on Machine learning. ACM, 2007.

Xavier Glorot, Antoine Bordes and Yoshua Bengio, Domain Adaptation for Large-Scale Sentiment Classification: A Deep Learning Approach, in: Proceedings of the Twenty-Eighth International Conference on Machine Learning (ICML’11), pages 97-110, 2011.

R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu and P. Kuksa. Natural Language Processing (Almost) from Scratch. Journal of Machine Learning Research, 12:2493-2537, 2011.

Mesnil, Grégoire, et al. “Unsupervised and transfer learning challenge: a deep learning approach.” Unsupervised and Transfer Learning Workshop, in conjunction with ICML. 2011.

Ciresan, D. C., Meier, U., & Schmidhuber, J. (2012, June). Transfer learning for Latin and Chinese characters with deep neural networks. In Neural Networks (IJCNN), The 2012 International Joint Conference on (pp. 1-6). IEEE.

Goodfellow, Ian, Aaron Courville, and Yoshua Bengio. “Large-Scale Feature Learning With Spike-and-Slab Sparse Coding.” ICML 2012.

Practical Tricks and Guides

Hinton, Geoffrey E., et al. “Improving neural networks by preventing co-adaptation of feature detectors.” arXiv preprint arXiv:1207.0580 (2012).

Practical recommendations for gradient-based training of deep architectures, Yoshua Bengio, U. Montreal, arXiv report:1206.5533, Lecture Notes in Computer Science Volume 7700, Neural Networks: Tricks of the Trade Second Edition, Editors: Grégoire Montavon, Geneviève B. Orr, Klaus-Robert Müller, 2012.

A practical guide to training Restricted Boltzmann Machines, by Geoffrey Hinton.

Sparse Coding

Emergence of simple-cell receptive field properties by learning a sparse code for natural images, Bruno Olshausen and David Field, Nature, 1996.

Kavukcuoglu, Koray, Marc’Aurelio Ranzato, and Yann LeCun. “Fast inference in sparse coding algorithms with applications to object recognition.” arXiv preprint arXiv:1010.3467 (2010).

Goodfellow, Ian, Aaron Courville, and Yoshua Bengio. “Large-Scale Feature Learning With Spike-and-Slab Sparse Coding.” ICML 2012.

Efficient sparse coding algorithms. Honglak Lee, Alexis Battle, Rajat Raina and Andrew Y. Ng. In NIPS 19, 2007.

Olshausen, Bruno A., and David J. Field. “Sparse coding with an overcomplete basis set: A strategy employed by V1?” Vision Research 37.23 (1997): 3311-3326.

Foundational Theory and Motivation

LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. “Deep learning.” Nature 521, no. 7553(2015): 436-444.

Hinton, Geoffrey E. “Deterministic Boltzmann learning performs steepest descent in weight-space.” Neural computation 1.1 (1989): 143-150.

Bengio, Yoshua, and Samy Bengio. “Modeling high-dimensional discrete data with multi-layer neural networks.” Advances in Neural Information Processing Systems 12 (2000): 400-406.

Bengio, Yoshua, et al. “Greedy layer-wise training of deep networks.” Advances in neural information processing systems 19 (2007): 153.

Bengio, Yoshua, Martin Monperrus, and Hugo Larochelle. “Nonlocal estimation of manifold structure.” Neural Computation 18.10 (2006): 2509-2528.

Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. “Reducing the dimensionality of data with neural networks.” Science 313.5786 (2006): 504-507.

Ranzato, Marc’Aurelio, Y-Lan Boureau, and Yann LeCun. “Sparse feature learning for deep belief networks.” Advances in neural information processing systems 20 (2007): 1185-1192.

Bengio, Yoshua, and Yann LeCun. “Scaling learning algorithms towards AI.” Large-Scale Kernel Machines 34 (2007).

Le Roux, Nicolas, and Yoshua Bengio. “Representational power of restricted boltzmann machines and deep belief networks.” Neural Computation 20.6 (2008): 1631-1649.

Le Roux, Nicolas, and Yoshua Bengio. “Deep belief networks are compact universal approximators.” Neural computation 22.8 (2010): 2192-2207.

Bengio, Yoshua, and Olivier Delalleau. “On the expressive power of deep architectures.” Algorithmic Learning Theory. Springer Berlin/Heidelberg, 2011.

Montufar, Guido F., and Jason Morton. “When Does a Mixture of Products Contain a Product of Mixtures?.” arXiv preprint arXiv:1206.0387 (2012).

Montúfar, Guido, Razvan Pascanu, Kyunghyun Cho, and Yoshua Bengio. “On the Number of Linear Regions of Deep Neural Networks.” arXiv preprint arXiv:1402.1869 (2014).

Supervised Feedforward Networks

The Manifold Tangent Classifier, Salah Rifai, Yann Dauphin, Pascal Vincent, Yoshua Bengio and Xavier Muller, in: NIPS’2011.

Gens, Robert, and Pedro Domingos. “Discriminative Learning of Sum-Product Networks.” NIPS 2012 Best Student Paper.

Goodfellow, I., Warde-Farley, D., Mirza, M., Courville, A., and Bengio, Y. (2013). Maxout networks. Technical Report, Universite de Montreal.

Hinton, Geoffrey E., et al. “Improving neural networks by preventing co-adaptation of feature detectors.” arXiv preprint arXiv:1207.0580 (2012).

Wang, Sida, and Christopher Manning. “Fast dropout training.” In Proceedings of the 30th International Conference on Machine Learning (ICML-13), pp. 118-126. 2013.

Glorot, Xavier, Antoine Bordes, and Yoshua Bengio. “Deep sparse rectifier networks.” In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. JMLR W&CP Volume, vol. 15, pp. 315-323. 2011.

ImageNet Classification with Deep Convolutional Neural Networks, Alex Krizhevsky, Ilya Sutskever, Geoffrey E Hinton, NIPS 2012.

Large-Scale Deep Learning

Building High-level Features Using Large Scale Unsupervised Learning Quoc V. Le, Marc’Aurelio Ranzato, Rajat Monga, Matthieu Devin, Kai Chen, Greg S. Corrado, Jeffrey Dean, and Andrew Y. Ng, ICML 2012.

Bengio, Yoshua, et al. “Neural probabilistic language models.” Innovations in Machine Learning (2006): 137-186. Specifically Section 3 of this paper discusses the asynchronous SGD.

Dean, Jeffrey, et al. “Large scale distributed deep networks.” Advances in Neural Information Processing Systems. 2012.

Recurrent Neural Networks

Training Recurrent Neural Networks, Ilya Sutskever, PhD Thesis, 2012.

Bengio, Yoshua, Patrice Simard, and Paolo Frasconi. “Learning long-term dependencies with gradient descent is difficult.” Neural Networks, IEEE Transactions on 5.2 (1994): 157-166.

Mikolov Tomáš: Statistical Language Models based on Neural Networks. PhD thesis, Brno University of Technology, 2012.

Hochreiter, Sepp, and Jürgen Schmidhuber. “Long short-term memory.” Neural computation 9.8 (1997): 1735-1780.

Hochreiter, S., Bengio, Y., Frasconi, P., & Schmidhuber, J. (2001). Gradient flow in recurrent nets: the difficulty of learning long-term dependencies.

Schmidhuber, J. (1992).Learning complex, extended sequences using the principle of history compression. Neural Computation, 4(2), 234-242.

Graves, A., Fernández, S., Gomez, F., & Schmidhuber, J. (2006, June). Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. In Proceedings of the 23rd international conference on Machine learning (pp. 369-376). ACM.

Hinton, Geoffrey E. “Deterministic Boltzmann learning performs steepest descent in weight-space.” Neural computation 1.1 (1989): 143-150.

Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. “Reducing the dimensionality of data with neural networks.” Science 313.5786 (2006): 504-507.

Hyperparameters

“Practical Bayesian Optimization of Machine Learning Algorithms”, Jasper Snoek, Hugo Larochelle, Ryan Adams, NIPS 2012.

Random Search for Hyper-Parameter Optimization, James Bergstra and Yoshua Bengio (2012), in: Journal of Machine Learning Research, 13(281–305).

Algorithms for Hyper-Parameter Optimization, James Bergstra, Rémy Bardenet, Yoshua Bengio and Balázs Kégl, in: NIPS’2011, 2011.

Optimization

Training Deep and Recurrent Neural Networks with Hessian-Free Optimization, James Martens and Ilya Sutskever, Neural Networks: Tricks of the Trade, 2012.

Schaul, Tom, Sixin Zhang, and Yann LeCun. “No More Pesky Learning Rates.” arXiv preprint arXiv:1206.1106 (2012).

Le Roux, Nicolas, Pierre-Antoine Manzagol, and Yoshua Bengio. “Topmoumoute online natural gradient algorithm.” Neural Information Processing Systems (NIPS). 2007.

Bordes, Antoine, Léon Bottou, and Patrick Gallinari. “SGD-QN: Careful quasi-Newton stochastic gradient descent.” The Journal of Machine Learning Research 10 (2009): 1737-1754.

Glorot, Xavier, and Yoshua Bengio. “Understanding the difficulty of training deep feedforward neural networks.” Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS’10). Society for Artificial Intelligence and Statistics. 2010.

Glorot, Xavier, Antoine Bordes, and Yoshua Bengio. “Deep Sparse Rectifier Networks.” Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. JMLR W&CP Volume. Vol. 15. 2011.

Martens, James. “Deep learning via Hessian-free optimization.” Proceedings of the 27th International Conference on Machine Learning (ICML). Vol. 951. 2010.

Hochreiter, Sepp, and Jürgen Schmidhuber. “Flat minima.” Neural Computation, 9.1 (1997): 1-42.

Pascanu, Razvan, and Yoshua Bengio. “Revisiting natural gradient for deep networks.” arXiv preprint arXiv:1301.3584 (2013).

Dauphin, Yann N., Razvan Pascanu, Caglar Gulcehre, Kyunghyun Cho, Surya Ganguli, and Yoshua Bengio. “Identifying and attacking the saddle point problem in high-dimensional non-convex optimization.” In Advances in Neural Information Processing Systems, pp. 2933-2941. 2014.

Unsupervised Feature Learning

Salakhutdinov, Ruslan, and Geoffrey E. Hinton. “Deep boltzmann machines.” Proceedings of the international conference on artificial intelligence and statistics. Vol. 5. No. 2. Cambridge, MA: MIT Press, 2009.

Scholarpedia page on Deep Belief Networks.

Deep Boltzmann Machines

An Efficient Learning Procedure for Deep Boltzmann Machines, Ruslan Salakhutdinov and Geoffrey Hinton, Neural Computation August 2012, Vol. 24, No. 8: 1967 — 2006.

Montavon, Grégoire, and Klaus-Robert Müller. “Deep Boltzmann Machines and the Centering Trick.” Neural Networks: Tricks of the Trade (2012): 621-637.

Salakhutdinov, Ruslan, and Hugo Larochelle. “Efficient learning of deep boltzmann machines.” International Conference on Artificial Intelligence and Statistics. 2010.

Salakhutdinov, Ruslan. Learning deep generative models. Diss. University of Toronto, 2009.

Goodfellow, Ian, et al. “Multi-prediction deep Boltzmann machines.” Advances in Neural Information Processing Systems. 2013.

Unsupervised Models of Images by Spike-and-Slab RBMs, Aaron Courville, James Bergstra and Yoshua Bengio, in: ICML’2011

Hinton, Geoffrey. “A practical guide to training restricted Boltzmann machines.” Momentum 9.1 (2010): 926.

Autoencoders

Regularized Auto-Encoders Estimate Local Statistics, Guillaume Alain, Yoshua Bengio and Salah Rifai, Université de Montréal, arXiv report 1211.4246, 2012

A Generative Process for Sampling Contractive Auto-Encoders, Salah Rifai, Yoshua Bengio, Yann Dauphin and Pascal Vincent, in: ICML’2012, Edinburgh, Scotland, U.K., 2012

Contractive Auto-Encoders: Explicit invariance during feature extraction, Salah Rifai, Pascal Vincent, Xavier Muller, Xavier Glorot and Yoshua Bengio, in: ICML’2011

Disentangling factors of variation for facial expression recognition, Salah Rifai, Yoshua Bengio, Aaron Courville, Pascal Vincent and Mehdi Mirza, in: ECCV’2012.

Vincent, Pascal, et al. “Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion.” The Journal of Machine Learning Research 11 (2010): 3371-3408.

Vincent, Pascal. “A connection between score matching and denoising autoencoders.” Neural computation 23.7 (2011): 1661-1674.

Chen, Minmin, et al. “Marginalized denoising autoencoders for domain adaptation.” arXiv preprint arXiv:1206.4683 (2012).

Miscellaneous

The ICML 2009 Workshop on Learning Feature Hierarchies webpage has a reading list.

Stanford’s UFLDL Recommended Readings.

The LISA public wiki has a reading list and a bibliography.

Geoff Hinton has readings from his NIPS 2007 tutorial.

The LISA publications database contains a deep architectures category.

A very brief introduction to AI, Machine Learning, and Deep Learning in Yoshua Bengio's IFT6266 graduate class

Memkite’s deep learning reading list, DeepLearning.University - An Annotated Deep Learning Bibliography.

Deep learning resources page, Jeremy D. Jackson, PhD

Goodfellow, Ian, et al. “Measuring invariances in deep networks.” Advances in neural information processing systems 22 (2009): 646-654.

Bengio, Yoshua, et al. “Better Mixing via Deep Representations.” arXiv preprint arXiv:1207.4404 (2012).


Additions are welcome.

The web links in this article all come from around the internet; if there is a problem with any of them, please let us know on the site.

This is original content; please obtain the consent of the author, BigQuant, before reprinting!

