Natural language processing (NLP) is a subfield of linguistics, computer science, information engineering, and artificial intelligence concerned with the interactions between computers and human (natural) languages, in particular how to program computers to process and analyze large amounts of natural language data.

Natural Language Processing (NLP): 专知 Curated Collection

Getting Started

  1. The Beauty of Mathematics (《数学之美》), by 吴军 (Wu Jun). Vivid and accessible, with few formulas; popular-science in style. After reading it you will have a first-pass understanding of many NLP techniques. Arguably the best introductory read for natural language processing.

  2. How to Get Your First Success in NLP (如何在NLP领域第一次做成一件事), by 周明 (Ming Zhou), Principal Researcher at Microsoft Research Asia and president-elect of ACL, the top NLP conference.

  3. Deep Learning Basics (深度学习基础)

  4. Deep Learning for Natural Language Processing, by 邱锡鹏 (Xipeng Qiu)

    • Discusses applications of deep learning to natural language processing. The models covered include convolutional neural networks, recursive neural networks, and recurrent neural networks; the application areas include text generation, question answering, machine translation, and text matching.
    • [http://nlp.fudan.edu.cn/xpqiu/slides/20160618_DL4NLP@CityU.pdf]
  5. Deep Learning, NLP, and Representations, by colah (Christopher Olah)

  6. Report on the Development of Chinese Information Processing (《中文信息发展报告》), by the Chinese Information Processing Society of China, December 2016

  7. Deep Learning in NLP (Part 1): Word Vectors and Language Models, by Lai Siwei (来斯惟), Institute of Automation, Chinese Academy of Sciences, 2013

  8. Some Methods for Semantic Analysis, Parts 1-3 (语义分析的一些方法), from the 火光摇曳 blog (Tencent Guangdiantong team)

  9. How We Understand Language, Part 3: Neural Network Language Models (我们是这样理解语言的-3 神经网络语言模型), from the 火光摇曳 blog (Tencent Guangdiantong team)

  10. Deep Learning word2vec Notes: The Basics (深度学习word2vec笔记之基础篇), by falao_beiliu

  11. Understanding Convolutional Neural Networks for NLP, by WildML

  12. The Unreasonable Effectiveness of Recurrent Neural Networks, by Andrej Karpathy

  13. Understanding LSTM Networks, by colah

  14. Applications of the Attention Mechanism in Natural Language Processing (注意力机制在自然语言处理中的应用), by robert_ai

  15. How Beginners Can Find and Read Academic Resources in NLP (初学者如何查阅自然语言处理领域学术资料), by 刘知远 (Zhiyuan Liu)

Surveys

  1. A Primer on Neural Network Models for Natural Language Processing, by Yoav Goldberg, October 2015. A 75-page summary of the state of the art; no new material.
  2. Deep Learning for Web Search and Natural Language Processing - [https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/wsdm2015.v3.pdf]
  3. Probabilistic topic models
  4. Natural language processing: an introduction
  5. A unified architecture for natural language processing: Deep neural networks with multitask learning
  6. A Critical Review of Recurrent Neural Networks for Sequence Learning - [http://arxiv.org/pdf/1506.00019v1.pdf]
  7. Deep parsing in Watson - [http://nlp.cs.rpi.edu/course/spring14/deepparsing.pdf]
  8. Online named entity recognition method for microtexts in social networking services: A case study of Twitter
  9. Research on Word and Document Semantic Vector Representations Based on Neural Networks (《基于神经网络的词和文档语义向量表示方法研究》), PhD thesis by Lai Siwei (来斯惟), Institute of Automation, Chinese Academy of Sciences, 2016
    • The thesis gives a comprehensive account of word vectors and neural network language models.
    • [https://arxiv.org/pdf/1611.05962.pdf]

Advanced Papers

Word Vectors

  1. **Word2vec**: Efficient Estimation of Word Representations in Vector Space (the skip-gram objective is sketched after this list)
  2. Distributed Representations of Words and Phrases and their Compositionality (the follow-up word2vec paper; **Doc2vec** proper is described in "Distributed Representations of Sentences and Documents")
  3. Word2Vec tutorial
  4. GloVe: Global Vectors for Word Representation
  5. How to Generate a Good Word Embedding? by Siwei Lai, Kang Liu, Liheng Xu, and Jun Zhao
  6. tweet2vec
  7. tweet2vec
  8. author2vec
  9. item2vec
  10. lda2vec
  11. illustration2vec
  12. tag2vec
  13. category2vec
  14. topic2vec
  15. image2vec
  16. app2vec
  17. prod2vec
  18. metaprod2vec
  19. sense2vec
  20. node2vec
  21. subgraph2vec
  22. wordnet2vec
  23. doc2sent2vec
  24. context2vec
  25. rdf2vec
  26. hash2vec
  27. query2vec
  28. gov2vec
  29. novel2vec
  30. emoji2vec
  31. video2vec
  32. video2vec
  33. sen2vec
  34. content2vec
  35. cat2vec
  36. diet2vec
  37. mention2vec
  38. POI2vec
  39. wang2vec
  40. dna2vec
  41. pin2vec
  42. paper2vec
  43. struc2vec
  44. med2vec
  45. net2vec
  46. sub2vec
  47. metapath2vec
  48. concept2vec
  49. graph2vec
  50. doctag2vec
  51. skill2vec
  52. style2vec
  53. ngram2vec
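
Nearly every *2vec method above adapts the skip-gram formulation of word2vec (items 1-2) to a new unit: tweets, authors, graph nodes, products, and so on. For reference, this is the skip-gram training objective from Mikolov et al. (2013); the notation is theirs, not that of any particular entry above:

```latex
% Given a token stream w_1 ... w_T and a context window of size c,
% skip-gram maximizes the average log-probability of context words:
\frac{1}{T}\sum_{t=1}^{T}\;\sum_{\substack{-c \le j \le c \\ j \neq 0}} \log p(w_{t+j}\mid w_t),
\qquad
p(w_O \mid w_I) = \frac{\exp\!\big({v'_{w_O}}^{\top} v_{w_I}\big)}{\sum_{w=1}^{W}\exp\!\big({v'_w}^{\top} v_{w_I}\big)}
```

Here v and v' are the input and output embedding tables and W is the vocabulary size; in practice the full softmax is approximated with hierarchical softmax or negative sampling.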

Machine Translation

  1. Neural Machine Translation by Jointly Learning to Align and Translate
  2. Sequence to Sequence Learning with Neural Networks (a minimal encoder-decoder sketch follows this list)
  3. Cross-lingual Pseudo-Projected Expectation Regularization for Weakly Supervised Learning
  4. Generating Chinese Named Entity Data from a Parallel Corpus
  5. IXA pipeline: Efficient and Ready to Use Multilingual NLP tools
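
Item 2 compresses the whole source sentence into a fixed-size vector before decoding. Below is a minimal PyTorch sketch of that encoder-decoder idea; every size, name, and the toy batch is illustrative, not taken from any paper in this list.

```python
# Sketch of a fixed-bottleneck encoder-decoder (Sutskever et al. style).
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb=64, hidden=128):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb)
        self.encoder = nn.LSTM(emb, hidden, batch_first=True)
        self.decoder = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, tgt_vocab)

    def forward(self, src, tgt):
        _, state = self.encoder(self.src_emb(src))       # compress source to (h, c)
        dec, _ = self.decoder(self.tgt_emb(tgt), state)  # teacher forcing
        return self.out(dec)                             # per-step target logits

model = Seq2Seq(src_vocab=1000, tgt_vocab=1000)
src = torch.randint(0, 1000, (2, 7))  # 2 source sentences of length 7
tgt = torch.randint(0, 1000, (2, 5))  # shifted target inputs of length 5
print(model(src, tgt).shape)          # torch.Size([2, 5, 1000])
```

Item 1 (Bahdanau et al.) replaces the fixed-size state with attention over all encoder outputs; a sketch of that mechanism appears under Memory and Attention Models below.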

Summarization

  1. Extraction of Salient Sentences from Labelled Documents
  2. A Neural Attention Model for Abstractive Sentence Summarization. EMNLP 2015. Facebook AI Research
  3. A Convolutional Attention Network for Extreme Summarization of Source Code
  4. Abstractive Text Summarization Using Sequence-to-Sequence RNNs and Beyond. IBM Watson & Université de Montréal
  5. textsum: Text summarization with TensorFlow
  6. How to Run Text Summarization with TensorFlow

Text Classification

  1. Convolutional Neural Networks for Sentence Classification
  2. Recurrent Convolutional Neural Networks for Text Classification
  3. Character-level Convolutional Networks for Text Classification. NIPS 2015. (An earlier version was titled "Text Understanding from Scratch".)
  4. A C-LSTM Neural Network for Text Classification
  5. Text classification using DIGITS and Torch7
  6. Recurrent Neural Network for Text Classification with Multi-Task Learning
  7. Deep Multi-Task Learning with Shared Memory. EMNLP 2016
  8. Virtual Adversarial Training for Semi-Supervised Text Classification
  9. Bag of Tricks for Efficient Text Classification. Facebook AI Research (a minimal sketch of this model follows the list)
  10. Actionable and Political Text Classification using Word Embeddings and LSTM
  11. fancycnn: Multi-paradigm Sequential Convolutional Neural Networks for text classification
  12. Convolutional Neural Networks for Text Categorization: Shallow Word-level vs. Deep Character-level
  13. Hierarchical Attention Networks for Document Classification. NAACL 2016
  14. AC-BLSTM: Asymmetric Convolutional Bidirectional LSTM Networks for Text Classification
  15. Generative and Discriminative Text Classification with Recurrent Neural Networks. DeepMind
  16. Adversarial Multi-task Learning for Text Classification. ACL 2017
  17. Deep Text Classification Can be Fooled. Renmin University of China
  18. Deep neural network framework for multi-label text classification
  19. Multi-Task Label Embedding for Text Classification

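Many entries above are heavyweight architectures, but item 9 shows that averaged word embeddings followed by a linear classifier are a surprisingly strong baseline. A minimal PyTorch sketch of that fastText-style model, with made-up hyperparameters and data:

```python
# Averaged embeddings + linear softmax, in the spirit of "Bag of Tricks".
import torch
import torch.nn as nn

class FastTextLike(nn.Module):
    def __init__(self, vocab_size, num_classes, emb=50):
        super().__init__()
        # EmbeddingBag with mode="mean" averages each document's embeddings.
        self.emb = nn.EmbeddingBag(vocab_size, emb, mode="mean")
        self.fc = nn.Linear(emb, num_classes)

    def forward(self, token_ids, offsets):
        return self.fc(self.emb(token_ids, offsets))  # class logits

model = FastTextLike(vocab_size=5000, num_classes=4)
tokens = torch.tensor([1, 4, 12, 7, 99, 3])  # two documents, flattened
offsets = torch.tensor([0, 3])               # doc boundaries: [0:3] and [3:]
print(model(tokens, offsets).shape)          # torch.Size([2, 4])
```

The actual fastText classifier additionally hashes n-gram features into the same embedding table; this sketch keeps only the core averaging trick.
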
Dialogs

  1. A Neural Network Approach to Context-Sensitive Generation of Conversational Responses, by Sordoni et al., 2015. Generates responses to tweets.
  2. Neural Responding Machine for Short-Text Conversation
  3. A Neural Conversation Model
  4. Visual Dialog
  5. Papers, code and data from FAIR for various memory-augmented nets with application to text understanding and dialogue.
  6. Neural Emoji Recommendation in Dialogue Systems

Reading Comprehension

  1. Text Understanding with the Attention Sum Reader Network. ACL 2016 (a toy illustration of the attention-sum idea appears after this list)
  2. A Thorough Examination of the CNN/Daily Mail Reading Comprehension Task
  3. Consensus Attention-based Neural Networks for Chinese Reading Comprehension
  4. Separating Answers from Queries for Neural Reading Comprehension
  5. Attention-over-Attention Neural Networks for Reading Comprehension
  6. Teaching Machines to Read and Comprehend CNN News and Children's Books using Torch
  7. Reasoning with Memory Augmented Neural Networks for Language Comprehension
  8. Bidirectional Attention Flow: Bidirectional Attention Flow for Machine Comprehension
  9. NewsQA: A Machine Comprehension Dataset
  10. Gated-Attention Readers for Text Comprehension
  11. Get To The Point: Summarization with Pointer-Generator Networks. ACL 2017. Stanford University & Google Brain
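
The pointer-style trick behind item 1 is easy to state: attention over document tokens is summed per candidate answer, so candidates that occur often accumulate probability mass. A toy numpy illustration; the tokens and scores are invented, and real systems compute the scores from learned question and document encodings:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

doc_tokens = ["mary", "went", "to", "paris", "and", "mary", "smiled"]
scores = np.array([2.0, 0.1, 0.0, 1.5, 0.0, 1.0, 0.2])  # query-token match scores
att = softmax(scores)                                    # attention over tokens

# P(candidate) = sum of attention mass over its occurrences in the document.
candidates = {"mary", "paris"}
probs = {c: att[[i for i, t in enumerate(doc_tokens) if t == c]].sum()
         for c in candidates}
print(probs)  # "mary" benefits from appearing twice
```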

Memory and Attention Models

  1. Reasoning, Attention and Memory (RAM) workshop at NIPS 2015.
  2. Memory Networks. Weston et al., 2014
  3. End-To-End Memory Networks
  4. Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks
  5. Evaluating prerequisite qualities for learning end to end dialog systems
  6. Neural Turing Machines
  7. Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets
  8. Reasoning about Neural Attention
  9. A Neural Attention Model for Abstractive Sentence Summarization
  10. Neural Machine Translation by Jointly Learning to Align and Translate (a numpy sketch of this soft attention follows the list)
  11. Recurrent Continuous Translation Models
  12. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
  13. Teaching Machines to Read and Comprehend
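
Item 10 introduced the soft attention that most papers in this section build on. Below is a numpy sketch of one decoding step, simplified to dot-product scoring (the paper itself uses a small additive MLP scorer); all values are random placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
enc_states = rng.normal(size=(7, 16))  # one encoder vector per source position
dec_state = rng.normal(size=(16,))     # current decoder state

scores = enc_states @ dec_state        # alignment scores, one per position
weights = np.exp(scores - scores.max())
weights /= weights.sum()               # softmax -> attention distribution
context = weights @ enc_states         # weighted sum of encoder states, shape (16,)
# `context` is fed to the decoder at this step instead of a fixed bottleneck.
```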

Reinforcement Learning in NLP

  1. Generating Text with Deep Reinforcement Learning - [https://arxiv.org/abs/1510.09202] (a minimal policy-gradient sketch follows this list)
  2. Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning
  3. Language Understanding for Text-based Games using Deep Reinforcement Learning
  4. On-line Active Reward Learning for Policy Optimisation in Spoken Dialogue Systems
  5. Deep Reinforcement Learning with a Natural Language Action Space
  6. DQN-Based Policy Learning for Open-Domain Multi-Turn Dialogue (基于DQN的开放域多轮对话策略学习), by 宋皓宇, 张伟男, and 刘挺 (Ting Liu). SMP 2017 Best Paper Award, 2017
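
A mechanic shared by several entries above (flagged at item 1) is to treat generation as a policy and scale the log-likelihood of sampled actions by a reward (REINFORCE). A minimal PyTorch sketch; the logits and the scalar reward are made up, and real systems derive the reward from a task metric or user signal:

```python
import torch

logits = torch.randn(5, 100, requires_grad=True)  # 5 steps, vocab of 100
dist = torch.distributions.Categorical(logits=logits)
sample = dist.sample()                 # one sampled token per step
reward = 1.0                           # placeholder scalar reward

loss = -(dist.log_prob(sample).sum() * reward)  # REINFORCE objective
loss.backward()                        # gradients favor rewarded sequences
```

Papers in this list refine this basic update with baselines to reduce variance, or swap in value-based learning such as the DQN of item 6.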

GAN for NLP

  1. Generating Text via Adversarial Training
  2. SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient
  3. Adversarial Learning for Neural Dialogue Generation
  4. GANs for sequence of discrete elements with the Gumbel-softmax distribution (a short Gumbel-softmax example follows this list)
  5. Connecting generative adversarial network and actor-critic methods
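
As item 4 addresses, sampling discrete tokens blocks gradient flow from the discriminator back to the generator; the Gumbel-softmax relaxation is one workaround (SeqGAN, item 2, instead uses the policy gradient shown in the previous section). PyTorch exposes the relaxation as F.gumbel_softmax; the logits and temperature below are arbitrary:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(3, 10)                          # 3 positions, vocab of 10
soft = F.gumbel_softmax(logits, tau=0.5)             # differentiable "soft" tokens
hard = F.gumbel_softmax(logits, tau=0.5, hard=True)  # one-hot, straight-through
print(soft.sum(dim=-1), hard.argmax(dim=-1))         # rows sum to 1; discrete ids
```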

Video Courses

  1. Introduction to Natural Language Processing, University of Michigan - [https://www.coursera.org/learn/natural-language-processing]
  2. Stanford CS224d (2015): Deep Learning for Natural Language Processing, by Richard Socher - [https://www.youtube.com/playlist?list=PLmImxx8Char8dxWB9LRqdpCTmewaml96q]
  3. Stanford CS224d (2016): Deep Learning for Natural Language Processing, by Richard Socher. Updated to use TensorFlow.
  4. Stanford CS224n (2017): Deep Learning for Natural Language Processing, by Chris Manning and Richard Socher
  5. Natural Language Processing, by Michael Collins, Columbia University - [https://www.coursera.org/learn/nlangp]
  6. NLTK with Python 3 for Natural Language Processing, by Harrison Kinsley. Good tutorials with NLTK code implementations.
  7. Computational Linguistics, by Jordan Boyd-Graber. Lectures from the University of Maryland.
  8. Natural Language Processing (Stanford), by Dan Jurafsky & Chris Manning.

Tutorials

  1. Deep Learning for Natural Language Processing (without Magic)

  2. A Primer on Neural Network Models for Natural Language Processing

  3. Deep Learning for Natural Language Processing: Theory and Practice [Tutorial]

  4. Recurrent Neural Networks with Word Embeddings

  5. LSTM Networks for Sentiment Analysis

  6. Semantic Representations of Word Senses and Concepts. ACL 2016 tutorial by José Camacho-Collados, Ignacio Iacobacci, Roberto Navigli, and Mohammad Taher Pilehvar

  7. ACL 2016 Tutorial: Understanding Short Texts

  8. Practical Neural Networks for NLP. EMNLP 2016

  9. Structured Neural Networks for NLP: From Idea to Code

  10. Understanding Deep Learning Models in NLP

  11. Deep learning for natural language processing, Part 1

  12. TensorFlow Tutorial on Seq2Seq Models

  13. Natural Language Understanding with Distributed Representation, lecture notes by Kyunghyun Cho

  14. Lecture notes and tutorials by Michael Collins (Columbia University)

  15. Several tutorials by Radim Řehůřek [https://radimrehurek.com/gensim/tutorial.html] on using Python and gensim (a short usage sketch appears after this list)

  16. Natural Language Processing in Action
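
To accompany item 15, here are a few lines of gensim usage in the spirit of Řehůřek's tutorials. The toy corpus is invented and the hyperparameters are arbitrary; any list of tokenized sentences works:

```python
from gensim.models import Word2Vec

sentences = [["natural", "language", "processing"],
             ["deep", "learning", "for", "language"],
             ["language", "models", "predict", "words"]]

# vector_size/min_count are the gensim 4.x parameter names
# (older releases used `size` instead of `vector_size`).
model = Word2Vec(sentences, vector_size=32, window=2, min_count=1, sg=1)
print(model.wv.most_similar("language", topn=3))
```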

Books

  1. The Beauty of Mathematics (《数学之美》), by 吴军 (Wu Jun)
    • Popular-science in style; after reading it you will have a first-pass understanding of many NLP techniques.
  2. Speech and Language Processing (《自然语言处理综论》), by Daniel Jurafsky and James H. Martin
    • The Chinese edition was translated by 冯志伟 (Feng Zhiwei); Jurafsky also teaches a course on Coursera.
    • The third edition has not yet been published, but the English draft is freely available in full.
    • Speech and Language Processing (3rd ed. draft) by Dan Jurafsky and James H. Martin
    • [https://web.stanford.edu/~jurafsky/slp3/]
  3. A Concise Course in Natural Language Processing (《自然语言处理简明教程》), by 冯志伟 (Feng Zhiwei)
  4. Statistical Natural Language Processing, 2nd ed. (《统计自然语言处理(第2版)》), by 宗成庆 (Zong Chengqing)
  5. Big Data Intelligence: Machine Learning and Natural Language Processing in the Internet Era (《大数据智能》), co-authored by 刘知远 (Zhiyuan Liu) of Tsinghua University and others; popular-science in style.

Domain Experts

China

  1. Tsinghua University
    • NLP research: 孙茂松 (Maosong Sun) works mainly on Chinese text processing, such as Chinese text classification and word segmentation; 刘知远 (Zhiyuan Liu) on keyword extraction, representation learning, knowledge graphs, and social computing; 刘洋 (Yang Liu) on data-driven machine translation.
    • Sentiment analysis: 黄民烈
    • Information retrieval: 刘奕群, 马少平
    • Speech recognition: 王东
    • Social computing: 唐杰
  2. Harbin Institute of Technology
    • Social media processing: 刘挺 (Ting Liu), 丁效
    • Sentiment analysis: 秦兵, 车万翔
  3. Chinese Academy of Sciences
    • Language cognition models: 王少楠, 宗成庆
    • Information extraction: 孙乐, 韩先培
    • Information recommendation and filtering: 王斌 (CAS Institute of Information Engineering), 鲁骁 (National Computer Network Emergency Response Center)
    • Question answering: 赵军, 刘康, 何世柱 (CAS Institute of Automation)
    • Machine translation: 张家俊, 宗成庆 (CAS Institute of Automation)
    • Speech synthesis: 陶建华 (CAS Institute of Automation)
    • Character recognition: 刘成林 (CAS Institute of Automation)
    • Text matching: 郭嘉丰
  4. Peking University
    • Discourse analysis: 王厚峰, 李素建
    • Automatic summarization and sentiment analysis: 万小军, 姚金戈
    • Speech technology (speaker recognition): 郑方
    • Multimodal information processing: 陈晓鸥
    • 冯岩松
  5. Fudan University
    • Language representation and deep learning: 黄萱菁, 邱锡鹏
  6. Soochow University
    • Lexical and syntactic analysis: 李正华, 陈文亮, 张民
    • Semantic analysis: 周国栋, 李军
    • Machine translation: 熊德意
  7. Renmin University of China
    • Representation learning, recommender systems: 赵鑫
  8. Microsoft Research Asia, Natural Language Computing Group
    • 周明 (Ming Zhou), 刘铁岩, 谢幸
  9. Toutiao AI Lab
    • 李航 (Hang Li)
  10. Huawei Noah's Ark Lab
    • Formerly: 李航 (Hang Li), 吕正东

International

  1. Stanford University
    • Notable NLP scholars: Daniel Jurafsky, Christopher Manning, Percy Liang, Chris Potts, and Richard Socher
    • NLP research: Jurafsky co-authored the standard NLP textbook with James Martin of the University of Colorado Boulder. The group works on almost every research direction imaginable, and the most widely used syntactic parsers and part-of-speech taggers today were likely developed here.
    • [http://nlp.stanford.edu/]
  2. UC Santa Barbara
    • Notable NLP scholars: William Wang (王威廉), Fermin Moscoso del Prado Martin
    • NLP research: William works on information extraction and machine learning; Fermin on psycholinguistics and quantitative linguistics.
    • [http://www.cs.ucsb.edu/~william] William Wang (王威廉) frequently shares recent NLP advances and anecdotes on Weibo, almost always with high-quality information.
    • Weibo: [https://www.weibo.com/u/1657470871]
  3. UC San Diego
    • Notable NLP scholars: Lawrence Saul (Roger Levy joined MIT this year)
    • NLP research: the main focus is machine learning; there is not much NLP-specific work, but there is some interesting research in computational psycholinguistics.
    • [http://grammar.ucsd.edu/cpl/]
  4. UC Santa Cruz
  5. Carnegie Mellon University
    • Notable NLP scholars: Jaime Carbonell,