Unsupervised text clustering is one of the major tasks in natural language processing (NLP) and remains a difficult and complex problem. Conventional methods generally treat this task in separate steps: learning text representations, then clustering those representations. As an improvement, neural methods have been introduced for continuous representation learning to address the sparsity problem. However, the multi-step process still deviates from a unified optimization target, and the second step, clustering, is typically performed with conventional methods such as k-means. We propose a pure neural framework that performs text clustering in an end-to-end manner, jointly learning the text representation and the clustering model. Our model works well whenever context can be obtained, which is nearly always the case in NLP. We evaluate our method on two widely used benchmarks: IMDB movie reviews for sentiment classification and 20-Newsgroups for topic categorization. Despite its simplicity, experiments show that the model outperforms previous clustering methods by a large margin. The model is further verified on an English Wikipedia dataset as a large corpus.
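
The architecture is not specified in the abstract above; the following minimal Python sketch shows one way to realise the joint objective, using a bag-of-words encoder and a DEC-style soft-assignment loss. Both choices, and all dimensions, are illustrative assumptions rather than the authors' design.

```python
# Minimal sketch of jointly learned text representations and cluster
# assignments. Encoder, loss, and hyperparameters are assumptions,
# not the paper's actual architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralClusterer(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, n_clusters=20):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, emb_dim)  # bag-of-words text encoder
        self.centroids = nn.Parameter(torch.randn(n_clusters, emb_dim))

    def forward(self, token_ids, offsets):
        z = self.embed(token_ids, offsets)                 # text representation
        # Student-t soft assignment to learnable centroids (DEC-style).
        dist = torch.cdist(z, self.centroids)
        q = (1.0 + dist ** 2).reciprocal()
        return q / q.sum(dim=1, keepdim=True)

def sharpen(q):
    # Target distribution that emphasises confident assignments.
    p = q ** 2 / q.sum(dim=0)
    return p / p.sum(dim=1, keepdim=True)

model = NeuralClusterer(vocab_size=50_000)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
token_ids = torch.randint(0, 50_000, (200,))               # toy batch of 8 texts
offsets = torch.arange(0, 200, 25)
q = model(token_ids, offsets)
loss = F.kl_div(q.log(), sharpen(q).detach(), reduction='batchmean')
loss.backward(); opt.step()
```

Because the representation and the centroids receive gradients from the same loss, the two steps that conventional pipelines keep separate are optimized under one target.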

Related Content

In Multi-Label Text Classification (MLTC), one sample can belong to more than one class. In most MLTC tasks, there are dependencies or correlations among labels, yet existing methods tend to ignore these relationships. In this paper, a graph-attention-network-based model is proposed to capture the attentive dependency structure among the labels. The graph attention network uses a feature matrix and a correlation matrix to capture and explore the crucial dependencies between the labels and to generate classifiers for the task. The generated classifiers are applied to sentence feature vectors obtained from a text feature extraction network (BiLSTM) to enable end-to-end training. Attention allows the system to assign different weights to neighboring nodes per label, letting it learn the dependencies among labels implicitly. The proposed model is validated on five real-world MLTC datasets and achieves similar or better performance than previous state-of-the-art models.
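
As a rough illustration of the pipeline described (not the authors' code), the sketch below combines a BiLSTM sentence encoder with a single attention layer over label nodes, masked by a label correlation matrix; the learned label vectors then act as per-label classifiers. Layer sizes and the single-layer attention are assumptions.

```python
# Illustrative sketch: BiLSTM sentence features plus graph attention over
# label nodes whose output vectors serve as per-label classifiers.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LabelGAT(nn.Module):
    def __init__(self, vocab, n_labels, emb=100, hid=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab, emb)
        self.enc = nn.LSTM(emb, hid, bidirectional=True, batch_first=True)
        self.label_feat = nn.Parameter(torch.randn(n_labels, 2 * hid))  # label feature matrix
        self.attn = nn.Linear(4 * hid, 1)                               # edge attention scorer

    def forward(self, tokens, corr):
        # corr: (L, L) label correlation matrix used to mask attention;
        # assumes self-loops so every label attends to at least one node.
        h, _ = self.enc(self.word_emb(tokens))
        sent = h.mean(dim=1)                                # (B, 2*hid) sentence vector
        L = self.label_feat.size(0)
        pairs = torch.cat([self.label_feat.unsqueeze(1).expand(L, L, -1),
                           self.label_feat.unsqueeze(0).expand(L, L, -1)], dim=-1)
        e = F.leaky_relu(self.attn(pairs).squeeze(-1))
        e = e.masked_fill(corr == 0, float('-inf'))         # attend only to correlated labels
        a = F.softmax(e, dim=1)
        classifiers = a @ self.label_feat                   # (L, 2*hid) per-label classifier
        return sent @ classifiers.t()                       # (B, L) label logits

model = LabelGAT(vocab=20_000, n_labels=5)
corr = torch.eye(5)                                         # toy correlation matrix
logits = model(torch.randint(0, 20_000, (2, 30)), corr)    # (2, 5)
```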

Recently, researchers have explored graph neural network (GNN) techniques for text classification, since GNNs handle complex structures well and preserve global information. However, previous GNN-based methods face practical problems: a fixed corpus-level graph structure that does not support online testing, and high memory consumption. To tackle these problems, we propose a new GNN-based model that builds a graph for each input text, with parameters shared globally, instead of a single graph for the whole corpus. This removes the dependence between an individual text and the entire corpus, which supports online testing while still preserving global information. Besides, we build graphs from much smaller windows in the text, which not only extracts more local features but also significantly reduces the edge count and memory consumption. Experiments show that our model outperforms existing models on several text classification datasets while consuming less memory.
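
A toy version of the idea, building a small-window word graph per input text while sharing the embedding table globally, might look like the following; the window size, single message-passing step, and max-pooling readout are all assumptions.

```python
# Rough sketch: per-text word graph from small sliding windows, one round
# of message passing with globally shared embeddings, then a readout.
import torch
import torch.nn as nn

class TextLevelGNN(nn.Module):
    def __init__(self, vocab, dim=64, n_classes=4):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)     # parameters shared across all text graphs
        self.fc = nn.Linear(dim, n_classes)

    def forward(self, tokens, window=3):
        x = self.emb(tokens)                    # (T, dim) node features
        T = tokens.size(0)
        # Adjacency from a small sliding window: word i connects to words
        # within `window` positions, keeping the edge count linear in T.
        idx = torch.arange(T)
        adj = ((idx.unsqueeze(0) - idx.unsqueeze(1)).abs() <= window).float()
        adj = adj / adj.sum(dim=1, keepdim=True)
        h = torch.relu(adj @ x)                 # one message-passing step
        return self.fc(h.max(dim=0).values)     # graph readout -> class logits

model = TextLevelGNN(vocab=30_000)
logits = model(torch.randint(0, 30_000, (17,)))  # one input text, graph built on the fly
```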

Graph convolutional networks (GCNs) have been successfully applied to many graph-based applications; however, training a large-scale GCN remains challenging. Current SGD-based algorithms suffer either from a computational cost that grows exponentially with the number of GCN layers, or from a large space requirement for keeping the entire graph and the embedding of each node in memory. In this paper, we propose Cluster-GCN, a novel GCN algorithm suitable for SGD-based training that exploits the graph clustering structure. Cluster-GCN works as follows: at each step, it samples a block of nodes associated with a dense subgraph identified by a graph clustering algorithm, and restricts the neighborhood search to this subgraph. This simple but effective strategy significantly improves memory and computational efficiency while achieving test accuracy comparable to previous algorithms. To test the scalability of our algorithm, we create a new Amazon2M dataset with 2 million nodes and 61 million edges, more than 5 times larger than the previous largest publicly available dataset (Reddit). For training a 3-layer GCN on this data, Cluster-GCN is faster than the previous state-of-the-art VR-GCN (1523 seconds vs. 1961 seconds) while using much less memory (2.2 GB vs. 11.2 GB). Furthermore, for training a 4-layer GCN on this data, our algorithm finishes in around 36 minutes, while all existing GCN training algorithms fail due to out-of-memory issues. Cluster-GCN also allows us to train much deeper GCNs without much time or memory overhead, which leads to improved prediction accuracy: using a 5-layer Cluster-GCN, we achieve a state-of-the-art test F1 score of 99.36 on the PPI dataset, compared to the previous best result of 98.71 by [16]. Our code is publicly available at https://github.com/google-research/google-research/tree/master/cluster_gcn.
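
The training loop reduces to a simple pattern, sketched below with a random partition standing in for the METIS-style graph clustering the paper uses; `train_step` is a hypothetical callback that runs one SGD update on the sampled block.

```python
# Sketch of the Cluster-GCN training loop: partition the graph, then run
# SGD on one dense block at a time so only that subgraph's adjacency and
# embeddings need to be held in memory.
import numpy as np

def cluster_gcn_epoch(adj, feats, labels, n_parts, train_step):
    n = adj.shape[0]
    parts = np.random.permutation(n) % n_parts     # placeholder for graph clustering
    for p in np.random.permutation(n_parts):       # one SGD step per sampled block
        nodes = np.where(parts == p)[0]
        sub_adj = adj[np.ix_(nodes, nodes)]        # neighbourhood restricted to the block
        train_step(sub_adj, feats[nodes], labels[nodes])

# Toy run with a no-op training step.
n = 100
adj = (np.random.rand(n, n) < 0.05).astype(float)
cluster_gcn_epoch(adj, np.random.randn(n, 16), np.random.randint(0, 5, n),
                  n_parts=4, train_step=lambda a, x, y: None)
```

Restricting neighborhood expansion to the block is what removes the exponential growth in cost with depth: each added layer only revisits the same small subgraph.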

Building correspondences across different modalities, such as video and language, has recently become critical in many visual recognition applications, such as video captioning. Inspired by machine translation, recent models tackle this task using an encoder-decoder strategy. The (video) encoder is traditionally a Convolutional Neural Network (CNN), while the decoding (for language generation) is done using a Recurrent Neural Network (RNN). Current state-of-the-art methods, however, train the encoder and decoder separately. CNNs are pretrained on object and/or action recognition tasks and used to encode video-level features. The decoder is then optimised on such static features to generate the video's description. This disjoint setup is arguably sub-optimal for input (video) to output (description) mapping. In this work, we propose to optimise both encoder and decoder simultaneously in an end-to-end fashion. In a two-stage training setting, we first initialise our architecture using pre-trained encoders and decoders; then the entire network is trained end-to-end in a fine-tuning stage to learn the most relevant features for video caption generation. In our experiments, we use GoogLeNet and Inception-ResNet-v2 as encoders and an original Soft-Attention (SA-) LSTM as a decoder. Analogously to gains observed in other computer vision problems, we show that end-to-end training significantly improves over the traditional, disjoint training process. We evaluate our End-to-End (EtENet) Networks on the Microsoft Research Video Description (MSVD) and the MSR Video to Text (MSR-VTT) benchmark datasets, showing how EtENet achieves state-of-the-art performance across the board.
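
A schematic of the two-stage recipe, with a torchvision backbone and a bare LSTM standing in for the paper's GoogLeNet/Inception-ResNet-v2 encoders and SA-LSTM decoder; the learning rates and optimiser are assumptions.

```python
# Sketch of two-stage training: first train the decoder on frozen encoder
# features, then unfreeze everything and fine-tune end-to-end.
import torch
import torch.nn as nn
import torchvision.models as models

encoder = models.googlenet(weights='DEFAULT')
encoder.fc = nn.Identity()                      # expose 1024-d frame features
decoder = nn.LSTM(1024, 512, batch_first=True)  # placeholder caption decoder

def set_trainable(module, flag):
    for p in module.parameters():
        p.requires_grad = flag

# Stage 1: initialise with the pre-trained encoder, train only the decoder.
set_trainable(encoder, False)
stage1_opt = torch.optim.Adam(decoder.parameters(), lr=1e-3)

# Stage 2: unfreeze and fine-tune the whole network end-to-end,
# typically with a much smaller learning rate.
set_trainable(encoder, True)
stage2_opt = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=1e-5)
```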

Complex networks are used as an abstraction for systems modeling in physics, biology, sociology, and other areas. We propose an algorithm, based on fast personalized node ranking and recent advances in deep learning, that learns supervised network embeddings and can also classify network nodes directly. Learning from both homogeneous and heterogeneous networks, our algorithm outperforms strong baselines on nine node-classification benchmarks from the domains of molecular biology, finance, social media, and language processing, one of the largest node-classification collections to date. The results are comparable to or better than the current state of the art in terms of both speed and predictive accuracy. The embeddings obtained by the proposed algorithm are also a viable option for network visualization.

We present an end-to-end CNN architecture for fine-grained visual recognition called Collaborative Convolutional Network (CoCoNet). The network uses a collaborative filter after the convolutional layers to represent an image as an optimal weighted collaboration of features learned from the training samples as a whole, rather than one at a time. This gives CoCoNet more power to encode the fine-grained nature of the data with limited samples in an end-to-end fashion. We perform a detailed study of performance with 1-stage and 2-stage transfer learning and different configurations of benchmark architectures such as AlexNet and VGGNet. The ablation study shows that the proposed method outperforms its constituent parts considerably and consistently. CoCoNet also outperforms the popular deep-learning baseline for fine-grained recognition, Bilinear-CNN (BCNN), with statistical significance. Experiments were performed on the fine-grained species recognition problem, but the method is general enough to be applied to other similar tasks. Lastly, we also introduce a new public dataset for fine-grained species recognition, covering Indian endemic birds, and report initial results on it. The training metadata and the new dataset are available through the corresponding author.

Sentence classification is very challenging, since sentences contain limited contextual information. In this paper, we propose an Attention-Gated Convolutional Neural Network (AGCNN) for sentence classification, which generates attention weights from feature context windows of different sizes using specialized convolution encoders. It makes full use of the limited contextual information to extract and enhance the influence of important features in predicting a sentence's category. Experimental results demonstrate that our model achieves up to 3.1% higher accuracy than standard CNN models and obtains competitive results against the baselines on four of six tasks. In addition, we design an activation function, the Natural Logarithm rescaled Rectified Linear Unit (NLReLU). Experiments show that NLReLU outperforms ReLU and is comparable to other well-known activation functions within AGCNN.
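
The abstract names NLReLU without giving its formula; the form below, ln(beta * max(0, x) + 1), is one natural reading of "Natural Logarithm rescaled ReLU" and should be treated as an assumption.

```python
# Assumed form of NLReLU: logarithmically compresses large positive
# activations, zero for non-positive inputs.
import torch

def nl_relu(x, beta=1.0):
    return torch.log(beta * torch.clamp(x, min=0.0) + 1.0)

print(nl_relu(torch.tensor([-2.0, 0.0, 1.0, 10.0])))
# tensor([0.0000, 0.0000, 0.6931, 2.3979])
```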

For analysing and understanding languages without word boundaries, such as Japanese, Chinese, and Thai, it is desirable to perform appropriate word segmentation, based on morphological analysis, before learning word embeddings; but this is inherently difficult in these languages. In recent years, various deep-learning language models have made remarkable progress, and some methodologies utilizing character-level features have successfully avoided this problem. However, when a model is fed character-level features of these languages, it often overfits due to the large number of character types. In this paper, we propose CE-CLCNN, a character-level convolutional neural network with a character encoder, to tackle these problems. CE-CLCNN is an end-to-end learning model with an image-based character encoder; that is, it handles each character in the target document as an image. Through various experiments, we found and confirmed that CE-CLCNN captures closely embedded features for visually and semantically similar characters and achieves state-of-the-art results on several open document classification tasks. In this paper, we report the performance of CE-CLCNN on the Wikipedia title estimation task and analyse its internal behaviour.
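
The two-part design can be sketched as follows, with random tensors standing in for rendered glyph images and all layer sizes assumed.

```python
# Sketch of the CE-CLCNN idea: every character is rendered as a small
# glyph image, encoded by a CNN, and the resulting character embeddings
# feed a character-level 1D CNN over the document.
import torch
import torch.nn as nn

char_encoder = nn.Sequential(            # image -> character embedding
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(32, 64))
doc_cnn = nn.Sequential(                 # character-level CNN over the text
    nn.Conv1d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveMaxPool1d(1), nn.Flatten(), nn.Linear(128, 10))

glyphs = torch.rand(200, 1, 32, 32)      # stand-in for 200 rendered characters
chars = char_encoder(glyphs)             # (200, 64) character embeddings
logits = doc_cnn(chars.t().unsqueeze(0)) # (1, 10) document class scores
```

Because the encoder sees pixels rather than a one-hot index per character type, visually similar glyphs land near each other in embedding space, which is the property the abstract highlights.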

Text classification is an important and classical problem in natural language processing. A number of studies have applied convolutional neural networks (convolution on a regular grid, e.g., a sequence) to classification, but only a few have explored the more flexible graph convolutional networks (convolution on a non-grid structure, e.g., an arbitrary graph) for the task. In this work, we propose to use graph convolutional networks for text classification. We build a single text graph for a corpus based on word co-occurrence and document-word relations, then learn a Text Graph Convolutional Network (Text GCN) for the corpus. Text GCN is initialized with one-hot representations for words and documents; it then jointly learns embeddings for both, supervised by the known class labels of documents. Our experimental results on multiple benchmark datasets demonstrate that a vanilla Text GCN, without any external word embeddings or knowledge, outperforms state-of-the-art methods for text classification. Text GCN also learns predictive word and document embeddings. In addition, the improvement of Text GCN over state-of-the-art comparison methods becomes more prominent as we lower the percentage of training data, suggesting that Text GCN is robust to scarce training data.
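
The corpus graph construction is the distinctive step; below is a compressed sketch on a toy corpus, with random weights in place of trained ones and the word-word PMI edges omitted for brevity.

```python
# Sketch of the Text GCN corpus graph: one node per document and per word,
# document-word edges weighted by TF-IDF, then a two-layer GCN over
# one-hot node features (the identity matrix).
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["the cat sat", "the dog ran", "cat and dog"]
tfidf = TfidfVectorizer().fit(docs)
dw = tfidf.transform(docs).toarray()           # (n_docs, n_words) TF-IDF block
n_d, n_w = dw.shape
n = n_d + n_w
A = np.eye(n)                                  # self-loops
A[:n_d, n_d:] = dw                             # document-word edges
A[n_d:, :n_d] = dw.T
# Word-word edges would use positive PMI over sliding windows; omitted here.
deg = A.sum(1)
A_hat = A / np.sqrt(np.outer(deg, deg))        # symmetric normalisation
W1, W2 = np.random.randn(n, 64), np.random.randn(64, 2)
H = np.maximum(A_hat @ np.eye(n) @ W1, 0)      # layer 1 on one-hot features
logits = A_hat @ H @ W2                        # layer 2 -> per-node class scores
```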

We observe that end-to-end memory networks (MN) trained for task-oriented dialogue, such as recommending restaurants to a user, suffer from an out-of-vocabulary (OOV) problem: the entities returned by the knowledge base (KB) may not have been seen by the network at training time, making it impossible to use them in dialogue. We propose a Hierarchical Pointer Memory Network (HyP-MN), in which the next word may be generated from the decoding vocabulary or copied from a hierarchical memory maintaining KB results and previous utterances. Evaluating on the dialog bAbI tasks, we find that HyP-MN drastically outperforms MN, obtaining a 12% overall accuracy gain. Further analysis reveals that MN fails completely at recommending any relevant restaurant, whereas HyP-MN recommends the best next restaurant 80% of the time.
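
The copy-versus-generate step at the heart of pointer-style decoders can be sketched as below; the hierarchical organisation of the memory is omitted, and all shapes are assumptions.

```python
# Sketch of a generate-or-copy decoding step: a gate mixes a softmax over
# the decoding vocabulary with attention over memory entries (KB results
# and past utterances), so OOV KB entities can still be emitted.
import torch
import torch.nn.functional as F

def decode_step(state, memory, W_vocab, W_gate):
    # state: (d,)  memory: (m, d)  W_vocab: (d, V)  W_gate: (d,)
    p_vocab = F.softmax(state @ W_vocab, dim=-1)   # generate from vocabulary
    attn = F.softmax(memory @ state, dim=-1)       # point into the memory
    g = torch.sigmoid(state @ W_gate)              # copy/generate gate
    # Scattering copy probabilities onto their surface words is omitted.
    return g * p_vocab, (1 - g) * attn

d, m, V = 32, 10, 100
gen_probs, copy_probs = decode_step(torch.randn(d), torch.randn(m, d),
                                    torch.randn(d, V), torch.randn(d))
```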

Related Papers

Ankit Pal, Muru Selvakumar, Malaikannan Sankarasubbu
35+ reads · 22 Mar 2020

Text Level Graph Neural Network for Text Classification
Lianzhe Huang, Dehong Ma, Sujian Li, Xiaodong Zhang, Houfeng WANG
7+ reads · 6 Oct 2019

Wei-Lin Chiang, Xuanqing Liu, Si Si, Yang Li, Samy Bengio, Cho-Jui Hsieh
9+ reads · 8 Aug 2019

Silvio Olivastri, Gurkirt Singh, Fabio Cuzzolin
5+ reads · 4 Apr 2019

Deep Node Ranking: An Algorithm for Structural Network Embedding and End-to-End Classification
Blaž Škrlj, Jan Kralj, Janez Konc, Marko Robnik-Šikonja, Nada Lavrač
4+ reads · 11 Feb 2019

CoCoNet: A Collaborative Convolutional Network
Tapabrata Chakraborti, Brendan McCane, Steven Mills, Umapada Pal
5+ reads · 28 Jan 2019

An Attention-Gated Convolutional Neural Network for Sentence Classification
Yang Liu, Lixin Ji, Ruiyang Huang, Tuosiyu Ming, Chao Gao, Jianpeng Zhang
4+ reads · 28 Dec 2018

End-to-End Text Classification via Image-based Embedding using Character-level Networks
Shunsuke Kitada, Ryunosuke Kotani, Hitoshi Iyatomi
5+ reads · 10 Oct 2018

Liang Yao, Chengsheng Mao, Yuan Luo
12+ reads · 15 Sep 2018

Dinesh Raghu, Nikhil Gupta, Mausam
3+ reads · 3 May 2018