Intent Mining from past conversations for conversational agent - Zhuanzhi Papers

January 18, 2021

Intent Mining from past conversations for conversational agent

Ajay Chatterjee,Shubhashis Sengupta

From arXiv, 8 pages, 2 figures

Conversational systems are of primary interest in the AI community. Chatbots are increasingly being deployed to provide round-the-clock support and to increase customer engagement. Many commercial bot-building frameworks follow a standard approach that requires one to build and train an intent model to recognize user input. Intent models are trained in a supervised setting on a collection of pairs of textual utterances and intent labels. Gathering substantial training data with wide coverage of the different intents is a bottleneck in the bot-building process. Moreover, labeling hundreds to thousands of conversations with intents is time-consuming and laborious. In this paper, we present an intent discovery framework that generates intent training data from raw conversations in 4 primary steps: extraction of textual utterances from a conversation using a pre-trained, domain-agnostic Dialog Act Classifier (Data Extraction); automatic clustering of similar user utterances (Clustering); manual annotation of clusters with an intent label (Labeling); and propagation of intent labels to the utterances from the previous step that are not mapped to any cluster (Label Propagation). We introduce a novel density-based clustering algorithm, ITER-DBSCAN, for clustering unbalanced data. Subject matter experts (annotators with domain expertise) manually review the clustered user utterances and provide an intent label for each discovered cluster. We conducted user studies to validate the effectiveness of the resulting trained intent model in terms of coverage of intents, accuracy, and time saved relative to manual annotation. Although the system was developed for building an intent model for a conversational system, the framework can also be used for short-text clustering or as a labeling framework.
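The Clustering, Labeling and Label Propagation steps described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify how ITER-DBSCAN works, so the loop below simply assumes it re-runs plain DBSCAN on the still-unclustered utterances with a relaxed radius and a lower minimum cluster size. The bag-of-words `embed` function, all parameter values, the nearest-neighbor propagation rule and the example utterances are hypothetical stand-ins, and the Data Extraction step is assumed to have already produced the list of user utterances.

```python
import math
from collections import Counter

def embed(text):
    # Assumption: bag-of-words counts stand in for a real sentence encoder.
    return Counter(text.lower().split())

def cosine_distance(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return 1.0 - (dot / norm if norm else 0.0)

def dbscan(vectors, eps, min_samples):
    """Plain DBSCAN; returns one label per vector, with -1 meaning noise."""
    n = len(vectors)
    neighbors = [[j for j in range(n) if cosine_distance(vectors[i], vectors[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point, keep expanding
    return labels

def iterative_dbscan(vectors, eps=0.4, min_samples=3, eps_step=0.1, max_rounds=2):
    """Assumed iteration scheme: re-cluster leftover noise with relaxed parameters."""
    labels = [-1] * len(vectors)
    next_id = 0
    for _ in range(max_rounds):
        remaining = [i for i, label in enumerate(labels) if label == -1]
        if not remaining:
            break
        sub = dbscan([vectors[i] for i in remaining], eps, min_samples)
        for idx, label in zip(remaining, sub):
            if label != -1:
                labels[idx] = next_id + label
        next_id += max(sub, default=-1) + 1
        eps += eps_step
        min_samples = max(2, min_samples - 1)
    return labels

def propagate(vectors, labels, cluster_names, max_distance=0.7):
    """Give each unclustered utterance the intent of its nearest clustered one, if close enough."""
    clustered = [i for i, label in enumerate(labels) if label != -1]
    intents = []
    for i, label in enumerate(labels):
        if label != -1:
            intents.append(cluster_names[label])
            continue
        best = min(clustered, key=lambda j: cosine_distance(vectors[i], vectors[j]), default=None)
        if best is not None and cosine_distance(vectors[i], vectors[best]) <= max_distance:
            intents.append(cluster_names[labels[best]])
        else:
            intents.append(None)
    return intents

# Hypothetical user utterances, assumed to come out of the Data Extraction step.
utterances = [
    "i want to reset my password", "please reset my password",
    "reset my password now", "how do i reset my password",
    "cancel my order", "please cancel my order",
    "i forgot my password", "what is the weather", "cancel it",
]
vectors = [embed(u) for u in utterances]
labels = iterative_dbscan(vectors)
# Labeling: an annotator names each discovered cluster (names are made up here).
cluster_names = {0: "reset_password", 1: "cancel_order"}
intents = propagate(vectors, labels, cluster_names)
for utterance, intent in zip(utterances, intents):
    print(f"{intent}: {utterance}")
```

In this toy run the larger password cluster is found in the first pass, while the smaller order-cancellation cluster only appears once the parameters are relaxed, which is the kind of unbalanced data the abstract mentions. Utterances that remain unclustered either inherit the intent of a nearby cluster or stay unlabeled.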
