COUGH: A Challenge Dataset and Models for COVID-19 FAQ Retrieval - 专知 Papers



September 10, 2021

COUGH: A Challenge Dataset and Models for COVID-19 FAQ Retrieval


Xinliang Frederick Zhang, Heming Sun, Xiang Yue, Simon Lin, Huan Sun

From arXiv; EMNLP'21 Main Conference

We present a large, challenging dataset, COUGH, for COVID-19 FAQ retrieval. Similar to a standard FAQ dataset, COUGH consists of three parts: FAQ Bank, Query Bank and Relevance Set. The FAQ Bank contains ~16K FAQ items scraped from 55 credible websites (e.g., CDC and WHO). For evaluation, we introduce Query Bank and Relevance Set, where the former contains 1,236 human-paraphrased queries while the latter contains ~32 human-annotated FAQ items for each query. We analyze COUGH by testing different FAQ retrieval models built on top of BM25 and BERT, among which the best model achieves 48.8 under P@5, indicating a great challenge presented by COUGH and encouraging future research for further improvement. Our COUGH dataset is available at https://github.com/sunlab-osu/covid-faq.
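To make the evaluation setup concrete, here is a minimal sketch of BM25 retrieval over a FAQ bank, scored with P@k against a relevance set. It is an illustration under stated assumptions, not the authors' implementation: the three FAQ questions, the query, the relevance set, the tokenizer, and the parameters `k1=1.5` and `b=0.75` are all hypothetical choices, and the names `BM25`, `tokenize`, and `precision_at_k` are introduced here only for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-word characters (an assumed, simplistic tokenizer)."""
    return re.findall(r"\w+", text.lower())


class BM25:
    """Plain Okapi BM25 over a small in-memory collection of FAQ questions."""

    def __init__(self, docs, k1=1.5, b=0.75):
        self.k1 = k1
        self.b = b
        self.term_freqs = [Counter(tokenize(d)) for d in docs]
        self.lengths = [sum(tf.values()) for tf in self.term_freqs]
        self.avg_len = sum(self.lengths) / len(docs)
        # Document frequency: number of documents containing each term.
        doc_freq = Counter()
        for tf in self.term_freqs:
            doc_freq.update(tf.keys())
        n = len(docs)
        # Non-negative IDF variant: log(1 + (n - df + 0.5) / (df + 0.5)).
        self.idf = {
            term: math.log(1 + (n - df + 0.5) / (df + 0.5))
            for term, df in doc_freq.items()
        }

    def score(self, query, doc_index):
        tf = self.term_freqs[doc_index]
        length_norm = self.k1 * (
            1 - self.b + self.b * self.lengths[doc_index] / self.avg_len
        )
        total = 0.0
        for term in tokenize(query):
            if term in tf:
                total += self.idf[term] * tf[term] * (self.k1 + 1) / (tf[term] + length_norm)
        return total

    def rank(self, query):
        """Return document indices sorted from most to least relevant."""
        indices = range(len(self.term_freqs))
        return sorted(indices, key=lambda i: self.score(query, i), reverse=True)


def precision_at_k(ranked_ids, relevant_ids, k=5):
    """Fraction of the top-k retrieved items that appear in the relevance set."""
    top_k = ranked_ids[:k]
    hits = sum(1 for item in top_k if item in relevant_ids)
    return hits / k


# Hypothetical toy data standing in for the FAQ Bank, Query Bank and Relevance Set.
faq_bank = [
    "How does COVID-19 spread between people",
    "What are the symptoms of COVID-19",
    "Should I wear a mask in public",
]
query = "what symptoms does covid cause"
relevance_set = {1}

ranking = BM25(faq_bank).rank(query)
p_at_2 = precision_at_k(ranking, relevance_set, k=2)
```

In a full evaluation the per-query scores would be averaged over the whole Query Bank; the sketch covers a single query only.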

