
INFORMS · Latent Semantic Indexing · IR · Identifiable · Precision/Accuracy

October 5, 2018

C-DLSI: An Extended LSI Tailored for Federated Text Retrieval


Qijun Zhu,Dandan Li,Dik Lun Lee

As the web expands in data volume and in geographical distribution, centralized search methods become inefficient, leading to increasing interest in cooperative information retrieval, e.g., federated text retrieval (FTR). Unlike existing centralized information retrieval (IR) methods, in which search is done on a logically centralized document collection, FTR is composed of a number of peers, each of which is a complete search engine by itself. To process a query, FTR must first identify the promising peers that host the relevant documents and then retrieve the most relevant documents from the selected peers. Most existing methods only apply traditional IR techniques that treat each text collection as a single large document and use term matching to rank the collections. In this paper, we formalize the problem, identify the properties of FTR, and analyze the feasibility of extending LSI with clustering to adapt it to FTR. On this basis, we propose a novel approach called Cluster-based Distributed Latent Semantic Indexing (C-DLSI). C-DLSI distinguishes the topics of a peer with clustering, captures the local LSI spaces within the clusters, and considers the relations among these LSI spaces, thus providing a more precise characterization of the peer. Accordingly, novel descriptors of the peers and a compatible local text retrieval method are proposed. The experimental results show that C-DLSI outperforms existing methods.
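The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea it describes (cluster a peer's documents, compute a local LSI space per cluster, and use those spaces to rank peers for a query). It is not the authors' C-DLSI algorithm. The function names `kmeans`, `local_lsi`, `describe_peer`, and `peer_score`, the use of plain k-means, and the projection-based score are all assumptions made for illustration, and the sketch ignores the relations among LSI spaces that the paper also models.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain k-means on the rows of X; returns one cluster label per row."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # Squared Euclidean distance from every document to every center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def local_lsi(docs, rank):
    """Truncated SVD of one cluster's document-term matrix (docs x terms).

    Returns the top `rank` right singular vectors, an orthonormal basis
    of the cluster's local LSI space expressed in term space.
    """
    _, _, vt = np.linalg.svd(docs, full_matrices=False)
    return vt[:rank]

def describe_peer(docs, n_clusters, rank):
    """Hypothetical peer descriptor: one local LSI basis per topic cluster."""
    labels = kmeans(docs, n_clusters)
    return [local_lsi(docs[labels == j], rank)
            for j in range(n_clusters) if np.any(labels == j)]

def peer_score(query, descriptor):
    """Largest fraction of the query's norm captured by any local LSI space."""
    qn = np.linalg.norm(query)
    if qn == 0:
        return 0.0
    return max(np.linalg.norm(basis @ query) / qn for basis in descriptor)

# Toy example: four terms, two peers with disjoint vocabularies.
peer_a = np.array([[2, 1, 0, 0], [1, 2, 0, 0], [3, 1, 0, 0], [1, 3, 0, 0]], dtype=float)
peer_b = np.array([[0, 0, 2, 1], [0, 0, 1, 2], [0, 0, 3, 1], [0, 0, 1, 3]], dtype=float)
query = np.array([1.0, 0.0, 0.0, 0.0])

score_a = peer_score(query, describe_peer(peer_a, n_clusters=2, rank=1))
score_b = peer_score(query, describe_peer(peer_b, n_clusters=2, rank=1))
best_peer = "A" if score_a > score_b else "B"
```

In this toy setup the query uses only a term that appears in peer A, so peer A is selected. A second stage would then run retrieval inside the selected peer, for example by ranking its documents within the best-matching local LSI space.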



INFORMS

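The INFORMS Journal on Computing publishes high-quality papers that expand the scope of operations research and computing. It seeks original research papers on theory, methods, experiments, systems, and applications, novel survey and tutorial papers, and papers describing new and useful software tools. Website: https://pubsonline.informs.org/journal/ijoc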

