This paper describes our participation in the 2022 TREC NeuCLIR challenge. We submitted runs for two of the three languages (Farsi and Russian), focusing on first-stage rankers and on comparing monolingual strategies to Adhoc ones. For the monolingual runs, we first pretrain models on the target language using MLM+FLOPS, and then fine-tune them on MSMARCO translated into that language, using either ColBERT or SPLADE as the retrieval model. For the Adhoc task, we test both query translation (into the target language) and back-translation of the documents (into English). Initial result analysis shows that the monolingual strategy is strong, but that, for the moment, the Adhoc approaches achieve the best results, with back-translating documents outperforming query translation.