NLP systems typically require support for more than one language. Because languages differ in the amount of supervision available, cross-lingual transfer benefits languages with little to no training data by transferring from other languages. From an engineering perspective, multilingual NLP simplifies development and maintenance by serving multiple languages with a single system. Both cross-lingual transfer and multilingual NLP rely on cross-lingual representations as their foundation. Just as BERT revolutionized representation learning and NLP, it also revolutionized cross-lingual representations and cross-lingual transfer. Multilingual BERT, trained on Wikipedia data in 104 languages, was released as a replacement for single-language BERT. Surprisingly, without any explicit cross-lingual signal, multilingual BERT learns cross-lingual representations in addition to representations for individual languages. This thesis first demonstrates this surprising cross-lingual effectiveness, compared against prior art, on a variety of tasks. Naturally, this raises a set of questions, most notably how these multilingual encoders learn cross-lingual representations. In exploring these questions, this thesis analyzes the behavior of multilingual models in a variety of settings on high- and low-resource languages. We also examine how to inject different cross-lingual signals into multilingual encoders, and the optimization behavior of cross-lingual transfer with these models. Together, these analyses provide a better understanding of multilingual encoders in cross-lingual transfer. Our findings lead us to suggest improvements to multilingual encoders and cross-lingual transfer.