Prior work on pretrained sentence embeddings and benchmarks focuses on the capabilities of stand-alone sentences. We propose DiscoEval, a test suite of tasks to evaluate whether sentence representations include broader context information. We also propose a variety of training objectives that make use of natural annotations from Wikipedia to build sentence encoders capable of modeling discourse. We benchmark sentence encoders pretrained with our proposed training objectives, as well as other popular pretrained sentence encoders, on DiscoEval and other sentence evaluation tasks. Empirically, we show that these training objectives help to encode different aspects of the information in document structures. Moreover, BERT and ELMo demonstrate strong performance on DiscoEval, with individual hidden layers showing different characteristics.