This competition investigates the performance of large-scale retrieval of historical document images based on writing style. The data sets, provided by cultural heritage institutions and digital libraries, comprise a total of 20 000 document images representing about 10 000 writers, divided into three types: writers of (i) manuscript books, (ii) letters, and (iii) charters and legal documents. We focus on the task of automatic image retrieval to simulate common scenarios of humanities research, such as writer retrieval. Most teams submitted traditional methods that do not use deep learning techniques. The competition results show that a combination of methods outperforms single methods. Furthermore, letters are much more difficult to retrieve than manuscripts.