Can We Automate Scientific Reviewing?

The rapid development of science and technology has been accompanied by an exponential growth in peer-reviewed scientific publications. At the same time, reviewing each paper is a laborious process that must be carried out by subject matter experts. Providing high-quality reviews for this growing number of papers is therefore a significant challenge. In this work, we ask the question "can we automate scientific reviewing?" and discuss the possibility of using state-of-the-art natural language processing (NLP) models to generate first-pass peer reviews for scientific papers. Arguably the most difficult part is defining what a "good" review is in the first place, so we first discuss possible evaluation measures for such reviews. We then collect a dataset of papers in the machine learning domain, annotate the reviews with the different aspects of content each one covers, and train targeted summarization models that take in papers and generate reviews. Comprehensive experimental results show that system-generated reviews tend to touch upon more aspects of the paper than human-written reviews, but the generated text can suffer from lower constructiveness for all aspects except the explanation of the core ideas of the papers, which is largely factually correct. Finally, we summarize eight challenges in the pursuit of a good review generation system, together with potential solutions, which we hope will inspire more future research on this subject. We make all code and the dataset publicly available at https://github.com/neulab/ReviewAdvisor, along with a ReviewAdvisor system at http://review.nlpedia.ai/.

