Large Language Models have seen expanding application across domains, yet their effectiveness as assistive tools for scientific writing -- an endeavor requiring precision, multimodal synthesis, and domain expertise -- remains insufficiently understood. We examine the potential of LLMs to support domain experts in scientific writing, focusing on abstract composition. We design an incentivized randomized controlled trial built around a hypothetical conference, in which participants with relevant expertise are split into an author pool and a reviewer pool. Inspired by methods from behavioral science, our novel incentive structure encourages authors to edit the provided abstracts to a quality acceptable for a peer-reviewed submission. Our 2x2 between-subjects design varies two dimensions: the source of the provided abstract (human-written or AI-generated) and whether that source is disclosed. Without source attribution, authors edit human-written abstracts more heavily than AI-generated ones, a difference largely driven by the higher perceived readability of AI-generated text. Once the source is disclosed, editing volume converges across the two source treatments. Reviewer decisions are unaffected by the source of the abstract but correlate significantly with the number of edits made. When source information is disclosed, careful stylistic edits, especially to AI-generated abstracts, improve the chance of acceptance. We find that AI-generated abstracts can reach levels of acceptability comparable to human-written ones with minimal revision, and that perceptions of AI authorship, rather than objective quality, drive much of the observed editing behavior. Our findings underscore the significance of source disclosure in collaborative scientific writing.