大语言示范守则助理的安全影响:用户研究 (Security Implications of Large Language Model Code Assistants: A User Study) - 专知论文

会员服务 ·

0

代码 · 语言模型化 · MoDELS · 操作 · Integration ·

2022 年 10 月 3 日

Security Implications of Large Language Model Code Assistants: A User Study

翻译：大语言示范守则助理的安全影响:用户研究

Gustavo Sandoval,Hammond Pearce,Teo Nys,Ramesh Karri,Brendan Dolan-Gavitt,Siddharth Garg

from arxiv, 16 pages, 13 figures

Advances in Deep Learning have led to the emergence of Large Language Models (LLMs) such as OpenAI Codex which powers GitHub Copilot. LLMs have been fine tuned and packaged so that programmers can use them in an Integrated Development Environment (IDE) to write code. An emerging line of work is assessing the code quality of code written with the help of these LLMs, with security studies warning that LLMs do not fundamentally have any understanding of the code they are writing, so they are more likely to make mistakes that may be exploitable. We thus conducted a user study (N=58) to assess the security of code written by student programmers when guided by LLMs. Half of the students in our study had the help of the LLM and the other half did not. The students were asked to write code in C that performed operations over a singly linked list, including node operations such as inserting, updating, removing, combining, and others. Our study shows that the students who had the help of an LLM were more likely to write functional code, and although security impacts were observed for certain individual functions, no statistically significant impact on security was observed across all functions. We also investigate systematic stylistic differences between unaided and LLM-assisted code, finding that LLM code is more repetitive, which may have an amplifying effect if vulnerable code is repeated in addition to the impact on source code attribution.

翻译：深层学习的进展导致了大型语言模型(LLMs)的出现,如OpenAI Codex(GitHub Copilot权力的GitHub Copilation)。LLMs经过了细微的调整和包装,使程序员能够在综合开发环境中(IDE)使用它们来写代码。正在出现的工作路线是评估在这些LMs的帮助下编写的代码的代码的代码质量,安全研究警告LLMs根本上不理解他们正在编写的代码,因此他们更有可能犯可能被利用的错误。我们因此进行了一项用户研究(N=58),以评估学生程序员在LMs指导下编写的代码的安全性。我们研究中有一半的学生得到了LMM的帮助,而另一半学生没有这样做。我们要求学生在C里写代码代码的代码,在单项链接列表上运行的代码,包括诸如插入、更新、删除、合并等等的操作。我们的研究表明,帮助LM公司添加代码的学生更有可能写出功能代码。虽然在特定个人功能中观察到安全影响,在统计上没有重大影响,但SIM码对Sylreal规则的系统定位可能影响。我们调查了所有功能之间对SylCreal的重复法源的影响。

0

相关内容

代码（Code）是专知网的一个重要知识资料文档板块，旨在整理收录论文源代码、复现代码，经典工程代码等，便于用户查阅下载使用。

计算机科学课程与视频课件合集，Computer Science courses with video lectures

计算机科学课程与视频课件合集，Computer Science courses with video lectures

专知会员服务

37+阅读 · 2022年1月24日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

有氧运动通过LncRNAs调控miR-492/resistin表达改善主动脉内皮胰岛素抵抗的机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

Poisson流形上的修正Hamilton方法

国家自然科学基金

0+阅读 · 2014年12月31日

长链非编码RNA CAR intergenic 10在细胞衰老中的作用和机制

国家自然科学基金

1+阅读 · 2013年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

OPG诱导破骨细胞凋亡的分子机理

国家自然科学基金

0+阅读 · 2012年12月31日

MRP1/ABCC1基因3＇UTR单核苷酸多态性介导miRNA对原发性肝癌多药耐药性的影响

国家自然科学基金

0+阅读 · 2012年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

B介子衰变与新物理圈图修正

国家自然科学基金

0+阅读 · 2009年12月31日

Ter94在Hedgehog信号转导途径中的作用机理

国家自然科学基金

0+阅读 · 2009年12月31日

Hippo-YAP信号通路在hPCSC分化过程中的作用和机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

Discovering the Hidden Facts of User-Dispatcher Interactions via Text-based Reporting Systems for Community Safety

Arxiv

0+阅读 · 2022年11月9日

Flexible variable selection in the presence of missing data

Arxiv

0+阅读 · 2022年11月8日

Focused Dynamic Slicing for Large Applications using an Abstract Memory-Model

Arxiv

0+阅读 · 2022年11月8日

A local approach to parameter space reduction for regression and classification tasks

Arxiv

0+阅读 · 2022年11月8日

Do Users Write More Insecure Code with AI Assistants?

Arxiv

0+阅读 · 2022年11月7日

Using Set Covering to Generate Databases for Holistic Steganalysis

Arxiv

0+阅读 · 2022年11月7日

Prompter: Utilizing Large Language Model Prompting for a Data Efficient Embodied Instruction Following

Arxiv

0+阅读 · 2022年11月7日

DeepSec: Deciding Equivalence Properties for Security Protocols -- Improved theory and practice

Arxiv

0+阅读 · 2022年11月6日

Experiences from Using Code Explanations Generated by Large Language Models in a Web Software Development E-Book

Arxiv

0+阅读 · 2022年11月4日

Recent Advances in Large Margin Learning

Arxiv

12+阅读 · 2021年3月25日

VIP会员

文章信息

相关主题

语言模型化

相关VIP内容

计算机科学课程与视频课件合集，Computer Science courses with video lectures

计算机科学课程与视频课件合集，Computer Science courses with video lectures

专知会员服务

37+阅读 · 2022年1月24日

【干货书】机器学习速查手册，135页pdf

【干货书】机器学习速查手册，135页pdf

专知会员服务

127+阅读 · 2020年11月20日

NLP必读经典文献100篇

专知会员服务

124+阅读 · 2020年9月8日

Linux导论，Introduction to Linux，96页ppt

Linux导论，Introduction to Linux，96页ppt

专知会员服务

81+阅读 · 2020年7月26日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

49+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

Keras François Chollet 《Deep Learning with Python 》, 386页pdf

专知会员服务

160+阅读 · 2019年10月12日

强化学习最新教程，17页pdf

强化学习最新教程，17页pdf

专知会员服务

181+阅读 · 2019年10月11日

【哈佛大学商学院课程Fall 2019】机器学习可解释性

【哈佛大学商学院课程Fall 2019】机器学习可解释性

专知会员服务

105+阅读 · 2019年10月9日

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

【SIGGRAPH2019】TensorFlow 2.0深度学习计算机图形学应用

专知会员服务

41+阅读 · 2019年10月9日

热门VIP内容

开通专知VIP会员享更多权益服务

《人工智能绝不能完全自主》

《人工智能的法律与伦理：军事自主机器独特挑战的深度剖析》316页

从数据到主导：AI与兵棋推演构筑决策优势

《特洛伊木马货柜：武器化集装箱的战略威胁》最新报告

相关资讯

VCIP 2022 Call for Demos

VCIP 2022 Call for Demos

CCF多媒体专委会

1+阅读 · 2022年6月6日

AIART 2022 Call for Papers

AIART 2022 Call for Papers

CCF多媒体专委会

1+阅读 · 2022年2月13日

【ICIG2021】Latest News & Announcements of the Tutorial

【ICIG2021】Latest News & Announcements of the Tutorial

中国图象图形学学会CSIG

3+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Workshop

【ICIG2021】Latest News & Announcements of the Workshop

中国图象图形学学会CSIG

0+阅读 · 2021年12月20日

【ICIG2021】Latest News & Announcements of the Plenary Talk2

【ICIG2021】Latest News & Announcements of the Plenary Talk2

中国图象图形学学会CSIG

0+阅读 · 2021年11月2日

【ICIG2021】Latest News & Announcements of the Industry Talk1

【ICIG2021】Latest News & Announcements of the Industry Talk1

中国图象图形学学会CSIG

0+阅读 · 2021年7月28日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

Transferring Knowledge across Learning Processes

Transferring Knowledge across Learning Processes

CreateAMind

29+阅读 · 2019年5月18日

Unsupervised Learning via Meta-Learning

Unsupervised Learning via Meta-Learning

CreateAMind

43+阅读 · 2019年1月3日

A Technical Overview of AI & ML in 2018 & Trends for 2019

A Technical Overview of AI & ML in 2018 & Trends for 2019

待字闺中

18+阅读 · 2018年12月24日

相关论文

Discovering the Hidden Facts of User-Dispatcher Interactions via Text-based Reporting Systems for Community Safety

Arxiv

0+阅读 · 2022年11月9日

Flexible variable selection in the presence of missing data

Arxiv

0+阅读 · 2022年11月8日

Focused Dynamic Slicing for Large Applications using an Abstract Memory-Model

Arxiv

0+阅读 · 2022年11月8日

A local approach to parameter space reduction for regression and classification tasks

Arxiv

0+阅读 · 2022年11月8日

Do Users Write More Insecure Code with AI Assistants?

Arxiv

0+阅读 · 2022年11月7日

Using Set Covering to Generate Databases for Holistic Steganalysis

Arxiv

0+阅读 · 2022年11月7日

Prompter: Utilizing Large Language Model Prompting for a Data Efficient Embodied Instruction Following

Arxiv

0+阅读 · 2022年11月7日

DeepSec: Deciding Equivalence Properties for Security Protocols -- Improved theory and practice

Arxiv

0+阅读 · 2022年11月6日

Experiences from Using Code Explanations Generated by Large Language Models in a Web Software Development E-Book

Arxiv

0+阅读 · 2022年11月4日

Recent Advances in Large Margin Learning

Arxiv

12+阅读 · 2021年3月25日

相关基金

有氧运动通过LncRNAs调控miR-492/resistin表达改善主动脉内皮胰岛素抵抗的机制研究

国家自然科学基金

0+阅读 · 2015年12月31日

Poisson流形上的修正Hamilton方法

国家自然科学基金

0+阅读 · 2014年12月31日

长链非编码RNA CAR intergenic 10在细胞衰老中的作用和机制

国家自然科学基金

1+阅读 · 2013年12月31日

Calderon问题和边界刚性问题

国家自然科学基金

0+阅读 · 2013年12月31日

OPG诱导破骨细胞凋亡的分子机理

国家自然科学基金

0+阅读 · 2012年12月31日

MRP1/ABCC1基因3＇UTR单核苷酸多态性介导miRNA对原发性肝癌多药耐药性的影响

国家自然科学基金

0+阅读 · 2012年12月31日

基于list-mode数据的快速SART真3D PET断层重建算法的研究

国家自然科学基金

0+阅读 · 2011年12月31日

B介子衰变与新物理圈图修正

国家自然科学基金

0+阅读 · 2009年12月31日

Ter94在Hedgehog信号转导途径中的作用机理

国家自然科学基金

0+阅读 · 2009年12月31日

Hippo-YAP信号通路在hPCSC分化过程中的作用和机制研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员