Safety Analysis in the Era of Large Language Models: A Case Study of STPA using ChatGPT

Safety analysis · ChatGPT · Interaction · Analysis · Language models

April 3, 2023


Yi Qi, Xingyu Zhao, Xiaowei Huang

From arXiv, under review

Large Language Models (LLMs), such as ChatGPT and BERT, are leading a new AI heatwave due to their human-like conversations with detailed and articulate answers across many domains of knowledge. While LLMs are being quickly applied to many AI application domains, we are interested in the following question: Can safety analysis for safety-critical systems make use of LLMs? To answer it, we conduct a case study of Systems Theoretic Process Analysis (STPA) on Automatic Emergency Brake (AEB) systems using ChatGPT. STPA, one of the most prevalent techniques for hazard analysis, is known to have limitations such as high complexity and subjectivity, and this paper explores whether ChatGPT can help address them. Specifically, three ways of incorporating ChatGPT into STPA are investigated by considering its interaction with human experts: one-off simplex interaction, recurring simplex interaction, and recurring duplex interaction. Comparative results reveal that: (i) using ChatGPT without human experts' intervention can be inadequate due to the reliability and accuracy issues of LLMs; (ii) more interactions between ChatGPT and human experts may yield better results; and (iii) using ChatGPT in STPA with extra care can outperform human safety experts alone, as demonstrated by reusing an existing comparison method with baselines. In addition to making the first attempt to apply LLMs in safety analysis, this paper also identifies key challenges (e.g., trustworthiness concerns about LLMs, the need for standardisation) for future research in this direction.
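The three interaction modes named in the abstract can be pictured with a small sketch. The abstract does not define the modes, so the reading used here is an assumption: "simplex" means the expert's prompts are fixed in advance, and "duplex" means the expert reads each reply and decides what to ask next. All names (`Model`, `one_off_simplex`, `recurring_simplex`, `recurring_duplex`, `stub_model`) are hypothetical, and `stub_model` is a stand-in for a chat model, not a real API.

```python
from typing import Callable, List, Optional

# Hypothetical type: any function that maps a conversation string to a reply.
Model = Callable[[str], str]


def one_off_simplex(model: Model, task: str) -> str:
    """Assumed reading: one prompt, one reply, no follow-up from the expert."""
    return model(f"Expert: {task}\n")


def recurring_simplex(model: Model, prompts: List[str]) -> List[str]:
    """Assumed reading: a pre-planned prompt sequence; replies never alter it."""
    history = ""
    replies = []
    for prompt in prompts:
        history += f"Expert: {prompt}\n"
        reply = model(history)
        history += f"Model: {reply}\n"
        replies.append(reply)
    return replies


def recurring_duplex(
    model: Model,
    first_prompt: str,
    review: Callable[[str], Optional[str]],
    max_rounds: int = 3,
) -> List[str]:
    """Assumed reading: the expert reviews each reply and feeds back a
    follow-up prompt; returning None from `review` accepts the result."""
    history = ""
    replies = []
    prompt: Optional[str] = first_prompt
    for _ in range(max_rounds):
        if prompt is None:
            break
        history += f"Expert: {prompt}\n"
        reply = model(history)
        history += f"Model: {reply}\n"
        replies.append(reply)
        prompt = review(reply)  # expert feedback drives the next round
    return replies


def stub_model(conversation: str) -> str:
    """Stand-in for a chat model: reports how many expert turns it has seen."""
    return f"draft {conversation.count('Expert:')}"
```

In this sketch the only structural difference between the recurring modes is where the next prompt comes from: a fixed list in the simplex case, and the expert's `review` callback in the duplex case.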

