LLM app (tool) ecosystems are rapidly evolving to support sophisticated use cases that often require extensive collection of user data. Given that LLM apps are developed by third parties, and given anecdotal evidence of inconsistent policy enforcement by LLM platforms, sharing user data with these apps presents significant privacy risks. In this paper, we aim to bring transparency to the data practices of LLM app ecosystems, examining OpenAI's GPT app ecosystem as a case study. We propose an LLM-based framework to analyze the natural language specifications of GPT Actions (custom tools) and assess their data collection practices. Our analysis reveals that Actions collect excessive data spanning 24 categories and 145 data types, with third-party Actions collecting 6.03% more data on average. We find that several Actions violate OpenAI's policies by collecting sensitive information, such as passwords, which OpenAI explicitly prohibits. Lastly, we develop an LLM-based privacy policy analysis framework to automatically check whether the data collected by Actions is consistent with the disclosures in their privacy policies. Our measurements indicate that disclosures are omitted for most collected data types, with only 5.8% of Actions clearly disclosing their data collection practices.
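To make the analysis pipeline concrete, the sketch below illustrates one plausible shape of such an LLM-based framework: prompting a model to map a GPT Action's natural language specification onto a fixed taxonomy of collected data types. This is a minimal sketch, not the authors' implementation; the taxonomy subset, prompt wording, model choice, and the `extract_data_types` helper are all assumptions for illustration.

```python
# Minimal sketch (assumed, not the paper's actual framework): ask an LLM which
# data types a GPT Action's specification collects, against a fixed taxonomy.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical subset of a data-type taxonomy; the paper reports a taxonomy of
# 24 categories and 145 data types.
TAXONOMY = ["email address", "password", "location", "payment info", "name"]

def extract_data_types(action_spec: str) -> list[str]:
    """Ask the model which taxonomy entries the Action's spec collects."""
    prompt = (
        "You are auditing a GPT Action. Given its API specification, list "
        "which of the following data types it collects from users: "
        f"{TAXONOMY}. Respond with only a JSON array of matching entries.\n\n"
        f"Specification:\n{action_spec}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[{"role": "user", "content": prompt}],
        temperature=0,  # keep audit output as deterministic as possible
    )
    return json.loads(resp.choices[0].message.content)

spec = '{"parameters": [{"name": "user_email", "description": "Email used to sign in"}]}'
print(extract_data_types(spec))  # e.g. ["email address"]
```

A consistency check against a privacy policy could reuse the same pattern, prompting the model with the policy text and the extracted data types and asking which collected types the policy discloses.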