Data science plays a critical role in transforming complex data into actionable insights across numerous domains. Recent developments in large language models (LLMs) have significantly automated data science workflows, but a fundamental question persists: can these agentic AI systems truly match the performance of human data scientists, who routinely leverage domain-specific knowledge? We explore this question by designing a prediction task in which a crucial latent variable is hidden in relevant image data rather than in the tabular features. As a result, agentic AI that generates generic code for modeling tabular data cannot perform well, whereas human experts can identify the important hidden variable using domain knowledge. We demonstrate this idea with a synthetic dataset for property insurance. Our experiments show that agentic AI relying on a generic analytics workflow falls short of methods that use domain-specific insights. This highlights a key limitation of current agentic AI for data science and underscores the need for future research to develop agentic AI systems that can better recognize and incorporate domain knowledge.
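The task design described above can be illustrated with a toy sketch. This is not the authors' dataset or method: the feature counts, the 8x8 image size, the coefficients, and the function names `make_dataset`, `fit_and_score`, and `compare` are all assumptions chosen for illustration. The sketch generates a target that depends on a latent variable encoded only in image brightness, then compares a tabular-only linear model with one that also uses a feature extracted from the image.

```python
import numpy as np


def make_dataset(n=2000, seed=0):
    """Toy data: the target depends on a latent variable stored only in images."""
    rng = np.random.default_rng(seed)
    # Tabular features that a generic pipeline would see (hypothetical).
    X_tab = rng.normal(size=(n, 3))
    # Latent variable (hypothetical, e.g. a condition score); never a table column.
    z = rng.uniform(0.0, 1.0, size=n)
    # The latent variable is encoded only as the mean brightness of an 8x8 image.
    images = z[:, None, None] + 0.05 * rng.normal(size=(n, 8, 8))
    # Target mixes the tabular signal with a strong latent effect (assumed weights).
    y = X_tab @ np.array([1.0, -0.5, 0.25]) + 5.0 * z + 0.1 * rng.normal(size=n)
    return X_tab, images, y


def fit_and_score(X_train, y_train, X_test, y_test):
    """Ordinary least squares with an intercept; returns RMSE on the test split."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    resid = y_test - A_test @ coef
    return float(np.sqrt(np.mean(resid ** 2)))


def compare(n=2000, seed=0):
    """Return (tabular-only RMSE, tabular + image-feature RMSE)."""
    X_tab, images, y = make_dataset(n, seed)
    split = n // 2
    # "Domain knowledge" step: summarize each image by its mean brightness.
    img_feat = images.mean(axis=(1, 2))[:, None]
    X_aug = np.column_stack([X_tab, img_feat])
    rmse_tab = fit_and_score(X_tab[:split], y[:split], X_tab[split:], y[split:])
    rmse_aug = fit_and_score(X_aug[:split], y[:split], X_aug[split:], y[split:])
    return rmse_tab, rmse_aug


if __name__ == "__main__":
    rmse_tab, rmse_aug = compare()
    print(f"tabular only: {rmse_tab:.3f}  with image feature: {rmse_aug:.3f}")
```

In this toy setting the tabular-only model cannot account for the latent effect, so its error stays well above that of the model given the image-derived feature. A real image would call for a proper feature extractor rather than a mean-brightness summary.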