The CMAP (cultural mapping and pattern analysis) visualization toolkit introduced in this paper is an open-source suite for analyzing and visualizing text data, from qualitative fieldnotes and in-depth interview transcripts to historical documents and web-scraped data such as message board posts or blogs. The toolkit is designed for scholars integrating pattern analysis, data visualization, and explanation in qualitative and/or computational social science (CSS). Although off-the-shelf commercial qualitative data analysis software exists, there is a dearth of highly scalable open-source options that can work with large data sets and support advanced statistical and language modeling. The foundation of the toolkit is a pragmatic approach that aligns research tools with social science project goals (empirical explanation, theory-guided measurement, comparative design, or evidence-based recommendations), guided by the principle that research paradigm and questions should determine methods. Consequently, the CMAP visualization toolkit offers a range of analytical possibilities through the adjustment of a relatively small number of parameters and allows integration with other Python tools.