The General Data Protection Regulation (GDPR) provides new rights and protections to people in the European Union concerning their personal data. We analyze GDPR from a systems perspective, translating its legal articles into a set of capabilities and characteristics that compliant systems must support. Our analysis reveals the phenomenon of metadata explosion, wherein large quantities of metadata need to be stored along with the personal data to satisfy the GDPR requirements. Our analysis also helps us identify new workloads that must be supported under GDPR. We design and implement an open-source benchmark called GDPRbench that consists of workloads and metrics needed to understand and assess database systems that process personal data. To gauge the readiness of modern database systems for GDPR, we follow best practices and developer recommendations to modify Redis, PostgreSQL, and a commercial database system to be GDPR-compliant. Our experiments demonstrate that the resulting GDPR-compliant systems achieve poor performance on GDPR workloads, and that performance scales poorly as the volume of personal data increases. We discuss the real-world implications of these findings, and identify research challenges in making GDPR compliance efficient in production environments. We release all of our software artifacts and datasets at http://www.gdprbench.org.