Many data analysis operations can be expressed as a GROUP BY query on an unbounded set of partitions, followed by a per-partition aggregation. To make such a query differentially private, adding noise to each aggregation is not enough: the set of released partitions must itself be differentially private. This problem is not new, and it was recently formalized as differentially private set union. In this work, we continue this line of study and focus on the common setting where each user is associated with a single partition. In this setting, we propose a simple, optimal differentially private mechanism that maximizes the number of released partitions. We discuss implementation considerations, as well as the possible extension of this approach to the setting where each user contributes to a fixed, small number of partitions.
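To make the partition selection problem concrete, here is a minimal sketch of a standard baseline, Laplace thresholding, for the setting where each user is associated with a single partition. It is not the optimal mechanism proposed in this work. The function names `laplace_threshold` and `select_partitions` and the input format (a mapping from user ID to partition key) are assumptions made for illustration. The sketch counts the distinct users in each partition, adds Laplace noise of scale 1/ε, and releases a partition only if the noisy count exceeds a threshold chosen so that a partition with a single user is released with probability at most δ.

```python
import math
import random
from collections import Counter


def laplace_threshold(epsilon: float, delta: float) -> float:
    """Threshold tau such that P(1 + Laplace(1/epsilon) > tau) <= delta.

    Solving 0.5 * exp(-epsilon * (tau - 1)) = delta gives
    tau = 1 + ln(1 / (2 * delta)) / epsilon (requires delta < 0.5).
    """
    if epsilon <= 0 or not 0 < delta < 0.5:
        raise ValueError("need epsilon > 0 and 0 < delta < 0.5")
    return 1.0 + math.log(1.0 / (2.0 * delta)) / epsilon


def select_partitions(user_to_partition, epsilon, delta, rng=None):
    """Baseline (epsilon, delta)-DP partition selection by Laplace thresholding.

    `user_to_partition` maps each user ID to exactly one partition key, so
    adding or removing a user changes one partition's count by at most 1.
    """
    rng = rng or random.Random()
    tau = laplace_threshold(epsilon, delta)
    # Number of distinct users per partition (keys of the mapping are unique).
    user_counts = Counter(user_to_partition.values())
    released = set()
    for partition, count in user_counts.items():
        # The difference of two Exp(epsilon) draws is Laplace with scale 1/epsilon.
        noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
        if count + noise > tau:
            released.add(partition)
    return released


# Hypothetical usage: one popular partition and one partition with a single user.
users = {f"user{i}": "popular" for i in range(1000)}
users["lonely_user"] = "rare"
released = select_partitions(users, epsilon=1.0, delta=1e-5, rng=random.Random(0))
```

With these parameters the threshold is roughly 11.8, so the partition with 1000 users is released with overwhelming probability, while the single-user partition is released with probability at most δ. A partition with only slightly more users than the threshold is dropped about half the time, which is the kind of loss an optimal mechanism aims to reduce.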