In this article, we revisit the well-studied problem of mean estimation under user-level $\varepsilon$-differential privacy (DP). While user-level $\varepsilon$-DP mechanisms for mean estimation, which typically bound (or clip) user contributions to reduce sensitivity, are well known, analyses of their estimation errors usually assume that the data samples are independent and identically distributed (i.i.d.), and sometimes also that all participating users contribute the same number of samples (data homogeneity). Our main result is a precise characterization of the \emph{worst-case} estimation error under general clipping strategies for heterogeneous data, and, as a by-product, the clipping strategy that achieves the smallest worst-case error. Interestingly, we show via experimental studies that even for i.i.d. samples, our clipping strategy performs uniformly better than the well-known clipping strategy of Amin et al. (2019), which involves additional private parameter estimation.
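To make the contribution-clipping idea concrete, the following is a minimal sketch of a generic user-level DP mean estimator in the style of bounded-contribution mechanisms: each user keeps at most $m$ samples with values clipped to $[0, B]$, and Laplace noise is calibrated to the resulting user-level sensitivities. The function name, the equal split of the budget between the noisy sum and the noisy count, and the specific parameters are illustrative assumptions; this is not the optimal clipping strategy derived in the paper.

```python
import numpy as np

def clipped_user_level_dp_mean(user_samples, m, B, epsilon, seed=None):
    """Toy user-level eps-DP mean estimator with contribution clipping.

    user_samples: list of 1-D arrays, one per user (lengths may differ,
                  i.e. heterogeneous contributions).
    m:            cap on the number of samples retained per user.
    B:            upper bound of the value range [0, B] after clipping.
    epsilon:      total privacy budget, split evenly between sum and count.
    """
    rng = np.random.default_rng(seed)

    # Clip each user's contribution: keep at most m samples, values in [0, B].
    kept = [np.clip(np.asarray(x, dtype=float)[:m], 0.0, B) for x in user_samples]

    true_sum = sum(s.sum() for s in kept)
    true_count = sum(s.size for s in kept)

    # Adding or removing one user changes the clipped sum by at most m*B
    # and the clipped count by at most m, so Laplace noise calibrated to
    # these sensitivities (eps/2 each) yields user-level eps-DP overall.
    noisy_sum = true_sum + rng.laplace(scale=2.0 * m * B / epsilon)
    noisy_count = true_count + rng.laplace(scale=2.0 * m / epsilon)

    return noisy_sum / max(noisy_count, 1.0)


# Example with heterogeneous users: contribution counts vary widely.
users = [np.random.uniform(0.0, 1.0, size=k) for k in (3, 50, 7, 1, 120)]
print(clipped_user_level_dp_mean(users, m=10, B=1.0, epsilon=1.0))
```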