Abstract
Differential privacy (DP) provides a framework for provable privacy protection against arbitrary adversaries, while allowing the release of summary statistics and synthetic data. We address the problem of releasing a noisy real-valued statistic vector T, a function of sensitive data under DP, via the class of K-norm mechanisms with the goal of minimizing the noise added to achieve privacy. First, we introduce the sensitivity space of T, which extends the concepts of sensitivity polytope and sensitivity hull to the setting of arbitrary statistics T. We then propose a framework consisting of three methods for comparing the K-norm mechanisms: (1) a multivariate extension of stochastic dominance, (2) the entropy of the mechanism, and (3) the conditional variance given a direction, to identify the optimal K-norm mechanism. In all of these criteria, the optimal K-norm mechanism is generated by the convex hull of the sensitivity space. Using our methodology, we extend the objective perturbation and functional mechanisms and apply these tools to logistic and linear regression, allowing for private releases of statistical results. Via simulations and an application to a housing price dataset, we demonstrate that our proposed methodology offers a substantial improvement in utility for the same level of risk.
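As a rough illustration of the mechanism class discussed above, the following minimal sketch samples from one K-norm mechanism, taking K to be the ℓ∞ unit ball. It relies on the standard sampling recipe for K-norm mechanisms: draw a radius from a Gamma distribution with shape d + 1 and scale Δ/ε, then multiply it by a point drawn uniformly from K. The function name, its arguments, and the choice of the ℓ∞ norm are assumptions made for illustration only. They are not taken from the article, and this norm is not claimed to be the optimal one identified there.

```python
import random


def linf_norm_mechanism(stat, sensitivity, epsilon, rng=None):
    """Hypothetical helper: release `stat` plus K-norm noise, K = l-infinity unit ball.

    `sensitivity` is assumed to bound the l-infinity distance between the
    statistic vectors of any two adjacent databases.
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    rng = rng or random.Random()
    d = len(stat)
    # Radius: Gamma with shape d + 1 and scale sensitivity / epsilon.
    radius = rng.gammavariate(d + 1, sensitivity / epsilon)
    # Direction: uniform draw from the unit ball of the norm, here the cube [-1, 1]^d.
    direction = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    # The noise radius * direction has density proportional to
    # exp(-(epsilon / sensitivity) * max_i |v_i|).
    return [t + radius * u for t, u in zip(stat, direction)]


if __name__ == "__main__":
    # Illustrative values only; not drawn from the article.
    print(linf_norm_mechanism([0.5, -1.2, 3.0], sensitivity=1.0, epsilon=1.0))
```

Using a different norm ball changes only the direction step, which would need a uniform sampler for that ball, and the norm in which the sensitivity is measured.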
Acknowledgments
We thank Matthew Reimherr and Harris Quach for helpful discussions, which led to some of the results in Section 3, and the reviewers for their helpful comments and suggestions, which led to substantial improvements in this article.
Notes
1 In our preliminary simulations on DP regression mechanisms, we found that objective perturbation and the functional mechanism were among the top-performing algorithms for logistic and linear regression, respectively, especially for moderate sample sizes.
2 We thank a reviewer for emphasizing this point.
3 Not to be confused with the measure-theoretic concept of tightness.
4 It may be necessary to rescale and clamp X, as in Zhang et al. (2012) and Lei et al. (2018).
5 It may be necessary to rescale and clamp X and Y, as in Zhang et al. (2012) and Lei et al. (2018).