# Private Identity Testing for High-Dimensional Distributions

arXiv: Data Structures and Algorithms, 2019.

Keywords:
discrete distribution, high-dimensional, private data analysis, product distribution, identity testing

Abstract:

In this work we present novel differentially private identity (goodness-of-fit) testers for natural and widely studied classes of multivariate product distributions: Gaussians in $\mathbb{R}^d$ with known covariance and product distributions over $\{\pm 1\}^{d}$. Our testers have improved sample complexity compared to those derived from …


Introduction
• A foundation of statistical inference is hypothesis testing: given two disjoint sets of probability distributions H0 and H1, the authors want to design an algorithm T that takes a random sample X from some distribution P ∈ H0 ∪ H1 and, with high probability, determines whether P is in H0 or H1.
• There is an exponential-time, ε-differentially private tester A that distinguishes the uniform distribution over {±1}^d from any product distribution over {±1}^d that is α-far in total variation distance using n = n(d, α, ε) samples.
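The uniformity-testing problem above has a simple non-private baseline that helps fix intuition. The sketch below is an illustrative simulation, not the paper's algorithm: the function name, sample sizes, and bias are all assumptions. It uses the pairwise-correlation statistic T(X) = Σ_{i≠j} ⟨x_i, x_j⟩, which is centered at 0 under the uniform distribution and at n(n−1)·‖μ‖^2 under a product distribution with mean vector μ:

```python
import numpy as np

def pairwise_statistic(X):
    """T(X) = sum over i != j of <x_i, x_j>.

    Under the uniform distribution over {±1}^d, E[T] = 0; under a product
    distribution with mean vector mu, E[T] = n(n-1)*||mu||^2.
    """
    s = X.sum(axis=0)                           # sum of the n sample vectors
    return float(s @ s) - float((X * X).sum())  # subtract the i == j terms

rng = np.random.default_rng(0)
n, d = 2000, 50

# Null hypothesis: samples from the uniform distribution over {±1}^d.
X_null = rng.choice([-1, 1], size=(n, d))

# Alternative: a biased product distribution (each coordinate is +1 with
# probability 0.6, so the mean vector is 0.2 in every coordinate).
X_alt = np.where(rng.random((n, d)) < 0.6, 1, -1)

print(pairwise_statistic(X_null))  # concentrates around 0
print(pairwise_statistic(X_alt))   # order n*(n-1)*d*0.04, far from 0
```

Thresholding T(X) separates the two cases here because the null fluctuations have standard deviation on the order of n·√d, far below the alternative's mean.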
Highlights
• A foundation of statistical inference is hypothesis testing: given two disjoint sets of probability distributions H0 and H1, we want to design an algorithm T that takes a random sample X from some distribution P ∈ H0 ∪ H1 and, with high probability, determines whether P is in H0 or H1.
• There is a linear-time, ε-differentially private tester A that distinguishes the uniform distribution over {±1}^d from any product distribution over {±1}^d that is α-far in total variation distance using n = n(d, α, ε) samples.
• One might even conjecture that this sample complexity is optimal, by analogy with the case of privately estimating a product distribution over {±1}^d or a Gaussian in R^d with known covariance, for which the sample complexity is Θ(d/α^2 + d/(αε)) in both cases [KLSU19].
• A product distribution is extreme if each of its marginals is O(1/d)-close to constant. For this restricted class of Boolean product distributions, we provide a two-way reduction showing that identity testing is equivalent to identity testing in the univariate setting.
• For hypothesis tests with constant error probabilities, sample complexity bounds for differential privacy are equivalent, up to constant factors, to sample complexity bounds for other notions of distributional algorithmic stability, such as (ε, δ)-differential privacy [DKM+06], concentrated DP [DR16, BS16], and KL- and TV-stability [WLF16, BNS+16].
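For intuition about the Θ(d/α^2 + d/(αε)) estimation rate cited from [KLSU19], here is a naive ε-DP baseline that estimates the marginal means of a product distribution over {±1}^d with Laplace noise calibrated to ℓ1 global sensitivity. The function name and parameter values are illustrative assumptions, and this simple mechanism does not achieve the optimal rate:

```python
import numpy as np

def private_marginal_means(X, eps, rng):
    """Naive eps-DP estimate of the marginal means of n samples in {±1}^d.

    Replacing one of the n rows changes each coordinate mean by at most 2/n,
    so the l1 global sensitivity of the mean vector is 2d/n; adding
    independent Laplace(2d/(n*eps)) noise per coordinate gives eps-DP.
    """
    n, d = X.shape
    return X.mean(axis=0) + rng.laplace(scale=2 * d / (n * eps), size=d)

rng = np.random.default_rng(1)
n, d = 100_000, 20
# Product distribution with true marginal mean 0.2 in every coordinate.
X = np.where(rng.random((n, d)) < 0.6, 1, -1)
est = private_marginal_means(X, eps=1.0, rng=rng)
print(np.max(np.abs(est - 0.2)))  # small for large n
```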
Results
• In view of the above, one way to find a better private tester would be to identify an alternative statistic for testing uniformity of product distributions with lower global sensitivity.
• This algorithm has two main components: a Lipschitz extension that allows them to control the amount of noise added to the test statistic, and an iterative step that rejects a successively larger class of distributions.
• If the following four conditions hold: (i) X is drawn from a product distribution, (ii) X satisfies (11), (iii) X ∈ C(∆), and (iv) LipschitzExtensionTest(X, ε, ∆, β) returns accept, then X ∈ C(∆′) with probability at least 1 − β.
• The number of rounds M is sufficient to guarantee that the sensitivity of, and the amount of noise added to, T(X) in the last test in line 12 are small enough that one can distinguish between the two hypotheses with the desired sample complexity.
• By Lemma 4.2 and Chebyshev’s inequality, and recalling that the extended statistic agrees with T(X) on such inputs, the authors can bound the probability that Algorithm 2 rejects in line 12 by Var[T(X)] / (n(n − 1)α^2/4)^2.
• For hypothesis tests with constant error probabilities, sample complexity bounds for differential privacy are equivalent, up to constant factors, to sample complexity bounds for other notions of distributional algorithmic stability, such as (ε, δ)-differential privacy [DKM+06], concentrated DP [DR16, BS16], and KL- and TV-stability [WLF16, BNS+16].
• Algorithm 3 is (4ε, 13δ)-differentially private and distinguishes between the cases P = U_d versus ‖P − U_d‖_1 ≥ α with probability at least 2/3, with sample complexity n = O(d^{1/2}/α^2 + …).
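The need for a lower-sensitivity statistic can be made concrete: a direct application of the Laplace mechanism must calibrate noise to a statistic's global sensitivity, and for the natural pairwise statistic T(X) = Σ_{i≠j} ⟨x_i, x_j⟩ that sensitivity grows like d·n, which can swamp the test's separation of order n(n−1)α^2/4. The parameter values below are illustrative assumptions, not the paper's:

```python
import numpy as np

def noisy_release(value, sensitivity, eps, rng):
    """The Laplace mechanism: value + Lap(sensitivity/eps) is eps-DP for
    any statistic whose global sensitivity is at most `sensitivity`."""
    return value + rng.laplace(scale=sensitivity / eps)

# For T(X) = sum_{i != j} <x_i, x_j> on n samples in {±1}^d, replacing one
# sample changes 2(n-1) inner products, each by at most 2d, so the global
# sensitivity can be as large as 4d(n-1).
n, d, eps, alpha = 2000, 50, 1.0, 0.5
sensitivity = 4 * d * (n - 1)
signal = n * (n - 1) * alpha**2 / 4   # separation scale of the test

print(sensitivity / eps)  # Laplace noise scale...
print(signal)             # ...is comparable to or larger than the signal
```

At these parameters the noise scale exceeds the separation, which is exactly the motivation for replacing T with an alternative statistic of lower global sensitivity.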
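The Lipschitz-extension step above can be illustrated with the classical McShane formula (cited in the references as McShane, 1934): any L-Lipschitz function on a subset extends to an L-Lipschitz function on the whole space. The paper additionally needs the extension of its test statistic to be efficiently computable, which the basic formula does not provide; the sketch below, with illustrative names, only demonstrates the extension itself:

```python
def mcshane_extension(f_vals, anchors, lip, dist):
    """McShane extension of a Lipschitz function.

    Given f (as values `f_vals` on the finite set `anchors`) that is
    lip-Lipschitz w.r.t. `dist` on `anchors`, return
        f_ext(x) = min over y in anchors of ( f(y) + lip * dist(x, y) ),
    which agrees with f on `anchors` and is lip-Lipschitz everywhere.
    """
    def f_ext(x):
        return min(fy + lip * dist(x, y) for y, fy in zip(anchors, f_vals))
    return f_ext

# Toy example on the real line: f(y) = |y| known on anchors {-1, 0, 2}.
f_ext = mcshane_extension([1.0, 0.0, 2.0], [-1.0, 0.0, 2.0], 1.0,
                          lambda a, b: abs(a - b))
print(f_ext(0.0))  # agrees with f on an anchor point
print(f_ext(1.0))  # 1-Lipschitz interpolation between anchors
```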
Conclusion
• Suppose there exists an algorithm which takes n samples from an unknown product distribution P′ over {±1}^d and can distinguish between the following two cases with probability at least 2/3: (U1) P′ = U_d; (U2) ‖P′ − U_d‖_1 ≥ cα.
• Then there exists an algorithm which takes n samples from an unknown product distribution P over {±1}^d and can distinguish between the following two cases with probability at least 2/3: (B1) P = Q; (B2) ‖P − Q‖_1 ≥ α.
Related work
• Over the last couple of decades, there has been significant work on hypothesis testing with a focus on minimax rates. The starting point in the statistics community could be considered the work of Ingster and coauthors [Ing94, Ing97, IS03]. Within theoretical computer science, the study of hypothesis testing arose as a subfield of property testing [GGR96, GR00]. Work by Batu et al. [BFR+00, BFF+01] formalized several of the commonly studied problems, including testing of uniformity, identity, closeness, and independence. Other representative works in this line include [BKR04, Pan08, Val11, CDVV14, VV14, ADK15, BV15, DKN15, CDGR16, DK16, Gol16, BCG17, DKW18]. Works on testing in the multivariate setting include testing of independence [BFF+01, AAK+07, RX14, LRR13, ADK15, DK16, CDKS18] and testing of graphical models [CDKS17, DP17, DDK18, GLP18, ABDK18, BBC+19]. We note that graphical models (both Ising models and Bayesian networks) include the product distributions we study in this paper. Surveys and more thorough coverage of related work on minimax hypothesis testing include [Rub12, Can15, Gol17, BW18, Kam18].
Funding
• GK was supported as a Microsoft Research Fellow, as part of the Simons-Berkeley Research Fellowship program
• AM was supported by NSF grant CCF-1763786, a Sloan Foundation Research Award, and a postdoctoral fellowship from BU’s Hariri Institute for Computing
• JU and LZ were supported by NSF grants CCF-1718088, CCF-1750640, and CNS-1816028
• CC was supported by a Goldstine Fellowship
References
• Noga Alon, Alexandr Andoni, Tali Kaufman, Kevin Matulef, Ronitt Rubinfeld, and Ning Xie. Testing k-wise and almost k-wise independence. In Proceedings of the 39th Annual ACM Symposium on the Theory of Computing, STOC ’07, pages 496–505, New York, NY, USA, 2007. ACM.
• Jayadev Acharya, Arnab Bhattacharyya, Constantinos Daskalakis, and Saravanan Kandasamy. Learning and testing causal models with interventions. In Advances in Neural Information Processing Systems 31, NeurIPS ’18. Curran Associates, Inc., 2018.
• Jayadev Acharya, Clement L. Canonne, Cody Freitag, and Himanshu Tyagi. Test without trust: Optimal locally private distribution testing. In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, AISTATS ’19, pages 2067–2076. JMLR, Inc., 2019.
• Jayadev Acharya, Constantinos Daskalakis, and Gautam Kamath. Optimal testing for properties of distributions. In Advances in Neural Information Processing Systems 28, NIPS ’15, pages 3577–3598. Curran Associates, Inc., 2015.
• Maryam Aliakbarpour, Ilias Diakonikolas, and Ronitt Rubinfeld. Differentially private identity and closeness testing of discrete distributions. In Proceedings of the 35th International Conference on Machine Learning, ICML ’18, pages 169–178. JMLR, Inc., 2018.
• [AKSZ18] Jayadev Acharya, Gautam Kamath, Ziteng Sun, and Huanyu Zhang. Inspectre: Privately estimating the unseen. In Proceedings of the 35th International Conference on Machine Learning, ICML ’18, pages 30–39. JMLR, Inc., 2018.
• Jordan Awan and Aleksandra Slavkovic. Differentially private uniformly most powerful tests for binomial data. In Advances in Neural Information Processing Systems 31, NeurIPS ’18, pages 4208–4218. Curran Associates, Inc., 2018.
• Jayadev Acharya, Ziteng Sun, and Huanyu Zhang. Differentially private testing of identity and closeness of discrete distributions. In Advances in Neural Information Processing Systems 31, NeurIPS ’18, pages 6878–6891. Curran Associates, Inc., 2018.
• Jayadev Acharya, Ziteng Sun, and Huanyu Zhang. Hadamard response: Estimating distributions privately, efficiently, and with little communication. In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, AISTATS ’19, pages 1120–1129. JMLR, Inc., 2019.
• Ivona Bezakova, Antonio Blanca, Zongchen Chen, Daniel Stefankovic, and Eric Vigoda. Lower bounds for testing graphical models: Colorings and antiferromagnetic Ising models. In Proceedings of the 32nd Annual Conference on Learning Theory, COLT ’19, pages 283–298, 2019.
• Jeremiah Blocki, Avrim Blum, Anupam Datta, and Or Sheffet. Differentially private data analysis of social networks via restricted sensitivity. In Proceedings of the 4th Conference on Innovations in Theoretical Computer Science, ITCS ’13, pages 87–96, New York, NY, USA, 2013. ACM.
• Eric Blais, Clement L. Canonne, and Tom Gur. Distribution testing lower bounds via reductions from communication complexity. In Proceedings of the 32nd Computational Complexity Conference, CCC ’17, pages 28:1–28:40, Dagstuhl, Germany, 2017. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.
• [BCSZ18a] Christian Borgs, Jennifer Chayes, Adam Smith, and Ilias Zadik. Private algorithms can always be extended. arXiv preprint arXiv:1810.12518, 2018.
• [BCSZ18b] Christian Borgs, Jennifer Chayes, Adam Smith, and Ilias Zadik. Revealing network structure, confidentially: Improved rates for node-private graphon estimation. In Proceedings of the 59th Annual IEEE Symposium on Foundations of Computer Science, FOCS ’18, pages 533–543, Washington, DC, USA, 2018. IEEE Computer Society.
• [BDMN05] Avrim Blum, Cynthia Dwork, Frank McSherry, and Kobbi Nissim. Practical privacy: The SuLQ framework. In Proceedings of the 24th ACM SIGMOD-SIGACTSIGART Symposium on Principles of Database Systems, PODS ’05, pages 128–138, New York, NY, USA, 2005. ACM.
• Tugkan Batu, Eldar Fischer, Lance Fortnow, Ravi Kumar, Ronitt Rubinfeld, and Patrick White. Testing random variables for independence and identity. In Proceedings of the 42nd Annual IEEE Symposium on Foundations of Computer Science, FOCS ’01, pages 442–451, Washington, DC, USA, 2001. IEEE Computer Society.
• Tugkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, and Patrick White. Testing that distributions are close. In Proceedings of the 41st Annual IEEE Symposium on Foundations of Computer Science, FOCS ’00, pages 259–269, Washington, DC, USA, 2000. IEEE Computer Society.
• Tugkan Batu, Ravi Kumar, and Ronitt Rubinfeld. Sublinear algorithms for testing monotone and unimodal distributions. In Proceedings of the 36th Annual ACM Symposium on the Theory of Computing, STOC ’04, New York, NY, USA, 2004. ACM.
• Hai Brenner and Kobbi Nissim. Impossibility of differentially private universally optimal mechanisms. SIAM Journal on Computing, 43(5):1513–1540, 2014.
• Raef Bassily, Kobbi Nissim, Adam Smith, Thomas Steinke, Uri Stemmer, and Jonathan Ullman. Algorithmic stability for adaptive data analysis. In Proceedings of the 48th Annual ACM Symposium on the Theory of Computing, STOC ’16, pages 1046–1059, New York, NY, USA, 2016. ACM.
• Mark Bun, Kobbi Nissim, Uri Stemmer, and Salil Vadhan. Differentially private release and learning of threshold functions. In Proceedings of the 56th Annual IEEE Symposium on Foundations of Computer Science, FOCS ’15, pages 634–649, Washington, DC, USA, 2015. IEEE Computer Society.
• Mark Bun and Thomas Steinke. Concentrated differential privacy: Simplifications, extensions, and lower bounds. In Proceedings of the 14th Conference on Theory of Cryptography, TCC ’16-B, pages 635–658, Berlin, Heidelberg, 2016. Springer.
• Mark Bun, Jonathan Ullman, and Salil Vadhan. Fingerprinting codes and the price of approximate differential privacy. In Proceedings of the 46th Annual ACM Symposium on the Theory of Computing, STOC ’14, pages 1–10, New York, NY, USA, 2014. ACM.
• Bhaswar Bhattacharya and Gregory Valiant. Testing closeness with unequal sized samples. In Advances in Neural Information Processing Systems 28, NIPS ’15, pages 2611–2619. Curran Associates, Inc., 2015.
• Sivaraman Balakrishnan and Larry Wasserman. Hypothesis testing for high-dimensional multinomials: A selective review. The Annals of Applied Statistics, 12(2):727–749, 2018.
• Clement L. Canonne. A survey on distribution testing: Your data is big. but is it blue? Electronic Colloquium on Computational Complexity (ECCC), 22(63), 2015.
• Clement L. Canonne. A short note on Poisson tail bounds. http://www.cs.columbia.edu/~ccanonne/files/misc/2017-poissonconcentration.pdf, 2017.
• Zachary Campbell, Andrew Bray, Anna Ritz, and Adam Groce. Differentially private ANOVA testing. In Proceedings of the 2018 International Conference on Data Intelligence and Security, ICDIS ’18, pages 281–285, Washington, DC, USA, 2018. IEEE Computer Society.
• Rachel Cummings and David Durfee. Individual sensitivity preprocessing for data privacy. In Proceedings of the 31st Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’20, Philadelphia, PA, USA, 2020. SIAM.
• Clement L. Canonne, Ilias Diakonikolas, Themis Gouleakis, and Ronitt Rubinfeld. Testing shape restrictions of discrete distributions. In Proceedings of the 33rd Symposium on Theoretical Aspects of Computer Science, STACS ’16, pages 25:1–25:14, Dagstuhl, Germany, 2016. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.
• Bryan Cai, Constantinos Daskalakis, and Gautam Kamath. Priv’it: Private and sample efficient identity testing. In Proceedings of the 34th International Conference on Machine Learning, ICML ’17, pages 635–644. JMLR, Inc., 2017.
• [CDKS17] Clement L. Canonne, Ilias Diakonikolas, Daniel M. Kane, and Alistair Stewart. Testing Bayesian networks. In Proceedings of the 30th Annual Conference on Learning Theory, COLT ’17, pages 370–448, 2017.
• Clement L. Canonne, Ilias Diakonikolas, Daniel M. Kane, and Alistair Stewart. Testing conditional independence of discrete distributions. In Proceedings of the 50th Annual ACM Symposium on the Theory of Computing, STOC ’18, pages 735– 748, New York, NY, USA, 2018. ACM.
• Siu On Chan, Ilias Diakonikolas, Gregory Valiant, and Paul Valiant. Optimal algorithms for testing closeness of discrete distributions. In Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’14, pages 1193– 1203, Philadelphia, PA, USA, 2014. SIAM.
• [CKM+18] Rachel Cummings, Sara Krehbiel, Yajun Mei, Rui Tuo, and Wanrong Zhang. Differentially private change-point detection. In Advances in Neural Information Processing Systems 31, NeurIPS ’18. Curran Associates, Inc., 2018.
• Clement L. Canonne, Gautam Kamath, Audra McMillan, Adam Smith, and Jonathan Ullman. The structure of optimal private tests for simple hypotheses. In Proceedings of the 51st Annual ACM Symposium on the Theory of Computing, STOC ’19, New York, NY, USA, 2019. ACM.
• Simon Couch, Zeki Kazan, Kaiyan Shi, Andrew Bray, and Adam Groce. Differentially private nonparametric hypothesis testing. In Proceedings of the 2019 ACM Conference on Computer and Communications Security, CCS ’19, New York, NY, USA, 2019. ACM.
• T. Tony Cai, Yichen Wang, and Linjun Zhang. The cost of privacy: Optimal rates of convergence for parameter estimation with differential privacy. arXiv preprint arXiv:1902.04495, 2019.
• Constantinos Daskalakis, Nishanth Dikkala, and Gautam Kamath. Testing Ising models. In Proceedings of the 29th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’18, pages 1989–2007, Philadelphia, PA, USA, 2018. SIAM.
• [DFH+15] Cynthia Dwork, Vitaly Feldman, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Aaron Roth. The reusable holdout: Preserving validity in adaptive data analysis. Science, 349(6248):636–638, 2015.
• Ilias Diakonikolas, Moritz Hardt, and Ludwig Schmidt. Differentially private learning of structured discrete distributions. In Advances in Neural Information Processing Systems 28, NIPS ’15, pages 2566–2574. Curran Associates, Inc., 2015.
• Differential Privacy Team, Apple. Learning with privacy at scale, 2017. https://machinelearning.apple.com/docs/learning-with-privacy-at-scale/applediffere
• John C. Duchi, Michael I. Jordan, and Martin J. Wainwright. Local privacy and statistical minimax rates. In Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS ’13, pages 429–438, Washington, DC, USA, 2013. IEEE Computer Society.
• Ilias Diakonikolas and Daniel M. Kane. A new approach for testing properties of discrete distributions. In Proceedings of the 57th Annual IEEE Symposium on Foundations of Computer Science, FOCS ’16, pages 685–694, Washington, DC, USA, 2016. IEEE Computer Society.
• [DKM+06] Cynthia Dwork, Krishnaram Kenthapadi, Frank McSherry, Ilya Mironov, and Moni Naor. Our data, ourselves: Privacy via distributed noise generation. In Proceedings of the 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques, EUROCRYPT ’06, pages 486–503, Berlin, Heidelberg, 2006. Springer.
• Ilias Diakonikolas, Daniel M. Kane, and Vladimir Nikishkin. Testing identity of structured distributions. In Proceedings of the 26th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’15, pages 1841–1854, Philadelphia, PA, USA, 2015. SIAM.
• Constantinos Daskalakis, Gautam Kamath, and John Wright. Which distribution distances are sublinearly testable? In Proceedings of the 29th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’18, pages 2747–2764, Philadelphia, PA, USA, 2018. SIAM.
• Aref N. Dajani, Amy D. Lauger, Phyllis E. Singer, Daniel Kifer, Jerome P. Reiter, Ashwin Machanavajjhala, Simson L. Garfinkel, Scot A. Dahl, Matthew Graham, Vishesh Karwa, Hang Kim, Philip Lelerc, Ian M. Schmutte, William N. Sexton, Lars Vilhuber, and John M. Abowd. The modernization of statistical disclosure limitation at the U.S. census bureau, 2017. Presented at the September 2017 meeting of the Census Scientific Advisory Committee.
• Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. In Proceedings of the 3rd Conference on Theory of Cryptography, TCC ’06, pages 265–284, Berlin, Heidelberg, 2006. Springer.
• [DMR18] Luc Devroye, Abbas Mehrabian, and Tommy Reddad. The total variation distance between high-dimensional Gaussians. arXiv preprint arXiv:1810.08693, 2018.
• Constantinos Daskalakis and Qinxuan Pan. Square Hellinger subadditivity for Bayesian networks and its applications to identity testing. In Proceedings of the 30th Annual Conference on Learning Theory, COLT ’17, pages 697–703, 2017.
• Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Foundations and Trends in Machine Learning, 9(3–4):211–407, 2014.
• Cynthia Dwork and Guy N. Rothblum. Concentrated differential privacy. arXiv preprint arXiv:1603.01887, 2016.
• John C. Duchi and Feng Ruan. The right complexity measure in locally private estimation: It is not the fisher information. arXiv preprint arXiv:1806.05756, 2018.
• Ulfar Erlingsson, Vasyl Pihur, and Aleksandra Korolova. RAPPOR: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the 2014 ACM Conference on Computer and Communications Security, CCS ’14, pages 1054–1067, New York, NY, USA, 2014. ACM.
• Oded Goldreich, Shafi Goldwasser, and Dana Ron. Property testing and its connection to learning and approximation. In Proceedings of the 37th Annual IEEE Symposium on Foundations of Computer Science, FOCS ’96, pages 339–348, Washington, DC, USA, 1996. IEEE Computer Society.
• Reza Gheissari, Eyal Lubetzky, and Yuval Peres. Concentration inequalities for polynomials of contracting Ising models. Electronic Communications in Probability, 23(76):1–12, 2018.
• Marco Gaboardi, Hyun-Woo Lim, Ryan M. Rogers, and Salil P. Vadhan. Differentially private chi-squared hypothesis testing: Goodness of fit and independence testing. In Proceedings of the 33rd International Conference on Machine Learning, ICML ’16, pages 1395–1403. JMLR, Inc., 2016.
• Oded Goldreich. The uniform distribution is complete with respect to testing identity to a fixed distribution. Electronic Colloquium on Computational Complexity (ECCC), 23(15), 2016.
• Oded Goldreich. Introduction to Property Testing. Cambridge University Press, 2017.
• Oded Goldreich and Dana Ron. On testing expansion in bounded-degree graphs. Electronic Colloquium on Computational Complexity (ECCC), 7(20), 2000.
• Marco Gaboardi and Ryan Rogers. Local private hypothesis testing: Chi-square tests. In Proceedings of the 35th International Conference on Machine Learning, ICML ’18, pages 1626–1635. JMLR, Inc., 2018.
• Marco Gaboardi, Ryan Rogers, and Or Sheffet. Locally private confidence intervals: Z-test and tight confidence intervals. In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, AISTATS ’19, pages 2545–2554. JMLR, Inc., 2019.
• Yuri Izmailovich Ingster. Minimax detection of a signal in lp metrics. Journal of Mathematical Sciences, 68(4):503–515, 1994.
• Yuri Izmailovich Ingster. Adaptive chi-square tests. Zapiski Nauchnykh Seminarov POMI, 244:150–166, 1997.
• Yuri Izmailovich Ingster and Irina A. Suslina. Nonparametric Goodness-of-fit Testing Under Gaussian Models, volume 169 of Lecture Notes in Statistics. Springer, 2003.
• [JKMW19] Matthew Joseph, Janardhan Kulkarni, Jieming Mao, and Zhiwei Steven Wu. Locally private Gaussian estimation. In Advances in Neural Information Processing Systems 32, NeurIPS ’19. Curran Associates, Inc., 2019.
• [Kam18] Gautam Kamath. Modern Challenges in Distribution Testing. PhD thesis, Massachusetts Institute of Technology, September 2018.
• Peter Kairouz, Keith Bonawitz, and Daniel Ramage. Discrete distribution estimation under local privacy. In Proceedings of the 33rd International Conference on Machine Learning, ICML ’16, pages 2436–2444. JMLR, Inc., 2016.
• [KLSU19] Gautam Kamath, Jerry Li, Vikrant Singhal, and Jonathan Ullman. Privately learning high-dimensional distributions. In Proceedings of the 32nd Annual Conference on Learning Theory, COLT ’19, pages 1853–1902, 2019.
• Shiva Prasad Kasiviswanathan, Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. Analyzing graphs with node differential privacy. In Proceedings of the 10th Conference on Theory of Cryptography, TCC ’13, pages 457–476, Berlin, Heidelberg, 2013. Springer.
• Daniel Kifer and Ryan M. Rogers. A new class of private chi-square tests. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, AISTATS ’17, pages 991–1000. JMLR, Inc., 2017.
• Kazuya Kakizaki, Jun Sakuma, and Kazuto Fukuchi. Differentially private chi-squared test by unit circle mechanism. In Proceedings of the 34th International Conference on Machine Learning, ICML ’17, pages 1761–1770. JMLR, Inc., 2017.
• Vishesh Karwa and Salil Vadhan. Finite sample differentially private confidence intervals. In Proceedings of the 9th Conference on Innovations in Theoretical Computer Science, ITCS ’18, pages 44:1–44:9, Dagstuhl, Germany, 2018. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik.
• [LRR13] Reut Levi, Dana Ron, and Ronitt Rubinfeld. Testing properties of collections of distributions. Theory of Computing, 9(8):295–347, 2013.
• Edward J. McShane. Extension of range of functions. Bulletin of the American Mathematical Society, 40(12):837–842, 1934.
• Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. Smooth sensitivity and sampling in private data analysis. In Proceedings of the 39th Annual ACM Symposium on the Theory of Computing, STOC ’07, pages 75–84, New York, NY, USA, 2007. ACM.
• Liam Paninski. A coincidence-based test for uniformity given very sparsely sampled discrete data. IEEE Transactions on Information Theory, 54(10):4750–4755, 2008.
• Ryan Rogers, Aaron Roth, Adam Smith, and Om Thakkar. Max-information, differential privacy, and post-selection hypothesis testing. In Proceedings of the 57th Annual IEEE Symposium on Foundations of Computer Science, FOCS ’16, pages 487–494, Washington, DC, USA, 2016. IEEE Computer Society.
• Sofya Raskhodnikova and Adam D. Smith. Lipschitz extensions for node-private graph statistics and the generalized exponential mechanism. In Proceedings of the 57th Annual IEEE Symposium on Foundations of Computer Science, FOCS ’16, pages 495–504, Washington, DC, USA, 2016. IEEE Computer Society.
• [Rub12] Ronitt Rubinfeld. Taming big probability distributions. XRDS, 19(1):24–28, 2012.
• Ronitt Rubinfeld and Ning Xie. Testing non-uniform k-wise independent distributions over product spaces. In Proceedings of the 37th International Colloquium on Automata, Languages, and Programming, ICALP ’10, pages 565–581, 2010.
• [SGHG+19] Marika Swanberg, Ira Globus-Harris, Iris Griffith, Anna Ritz, Adam Groce, and Andrew Bray. Improved differentially private analysis of variance. Proceedings on Privacy Enhancing Technologies, 2019(3), 2019.
• Or Sheffet. Locally private hypothesis testing. In Proceedings of the 35th International Conference on Machine Learning, ICML ’18, pages 4605–4614. JMLR, Inc., 2018.
• Adam Smith. Privacy-preserving statistical estimation with optimal convergence rates. In Proceedings of the 43rd Annual ACM Symposium on the Theory of Computing, STOC ’11, pages 813–822, New York, NY, USA, 2011. ACM.
• Thomas Steinke and Jonathan Ullman. Between pure and approximate differential privacy. The Journal of Privacy and Confidentiality, 7(2):3–22, 2017.
• Adam Sealfon and Jonathan Ullman. Efficiently estimating Erdős–Rényi graphs with node differential privacy. In Advances in Neural Information Processing Systems 32, NeurIPS ’19. Curran Associates, Inc., 2019.
• Caroline Uhler, Aleksandra Slavkovic, and Stephen E. Fienberg. Privacy-preserving data sharing for genome-wide association studies. The Journal of Privacy and Confidentiality, 5(1):137–166, 2013.
• Paul Valiant. Testing symmetric properties of distributions. SIAM Journal on Computing, 40(6):1927–1968, 2011.
• Duy Vu and Aleksandra Slavkovic. Differential privacy for clinical trial data: Preliminary evaluations. In 2009 IEEE International Conference on Data Mining Workshops, ICDMW ’09, pages 138–143. IEEE, 2009.
• Gregory Valiant and Paul Valiant. An automatic inequality prover and instance optimal identity testing. In Proceedings of the 55th Annual IEEE Symposium on Foundations of Computer Science, FOCS ’14, pages 51–60, Washington, DC, USA, 2014. IEEE Computer Society.
• [WHW+16] Shaowei Wang, Liusheng Huang, Pengzhan Wang, Yiwen Nie, Hongli Xu, Wei Yang, Xiang-Yang Li, and Chunming Qiao. Mutual information optimally local private discrete distribution estimation. arXiv preprint arXiv:1607.08025, 2016.
• [WKLK18] Yue Wang, Daniel Kifer, Jaewoo Lee, and Vishesh Karwa. Statistical approximating distributions under differential privacy. The Journal of Privacy and Confidentiality, 8(1):1–33, 2018.
• [WLF16] Yu-Xiang Wang, Jing Lei, and Stephen E. Fienberg. A minimax theory for adaptive data analysis. arXiv preprint arXiv:1602.04287, 2016.
• [WLK15] Yue Wang, Jaewoo Lee, and Daniel Kifer. Revisiting differentially private hypothesis tests for categorical data. arXiv preprint arXiv:1511.03376, 2015.
• Min Ye and Alexander Barg. Optimal schemes for discrete distribution estimation under locally differential privacy. IEEE Transactions on Information Theory, 64(8):5662–5676, 2018.
• If P = N(μ, I_{d×d}), then: (a) if X passes the first two checks at lines 5 and 11, then X = X′ with high probability, so T(X) = T(X′).