Question: One of the most important techniques in data analysis and machine learning is mean estimation. It is used as a subroutine in essentially every task.

In this question, we will explore how to incorporate differential privacy into mean estimation. En route, we will explore the Laplace mechanism, which is one of the fundamental tools in building differentially private algorithms.

Let $S = \{X_1, \ldots, X_n\}$ be i.i.d. samples from a Bernoulli distribution with unknown mean $p$. Recall, from HW5, that the sample mean

$$\hat{p}_n(S) = \frac{1}{n} \sum_{x \in S} x \tag{2}$$

satisfies $|\hat{p}_n - p| \le c\,n^{-1/2}$ with probability 0.99 for some constant $c$.

In order to incorporate privacy, the main idea is to add noise to the estimator in Equation (2). For the noise distribution, we will use the Laplace distribution, which has density given by

$$f(x \mid \mu, b) = \frac{1}{2b} \exp\left(-\frac{|x - \mu|}{b}\right).$$

We will denote this distribution as $\mathrm{Lap}(\mu, b)$. The mean of the distribution is $\mu$ and the variance is $2b^2$. The differentially private estimator is given by

$$\hat{p}_{\epsilon,n}(S) = \hat{p}_n(S) + Y,$$

where $Y$ is sampled from $\mathrm{Lap}\left(0, \frac{1}{\epsilon n}\right)$. Here $\epsilon$ is a parameter that will control the privacy.

(a) (1 point) Let $S_1$ and $S_2$ be two data sets with $n$ binary samples ($\{0,1\}$-valued) each. Additionally, assume that $S_1$ and $S_2$ differ in only one item. More precisely, we can construct $S_2$ by removing one element from $S_1$ and adding another binary value (0 or 1). Show that the sample means for the two sets are close. Specifically, show:

$$|\hat{p}_n(S_1) - \hat{p}_n(S_2)| \le \frac{1}{n}. \tag{3}$$

This is referred to as $\hat{p}_n$ having sensitivity $n^{-1}$.

(b) (1 point) For any fixed $S$, explain why $\hat{p}_{\epsilon,n}(S)$ is distributed according to a Laplace distribution. What are the corresponding parameters?

(c) (2 points) First, we will show that the above estimator is still fairly accurate. Show that with probability 0.99 (over the sampling of the noise), for every $S$, we have

$$|\hat{p}_n(S) - \hat{p}_{\epsilon,n}(S)| \le \frac{20}{\epsilon n}.$$

You may find it especially useful to apply a concentration inequality we learned about in class.
(d) In this part, we will see that the mechanism is $\epsilon$-differentially private. Let us recall the definition of differential privacy in this context. An estimator $g$ is $\epsilon$-differentially private if for all sets $A \subset \mathbb{R}$, we have $\Pr[g(S_1) \in A]$
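
For readers who want to experiment with the estimator described in the question, here is a minimal Python sketch of the Laplace mechanism applied to the sample mean. It is an illustration under stated assumptions, not a solution to any part: the noise scale $1/(\epsilon n)$ is read from the question's definition of the private estimator, the function names (`sample_mean`, `laplace_noise`, `private_mean`) and the values of `n`, `p`, and `epsilon` are hypothetical, and drawing Laplace noise as the difference of two exponential draws is one standard choice among several.

```python
import random

def sample_mean(samples):
    """Non-private estimator: the average of the binary samples."""
    return sum(samples) / len(samples)

def laplace_noise(scale, rng):
    """Draw from Lap(0, scale) as the difference of two i.i.d.
    exponential variables, each with mean `scale`."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def private_mean(samples, epsilon, rng):
    """Sample mean plus Laplace noise of scale 1/(epsilon * n).
    The scale is an assumption taken from the question's setup."""
    n = len(samples)
    return sample_mean(samples) + laplace_noise(1 / (epsilon * n), rng)

# Hypothetical parameters chosen only for illustration.
rng = random.Random(0)
n, p, epsilon = 1000, 0.3, 0.5
data = [1 if rng.random() < p else 0 for _ in range(n)]

# A neighboring data set: replace one binary sample with the other value.
neighbor = data.copy()
neighbor[0] = 1 - neighbor[0]
gap = abs(sample_mean(data) - sample_mean(neighbor))

# Empirical fraction of runs in which the added noise stays within 20/(epsilon * n).
trials = 10_000
bound = 20 / (epsilon * n)
within = sum(
    abs(private_mean(data, epsilon, rng) - sample_mean(data)) <= bound
    for _ in range(trials)
) / trials

print(f"gap between neighboring means: {gap:.6f} (1/n = {1 / n:.6f})")
print(f"fraction of runs within the bound: {within:.4f}")
```

Changing `epsilon` shows the trade-off the question is building toward: a smaller value means a larger noise scale, so the private estimate drifts further from the sample mean.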
