Question: Your colleague at work has been tasked with writing a query to sample 50 customers of the company. The total customer base is 1000. Erroneously, your colleague's query samples with replacement instead of without replacement, which means the same customer can be sampled twice, wasting investigation resources. Arguing that the probability of sampling the same user twice is 1/1000 × 1/1000 = 1/1,000,000, your colleague concludes that the event is so rare that the code does not need to change. Is that argument true or false? If true, explain in words why your colleague's probability calculation is correct. If false, provide a correct calculation and solution to convince your colleague that the code should be changed.

Hint: Either support your colleague's claim that the sampling code does not have to change because collision risk is low, or convince your colleague to change it because collision risk is moderate or high. Is the point about collision risk really:

1. the risk of one particular user ID being sampled exactly twice,

2. or any user ID being sampled exactly twice,

3. or one particular user ID being sampled at least twice,

4. or any user ID being sampled at least twice?
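One way to explore the hint numerically is sketched below. It assumes that interpretation 4 (any user ID sampled at least twice) is the one that matters for wasted investigation effort, and that each of the 50 draws is uniform and independent over the 1000 customers. The function names `prob_any_repeat` and `simulate_any_repeat` are hypothetical helpers introduced here for illustration; they are not part of the original question.

```python
import random


def prob_any_repeat(population: int, draws: int) -> float:
    """Exact probability that at least one ID appears more than once
    when `draws` IDs are drawn uniformly with replacement.

    Uses the complement: P(all draws distinct) is the product of
    (population - i) / population for i = 0 .. draws - 1.
    """
    p_all_distinct = 1.0
    for i in range(draws):
        p_all_distinct *= (population - i) / population
    return 1.0 - p_all_distinct


def simulate_any_repeat(population: int, draws: int,
                        trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability (assumed setup:
    independent uniform draws, fixed seed for reproducibility)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.randrange(population) for _ in range(draws)]
        # A repeat exists if the set of IDs is smaller than the sample.
        if len(set(sample)) < draws:
            hits += 1
    return hits / trials


exact = prob_any_repeat(1000, 50)
estimate = simulate_any_repeat(1000, 50)
print(f"exact: {exact:.4f}, simulated: {estimate:.4f}")
```

Under these assumptions the exact value comes out at roughly 0.71, so a repeated customer is more likely than not. The figure 1/1000 × 1/1000 is instead the chance that two specific draws both land on one particular customer, which corresponds to a much narrower event than the one the hint points toward.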
