Question: Reservoir Sampling
In this question, you are required to modify the Reservoir Sampling algorithm to sample K items uniformly at random from a stream of N items, so that each item appears in the sample with probability K/N.

1. Describe the algorithm for sampling K items uniformly at random from a stream of N items. The algorithm should work in a single pass over the data, reading the items one by one, without prior knowledge of the stream size N, and using O(K) memory (assume the size of an item is fixed).
   - Do not write code or pseudocode for this part; just explain the logic of the algorithm in simple English.
2. Suppose K=3 and N=5. Define Pr(X=i) as the probability that the i-th item appears in the sample. Calculate Pr(X=i) by filling in the numbers below, where the 1st number is the probability that item i appears or remains in the sample at the 1st round, the 2nd number is the corresponding probability at the 2nd round, and so on. If a number does not exist, replace it with "*".
3. Prove that for any K and N, your algorithm produces a uniform sample; that is, for every i with 1 ≤ i ≤ N, the i-th item appears in the sample with probability K/N.
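Part 1 asks for a prose description only, but a short sketch may help make the target behavior concrete. The Python below is one common way to meet the stated constraints (single pass, unknown N, O(K) memory), not necessarily the intended answer. The function name `reservoir_sample`, its parameters, and the optional `rng` argument are assumptions introduced here for illustration.

```python
import random
from collections import Counter
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return k items chosen uniformly at random from stream in one pass.

    If the stream holds fewer than k items, all of them are returned.
    (Hypothetical helper: name and signature are illustrative only.)
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for n, item in enumerate(stream, start=1):  # n = items read so far
        if n <= k:
            # The first k items fill the reservoir unconditionally.
            reservoir.append(item)
        else:
            # Draw a slot in [0, n); the new item is kept with probability k/n.
            j = rng.randrange(n)
            if j < k:
                # Each current resident is evicted with probability 1/n.
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Empirical check for K=3, N=5: each item should appear
    # in roughly 3/5 of the trials.
    rng = random.Random(0)
    trials = 100_000
    counts: Counter = Counter()
    for _ in range(trials):
        counts.update(reservoir_sample(range(1, 6), 3, rng))
    for i in range(1, 6):
        print(i, counts[i] / trials)
```

Only the reservoir list and a counter are stored, so memory stays O(K) regardless of how long the stream turns out to be.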
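For parts 2 and 3, a worked derivation may be useful. It is a sketch under one assumption: the algorithm is the usual replacement rule, in which the first K items fill the reservoir and the n-th item (n > K) is kept with probability K/n, replacing a uniformly chosen resident. The fill-in template from the original question is not preserved here, so the factors below are listed per arriving item and may not match the intended "round" layout exactly.

```latex
% Survival of a resident when item n > K arrives:
% item n is kept with probability K/n and evicts a given resident
% with probability 1/K, so the resident is evicted with probability 1/n.
\Pr(\text{resident survives arrival of item } n) = 1 - \frac{K}{n}\cdot\frac{1}{K} = \frac{n-1}{n}

% Case i \le K (item enters with probability 1):
\Pr(X = i) = \prod_{n=K+1}^{N} \frac{n-1}{n} = \frac{K}{N}

% Case i > K (item enters with probability K/i):
\Pr(X = i) = \frac{K}{i} \prod_{n=i+1}^{N} \frac{n-1}{n}
           = \frac{K}{i}\cdot\frac{i}{N} = \frac{K}{N}

% Instance K = 3, N = 5:
\Pr(X = 1) = \Pr(X = 2) = \Pr(X = 3) = 1 \cdot \frac{3}{4} \cdot \frac{4}{5} = \frac{3}{5}
\Pr(X = 4) = \frac{3}{4} \cdot \frac{4}{5} = \frac{3}{5}
\Pr(X = 5) = \frac{3}{5}
```

Both products telescope, which is why every item ends up with the same probability K/N regardless of when it arrived.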
