Question: 2. Approximating the Median in a Data Stream (8 points)
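Given a set $S \subset [n]$ of $m$ distinct values and a value $x$, we define
$\operatorname{rank}_S(x) = |\{y \in S : y \le x\}|$,
i.e., the number of values in $S$ that are less than or equal to $x$. We say $x$ is an $\epsilon$-approximate median if
$(1/2 - \epsilon)m \le \operatorname{rank}_S(x) \le (1/2 + \epsilon)m$.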
1. Consider the following algorithm for sampling an element from a stream $\langle x_1, x_2, \ldots, x_m \rangle$,
where you may assume throughout this question that all values in the stream are distinct:
(a) Initialize $s \leftarrow x_1$.
(b) For $i = 2, \ldots, m$: with probability $1/i$, update $s \leftarrow x_i$.
(c) Return $s$.
Prove that at the end of the stream, $s$ is equally likely to be any of the elements in the
stream, i.e., $s$ is chosen uniformly from the set of elements in the stream. Note that this
method doesn't need to know the value of $m$ in advance.
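The sampling procedure in steps (a) to (c) can be sketched in Python as follows. This is a minimal illustration rather than part of the original question; the names `sample_one` and `empirical_counts` and the `rng` parameter are assumptions introduced here. The update at position i fires with probability 1/i, so at i = 1 it always fires, which plays the role of the initialization step.

```python
import random
from collections import Counter


def sample_one(stream, rng=random):
    """Return one element of `stream`, chosen in a single pass.

    The i-th element (1-indexed) replaces the current choice with
    probability 1/i, so the length of the stream is never needed.
    """
    s = None
    for i, x in enumerate(stream, start=1):
        # rng.random() lies in [0, 1), so this always fires when i == 1.
        if rng.random() < 1.0 / i:
            s = x
    return s


def empirical_counts(stream, trials, rng=random):
    """Hypothetical helper: count how often each element is returned."""
    return Counter(sample_one(stream, rng) for _ in range(trials))
```

One way to see uniformity: $x_i$ is the final choice exactly when step $i$ fires and no later step does, which happens with probability $(1/i) \prod_{j=i+1}^{m} (1 - 1/j) = (1/i)(i/m) = 1/m$.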
2. Consider sampling $r$ elements uniformly and independently at random with replacement
from the stream, and let $Z_t$ be the random variable corresponding to the number
of samples that are less than or equal to $z_t$, where $z_t$ is the $t$-th smallest element in the stream.
Compute the expectation and variance of $Z_t$.
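As a sanity check on this part, note that each sample is at most $z_t$ with probability $t/m$, independently of the others, so $Z_t$ follows a Binomial$(r, t/m)$ distribution. The sketch below compares the resulting closed forms with a simulation. The function names and parameters are assumptions made for illustration, and the stream is held in a list only to keep the example short.

```python
import random


def z_t_moments(r, t, m):
    """Mean and variance of a Binomial(r, t/m) random variable."""
    p = t / m
    return r * p, r * p * (1 - p)


def simulate_z_t(stream, t, r, trials, rng):
    """Estimate the mean and variance of Z_t by repeated sampling.

    `stream` is a list of distinct values; z_t is its t-th smallest.
    """
    z_t = sorted(stream)[t - 1]
    draws = []
    for _trial in range(trials):
        # Count samples (drawn with replacement) that are <= z_t.
        count = sum(1 for _k in range(r) if rng.choice(stream) <= z_t)
        draws.append(count)
    mean = sum(draws) / trials
    var = sum((d - mean) ** 2 for d in draws) / (trials - 1)
    return mean, var
```

For example, `z_t_moments(100, 25, 100)` gives a mean of 25.0 and a variance of 18.75, and `simulate_z_t` should land close to the closed-form values for a moderate number of trials.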
3. Consider an algorithm that samples $r$ elements uniformly and independently at
random with replacement from the data stream and returns the median of the sampled
elements. How large must $r$ be such that the output of this algorithm is an $\epsilon$-approximate
median with probability at least [...]? You may assume that [...] and give your answer
in big-O notation. Hint: Consider the random variables $Z_{(1/2-\epsilon)m}$ and $Z_{(1/2+\epsilon)m}$.
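The sample-median algorithm from this part can be explored empirically with the sketch below. It does not derive the required $r$; it only measures how often the returned value satisfies the approximate-median definition for a given $r$. All function names, the `eps` parameter, and the choice of the lower median for even $r$ are assumptions introduced here, and the stream is kept in a list for simplicity.

```python
import random
import statistics


def rank(S, x):
    """Number of values in S that are less than or equal to x."""
    return sum(1 for y in S if y <= x)


def is_approx_median(S, x, eps):
    """Check (1/2 - eps) * m <= rank_S(x) <= (1/2 + eps) * m."""
    m = len(S)
    return (0.5 - eps) * m <= rank(S, x) <= (0.5 + eps) * m


def sample_median(stream, r, rng):
    """Draw r samples with replacement and return their (lower) median."""
    sample = [rng.choice(stream) for _ in range(r)]
    return statistics.median_low(sample)


def success_rate(stream, r, eps, trials, rng):
    """Fraction of trials in which the sample median is an eps-approximate median."""
    hits = sum(
        is_approx_median(stream, sample_median(stream, r, rng), eps)
        for _ in range(trials)
    )
    return hits / trials
```

Running `success_rate` for increasing values of `r` at a fixed `eps` gives a feel for how the failure probability shrinks, which is the quantity the hint's two random variables are meant to bound.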
4. Another way to achieve uniform sampling is, for each $i \in [m]$, to pick a
value $y_i$ uniformly at random from $[0, 1]$. Then the stream element $x_i$ where $i = \arg\min_j y_j$ is uniform
over the set $\{x_1, x_2, \ldots, x_m\}$. However, suppose at the end of the stream we are given a value
$s \in [m]$ and now need to return a random value in the set $\{x_s, x_{s+1}, \ldots, x_m\}$. It suffices
to return $x_i$ where $i = \arg\min_{s \le j \le m} y_j$. Describe an algorithm that uses $O(\log m)$ space in
expectation to output $\arg\min_{s \le j \le m} y_j$. The algorithm does not know $s$ while processing the
stream.
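One possible approach to this part, offered as a sketch and not as the intended solution, is to keep only the "suffix minima": the positions $j$ whose $y_j$ is smaller than every later $y_k$. Position $j$ survives with probability $1/(m - j + 1)$, so the expected number kept is the harmonic number $H_m = O(\log m)$. The class name `SuffixMinSampler` and its methods are hypothetical, and $y_i$ is assumed to be drawn from $[0, 1)$ via `random()`.

```python
import bisect
import random


class SuffixMinSampler:
    """Keep only positions whose y-value is below all later y-values."""

    def __init__(self, rng=None):
        self.rng = rng if rng is not None else random.Random()
        # Entries are (index, y, x); both index and y increase along the list.
        self.records = []
        self.count = 0

    def process(self, x):
        """Consume the next stream element."""
        self.count += 1
        y = self.rng.random()
        # Earlier entries with y >= the new y can never be a suffix minimum again.
        while self.records and self.records[-1][1] >= y:
            self.records.pop()
        self.records.append((self.count, y, x))

    def query(self, s):
        """Return (i, x_i) with i = argmin of y_j over s <= j <= m, for 1 <= s <= m."""
        indices = [rec[0] for rec in self.records]
        # The first kept index >= s has the smallest y among positions s..m.
        pos = bisect.bisect_left(indices, s)
        index, _y, x = self.records[pos]
        return index, x
```

The last record always has index $m$, so any query with $1 \le s \le m$ finds an entry. The query is correct because the true minimizer over positions $s, \ldots, m$ is itself a suffix minimum, and no kept position between $s$ and it could have a smaller $y$.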
