In this problem, we will combine ideas from Count-Min Sketch for finding heavy hitters with the...
Question:
In this problem, we will combine ideas from Count-Min Sketch for finding heavy hitters with the Alon-Matias-Szegedy algorithm for estimating the second frequency moment of a stream. This will allow us to estimate heavy hitters of a stream with a tighter guarantee in certain cases.

Recall that in Count-Min Sketch, we maintained \(d\) hash functions \(h_1, \ldots, h_d\), corresponding to \(d\) hash tables, each of size \(w\). For the datum that appears at time \(t\), \((i_t, c_t)\), where \(i_t\) is the identifier and \(c_t\) is a count, for each \(j \in [d]\) we increment a counter \(C_j\) in entry \(h_j(i_t)\) of the \(j\)th hash table by \(c_t\). At the end of the stream, for a given identifier \(i\), we can return \(\hat{f}_i = \min_{j \in [d]} C_j(h_j(i))\) to get an estimate of \(f_i = \sum_{t : i_t = i} c_t\). In particular, setting \(w = O(1/\varepsilon)\) and \(d = O(\log(1/\delta))\), with probability at least \(1 - \delta\), this will give an estimate with \(|\hat{f}_i - f_i| \leq \varepsilon F_1\), where \(F_1 = \sum_i f_i\) (we assume that \(f_i \geq 0\) for all \(i\)).

Consider making the following changes to the algorithm. Instead of storing just \(d\) hash functions, we store \(2d\) hash functions. The second set of hash functions, \(g_1, \ldots, g_d\), maps to the range \(\{\pm 1\}\). The modification to counter \(C_j\) at time \(t\) is still at entry \(h_j(i_t)\), but now we increment it by \(g_j(i_t) c_t\). Finally, our estimate \(\hat{f}_i\) is now \(\operatorname{median}_{j \in [d]} g_j(i) C_j(h_j(i))\). We will obtain a guarantee which is in terms of \(\sqrt{F_2}\), where \(F_2 = \sum_i f_i^2\). Let \(\hat{f}_{ij} = g_j(i) C_j(h_j(i))\).

(a) For some given \(i\) and \(j\), compute \(E[\hat{f}_{ij}]\).
(b) For some given \(i\) and \(j\), upper bound \(\operatorname{Var}[\hat{f}_{ij}]\).
(c) Given these two quantities, choose values of \(d\) and \(w\), upper-bounding the probability that \(|\hat{f}_{ij} - f_i| \geq \varepsilon \sqrt{F_2}\) by a constant, and (in turn) upper-bounding the probability that \(|\hat{f}_i - f_i| \geq \varepsilon \sqrt{F_2}\) by \(\delta\).
(d) Compare this type of guarantee with that of Count-Min Sketch. When is each guarantee better? Give a set of frequencies (i.e., a set of \(f_i\)'s) illustrating where one might be better than the other.
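The modified update and query rules in the problem statement can be sketched in Python as follows. This is only an illustration, not part of the original problem: the class name `CountSketch`, the restriction to integer identifiers, the `seed` parameter, and the linear hash family modulo a Mersenne prime are all assumptions made for the example. The analysis asked for in parts (a) to (c) relies on suitably independent hash functions, which this simple family only approximates.

```python
import random
from statistics import median


class CountSketch:
    """Count-Min style tables with an extra +/-1 sign hash per row (illustrative)."""

    P = (1 << 61) - 1  # Mersenne prime; assumed larger than any identifier

    def __init__(self, d, w, seed=0):
        rng = random.Random(seed)
        self.d, self.w = d, w
        # Hypothetical hash family: x -> (a*x + b) mod P, one (a, b) pair per row.
        self.h = [(rng.randrange(1, self.P), rng.randrange(self.P)) for _ in range(d)]
        self.g = [(rng.randrange(1, self.P), rng.randrange(self.P)) for _ in range(d)]
        self.C = [[0] * w for _ in range(d)]  # d tables of w counters each

    def _bucket(self, j, i):
        """h_j(i): bucket index in row j."""
        a, b = self.h[j]
        return ((a * i + b) % self.P) % self.w

    def _sign(self, j, i):
        """g_j(i): a value in {+1, -1}."""
        a, b = self.g[j]
        return 1 if ((a * i + b) % self.P) & 1 else -1

    def update(self, i, c=1):
        """Process datum (i, c): add g_j(i) * c to entry h_j(i) of every row j."""
        for j in range(self.d):
            self.C[j][self._bucket(j, i)] += self._sign(j, i) * c

    def estimate(self, i):
        """Return the median over rows j of g_j(i) * C_j(h_j(i))."""
        return median(
            self._sign(j, i) * self.C[j][self._bucket(j, i)] for j in range(self.d)
        )
```

A stream is fed in with calls such as `sk = CountSketch(d=5, w=16)` followed by `sk.update(i, c)` for each datum, and `sk.estimate(i)` is queried at the end. An odd `d` keeps the median equal to one of the per-row estimates rather than an average of two.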