We have a file with a million pages (N = 1,000,000 pages), and we want to...
Question:
Transcribed Image Text:
1.) We have a file with a million pages (N = 1,000,000 pages), and we want to sort it using external merge sort. Assume the simplest algorithm, that is, no double buffering, no blocked I/O, and quicksort for in-memory sorting. Let B denote the number of buffers. How many passes are needed to sort the file with N = 1,000,000 pages with 6 buffers?

2.) Consider the following B+ tree.

[Figure: B+ tree. Only the key values survive in the transcription: 2, 10, 20, 30, 11, 12, 13, 31, 21, 23, 32.]

When answering the following question, be sure to follow the procedures described in class and in your textbook. You can make the following assumptions:
• A left pointer in an internal node guides towards keys < its corresponding key, while a right pointer guides towards keys ≥ it.
• A leaf node underflows when the number of keys goes below [(d-1)/2].
• An internal node (root node) underflows when the number of pointers goes below d/2.

How many pointers (parent-to-child and sibling-to-sibling) do you chase to find all keys between 9 and 19*?

3.) Answer the following questions for the hash table of Figure 2. Assume that a bucket split occurs whenever an overflow page is created. h0(x) takes the rightmost 2 bits of key x as the hash value, and h1(x) takes the rightmost 3 bits of key x as the hash value.

Level=0, N=4, Next=0

h1    h0    PRIMARY PAGES
000   00    64 44
001   01    9 25 5
010   10    10
011   11    31 15 7 3

Figure 2: Linear Hashing

What is the largest key less than 25 whose insertion will cause a split?

4.) Consider a sparse B+ tree of order d = 2 containing the keys 1 through 20 inclusive. How many nodes does the B+ tree have?

5.) Consider the schema R(a,b), S(b,c), T(b,d), U(b,e). Below is an SQL query on the schema:

SELECT R.a FROM R, S, WHERE R.b = S.b AND S.b = U.b AND U.e = 6

For the following SQL query, I have given two equivalent logical plans in relational algebra such that one is likely to be more efficient than the other:

I. π_a(σ_{c=3}(R ⋈_{b=b} S))
II. π_a(R ⋈_{b=b} σ_{c=3}(S))

Which plan is more efficient than the other?

6.) In the vectorized processing model, each operator that receives input from multiple children requires multi-threaded execution to generate the Next() output tuples from each child. True or False? Explain your reason.

7.) How can you optimize a Hash join algorithm?

8.) Consider the following SQL query that finds all applicants who want to major in CSE, live in Seattle, and go to a school ranked better than 10 (i.e., rank < 10).

Relations: Applicants (id, name, city, sid), Schools (sid, sname, srank), Major (id, major)

SELECT A.name FROM Applicants A, Schools S, Major M WHERE A.sid = S.sid AND A.id = M.id AND A.city='Seattle' AND S.rank < 10 AND M.major = 'CSE'

Assuming:
• Each school has a unique rank number (srank value) between 1 and 100.
• There are 20 different cities.
• Applicants.sid is a foreign key that references Schools.sid.
• Major.id is a foreign key that references Applicants.id.
• There is an unclustered, secondary B+ tree index on Major.id and all index pages are in memory.

Relation      Cardinality   Number of pages   Primary key
Applicants    2,000         100               id
Schools       100           10                sid
Major         3,000         200               (id, major)

You as an analyst devise the following query plan for the problem above:

[Figure: query plan tree. Labels recoverable from the transcription: (1) σ city='Seattle'; (2) σ srank < 10; join on sid = sid (sort-merge); (4) join on id = id (index nested loop) with Major (B+ tree index on id); (5) σ major = 'CSE'; (6) π name; "on-the-fly" annotations; file scans of Applicants and Schools.]

What is the cost of this query plan? Count only the number of page I/Os.

9.) Consider relations R(a, b) and S(a, c, d) to be joined on the common attribute a. Assume that there are no indexes available on the tables to speed up the join algorithms.
• There are B = 75 pages in the buffer.
• Table R spans M = 2,400 pages with 80 tuples per page.
• Table S spans N = 1,200 pages with 100 tuples per page.

Answer the following questions on computing the I/O costs for the joins. You can assume the simplest cost model, where pages are read and written one at a time. You can also assume that you will need one buffer block to hold the evolving output block and one input block to hold the current input block of the inner relation.

A.) Assume that the tables do not fit in main memory and that a high cardinality of distinct values hash to the same bucket using your hash function h1. What approach will work best to rectify this?
B.) What is the I/O cost of a block nested loop join with R as the outer relation and S as the inner relation?

10.) Given a full binary tree with 2n internal nodes, how many leaf nodes does it have?

11.) Consider the following cuckoo hashing schema. Both tables have a size of 4. The hashing function of the first table returns the fourth and third least significant bits: h1(x) = (x >> 2) & 0b11. The hashing function of the second table returns the least significant two bits: h2(x) = x & 0b11. When inserting, try table 1 first. When replacement is necessary, first select an element in the second table. The original entries in the tables are shown in the figure below.

[Figure: TABLE 1 contains 12; TABLE 2 contains 13.]

What sequence of operations will produce the tables above? Choose the appropriate option below:
a.) Insert 12, Insert 13
b.) Insert 13, Insert 12
c.) None of the above. You cannot have more than 1 hash table in Cuckoo hashing
d.) I don't know
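The arithmetic behind the external sort, block nested loop join, and cuckoo hashing questions can be checked with a short script. This is a minimal sketch, not a worked solution from the textbook: it assumes the usual cost model in which pass 0 of external merge sort uses all B buffers to build sorted runs and every later pass is a (B-1)-way merge, and in which a block nested loop join reads the outer relation in blocks of B-2 pages (one page reserved for output and one for the inner input, as the question states). The function names are hypothetical; only h1 and h2 come from the question text.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def external_sort_passes(n_pages: int, n_buffers: int) -> int:
    """Count passes: pass 0 builds ceil(N/B) runs, then (B-1)-way merges."""
    runs = ceil_div(n_pages, n_buffers)  # sorted runs after pass 0
    passes = 1
    while runs > 1:
        runs = ceil_div(runs, n_buffers - 1)  # one merge pass
        passes += 1
    return passes


def block_nested_loop_io(outer_pages: int, inner_pages: int, n_buffers: int) -> int:
    """Page I/Os: read the outer once, scan the inner once per outer block."""
    block_size = n_buffers - 2  # assumption: 1 output page + 1 inner input page
    outer_blocks = ceil_div(outer_pages, block_size)
    return outer_pages + outer_blocks * inner_pages


def h1(x: int) -> int:
    """Table 1 slot: fourth and third least significant bits."""
    return (x >> 2) & 0b11


def h2(x: int) -> int:
    """Table 2 slot: two least significant bits."""
    return x & 0b11


if __name__ == "__main__":
    print(external_sort_passes(1_000_000, 6))   # 9
    print(block_nested_loop_io(2400, 1200, 75))  # 42000
    print(h1(12), h2(12), h1(13), h2(13))        # 3 0 3 1
```

Under these assumptions, the sort takes 1 run-building pass plus 8 merge passes, and the join with R as the outer relation reads R once and scans S 33 times. Note that 12 and 13 collide in table 1 (both map to slot 3), which is what makes the insertion order matter in the cuckoo hashing question.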
Related Book For
Database management systems
ISBN: 978-0072465631
3rd edition
Authors: Raghu Ramakrishnan, Johannes Gehrke, Scott Selikoff