Question: Suppose I have a relation Grades(student_id, assignment_id, score) with 150 students and 20 assignments. I grade all submissions of one assignment in submission order and then insert the records. Because of this insertion order, the relation is sorted on assignment_id but not on student_id. I chose a heap file as my file organization. My pages are quite small: one page holds only 20 records, or 200 bytes. The SearchKeySize is 4 bytes and the PointerSize is 2 bytes. My buffer is also small: 5 pages.
My most frequent query finds the grades of an individual student (e.g., select assignment_id, score from grades where student_id=3347;), and I want to reduce its I/O cost. I am debating whether to build an index on student_id or to sort the relation on student_id, so I need to do some estimation. Please help me by answering the following questions.
What is the I/O cost of a multi-way merge sort (a.k.a. external sort) if I sort the relation after I have entered all records? Explain the process.
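One way to set up the estimate is sketched below. It assumes the usual textbook cost model, which the question does not state explicitly: pass 0 reads the file in chunks of B buffer pages and writes out sorted runs, each later pass merges B-1 runs at a time (one buffer page is reserved for output), and every pass reads and writes every page once, including the final write of the sorted file. The function name and parameters are made up for illustration.

```python
import math

def external_sort_cost(num_records, records_per_page, buffer_pages):
    """Estimate pages, passes and page I/Os for an external merge sort.

    Assumed model: every pass reads and writes each page once.
    """
    # Number of data pages in the heap file.
    pages = math.ceil(num_records / records_per_page)

    # Pass 0: fill all buffer pages, sort in memory, write one run per fill.
    runs = math.ceil(pages / buffer_pages)
    passes = 1

    # Merge passes: B-1 input buffers, 1 output buffer.
    fan_in = buffer_pages - 1
    while runs > 1:
        runs = math.ceil(runs / fan_in)
        passes += 1

    # One read plus one write per page per pass.
    total_io = 2 * pages * passes
    return pages, passes, total_io

# 150 students x 20 assignments = 3000 records, 20 records per page, 5 buffer pages.
pages, passes, total_io = external_sort_cost(150 * 20, 20, 5)
print(pages, passes, total_io)
```

With the numbers from the question, the file occupies 150 pages. Pass 0 produces 30 runs of 5 pages each, and the 4-way merge passes reduce these to 8, then 2, then 1 run, for 4 passes in total and about 2 x 150 x 4 = 1200 page I/Os under this model. If the final write of the sorted file is not counted, the estimate drops by 150.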
