Question:
Suppose we have a stream of tuples with the schema
Grades(university, courseID, studentID, grade)
Assume universities are unique, but a courseID is unique only within a university (i.e., different universities may have different courses with the same ID, e.g., “CS101”). Likewise, studentIDs are unique only within a university (different universities may assign the same ID to different students). Suppose we want to answer certain queries approximately from a 1/20th sample of the data. For each of the queries below, indicate how you would construct the sample. That is, tell what the key attributes should be.
(a) For each university, estimate the average number of students in a course.
(b) Estimate the fraction of students who have a GPA of 3.5 or more.
(c) Estimate the fraction of courses where at least half the students got “A.”
Step by Step Solution
To construct samples for the given queries, considering that we're working with a stream of data and n...
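Since the answer text above is cut off, here is a minimal sketch of how key-based sampling of the stream could look. It is an illustration under assumptions, not the original solution: the names `in_sample`, `sample_stream`, `KEYS`, and `FIELDS` are hypothetical, SHA-256 is just one convenient stable hash, and the key choices follow the usual argument that all tuples belonging to the entity being counted must be kept or dropped together. Under that argument, (a) and (c) would use (university, courseID) as the key, and (b) would use (university, studentID).

```python
import hashlib

# Schema of the stream, as given in the question.
FIELDS = ("university", "courseID", "studentID", "grade")

# Assumed key attributes per query (an assumption of this sketch):
# a course or a student is only identified together with its university.
KEYS = {
    "a": ("university", "courseID"),
    "b": ("university", "studentID"),
    "c": ("university", "courseID"),
}


def in_sample(key, buckets=20):
    """Return True if the key hashes to bucket 0 out of `buckets` buckets.

    A stable hash is used so the same key always gets the same decision,
    which keeps every tuple of a sampled course (or student) in the sample.
    """
    data = "\x1f".join(key).encode("utf-8")
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "big") % buckets == 0


def sample_stream(stream, query):
    """Yield the tuples whose key attributes fall in the 1/20 sample."""
    positions = [FIELDS.index(name) for name in KEYS[query]]
    for tup in stream:
        key = tuple(str(tup[i]) for i in positions)
        if in_sample(key):
            yield tup
```

The point of hashing the key, rather than picking each tuple at random with probability 1/20, is that a sampled course keeps its full enrollment and a sampled student keeps their full set of grades, so per-course counts and per-student GPAs computed from the sample are not distorted.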
