Question:

QUESTION B1. (5 marks)
(a) (1 mark) What are the main differences between supervised learning and unsupervised learning?
(b) (2 marks) Taking a real-world business problem, explain the main steps for applying machine learning to solve that problem.
(c) (1 mark) The training set is given as $(x_1, y_1), \dots, (x_n, y_n)$, where each $y_i$ is continuous, $1 \le i \le n$. We use the k-Nearest Neighbor model to make a prediction $\hat{y}_q$ for a new instance $x_q$. Suppose the labels of the k nearest neighbors are $y_1, \dots, y_k$. Write the equation for the predicted label of $x_q$.
(d) (1 mark) What is the main advantage of using low values of k compared to high values in lazy learning?

QUESTION B2. (5 marks)
(a) (2 marks) How many combinations are there for bivariate analysis, considering the different variable categories? For each combination, describe an analysis approach.
(b) (1 mark) Suppose we obtain a dataset from an online website. How can we estimate the accuracy of a learned model? Draw a flow chart to help explain the process.
(c) (1 mark) When splitting an entire dataset into training and test sets, we may want to ensure that class proportions are maintained in each selected set. How can we achieve that goal?
(d) (1 mark) Describe the k-fold cross-validation algorithm for model selection.
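For part (c) of B1, a common choice for k-Nearest Neighbor regression is to predict the unweighted mean of the k nearest neighbors' labels, i.e. (1/k) times the sum of y_1 through y_k. The question does not name a distance measure or a weighting scheme, so the Euclidean distance, the plain average, the function name `knn_regress` and the toy data below are all assumptions made for illustration. A minimal sketch:

```python
import math


def knn_regress(train, x_q, k):
    """Predict a continuous label for x_q as the mean label of its k nearest neighbors.

    train is a list of (features, label) pairs; features and x_q are
    equal-length tuples of numbers. Euclidean distance is an assumption.
    """
    # Sort the training pairs by distance to the query point.
    by_distance = sorted(train, key=lambda pair: math.dist(pair[0], x_q))
    # Keep the labels of the k closest training instances.
    nearest_labels = [label for _, label in by_distance[:k]]
    # Unweighted average of those labels is the prediction.
    return sum(nearest_labels) / len(nearest_labels)


# Hypothetical one-dimensional training set, including one far-away outlier.
train = [((0.0,), 1.0), ((1.0,), 2.0), ((2.0,), 3.0), ((10.0,), 50.0)]
prediction = knn_regress(train, (1.2,), k=3)
```

With k=3 the outlier at 10.0 is not among the neighbors, so it has no influence on the prediction. A distance-weighted average is a frequent variant, but it is not what this sketch computes.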
Step by Step Solution
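Parts (c) and (d) of B2 concern stratified splitting and k-fold cross-validation. The sketch below is one possible way to combine the two, not a reference answer. Each class's indices are shuffled and dealt round-robin into k folds, so every fold keeps roughly the original class proportions. Each fold is then held out once while the model is trained on the rest. The helper names (`stratified_k_folds`, `cross_validate`, `fit_majority`, `accuracy`) and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def stratified_k_folds(labels, k, seed=0):
    """Assign every index to one of k folds, keeping class proportions roughly equal."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin across the folds.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def cross_validate(features, labels, k, fit, score):
    """Return the mean held-out score over k stratified folds."""
    scores = []
    for held_out in stratified_k_folds(labels, k):
        held = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held]
        # Train on the other k-1 folds, then evaluate on the held-out fold.
        model = fit([features[i] for i in train_idx], [labels[i] for i in train_idx])
        scores.append(score(model,
                            [features[i] for i in held_out],
                            [labels[i] for i in held_out]))
    return sum(scores) / k


def fit_majority(xs, ys):
    """Toy 'model': always predict the most frequent training label."""
    return max(set(ys), key=ys.count)


def accuracy(model, xs, ys):
    """Fraction of held-out labels equal to the constant prediction."""
    return sum(1 for y in ys if y == model) / len(ys)


labels = ["a"] * 8 + ["b"] * 4
features = list(range(12))
folds = stratified_k_folds(labels, k=4)
mean_accuracy = cross_validate(features, labels, 4, fit_majority, accuracy)
```

For model selection, the same `cross_validate` call would be repeated for each candidate model or hyperparameter setting, and the candidate with the best mean score chosen. That candidate is then usually retrained on all of the training data.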
