QUESTION B1. (5 marks)
(a) (1 mark) What are the main differences between supervised learning and unsupervised learning?
(b) (2 marks) Taking a real-world business problem, explain the main steps for applying machine learning to solve that problem.
(c) (1 mark) The training set is given as $(x_1, y_1), \ldots, (x_n, y_n)$. Each $y_i$ is continuous, $1 \le i \le n$. Here we use the k-Nearest Neighbor model to make a prediction $\hat{y}_q$ for a new instance $x_q$. Suppose the k neighbors' labels are $y_1, \ldots, y_k$; write the equation of the predicted label of $x_q$.
(d) (1 mark) What is the main advantage of using low values of k compared to high values in lazy learning?

QUESTION B2. (5 marks)
(a) (2 marks) How many combinations do we have for bivariate analysis, considering the different variable categories? For each combination, describe an analysis approach.
(b) (1 mark) Suppose we obtain a dataset from an online website. How can we get an estimate of the accuracy of a learned model? Draw a flow chart to help explain the process.
(c) (1 mark) When splitting an entire dataset into training and test sets, we may want to ensure that class proportions are maintained in each selected set. How can we achieve that goal?
(d) (1 mark) Describe the k-fold cross-validation algorithm for model selection.
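
For orientation on B1(c) and B2(c)-(d), here is a minimal Python sketch, not a model answer. It assumes the unweighted form of k-Nearest Neighbor regression, where the prediction is the mean of the neighbors' labels, $\hat{y}_q = \frac{1}{k}\sum_{i=1}^{k} y_i$, and it assumes Euclidean distance. The function names `knn_regress` and `stratified_folds` and the round-robin fold assignment are illustrative choices, not taken from the question.

```python
import math
from collections import defaultdict


def knn_regress(train, x_q, k):
    """Predict a continuous label for x_q as the mean label of its k nearest neighbours.

    train is a list of (x_i, y_i) pairs, where x_i is a tuple of floats.
    Euclidean distance is an assumption; the question does not name a metric.
    """
    # Sort training pairs by distance to the query and keep the k closest.
    nearest = sorted(train, key=lambda pair: math.dist(pair[0], x_q))[:k]
    # Unweighted mean of the neighbours' labels (divides by the number found,
    # which equals k whenever the training set has at least k points).
    return sum(y for _, y in nearest) / len(nearest)


def stratified_folds(labels, n_folds):
    """Assign indices to folds so each fold keeps roughly the same class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(n_folds)]
    for indices in by_class.values():
        # Deal each class's indices round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds


if __name__ == "__main__":
    train = [((0.0,), 1.0), ((1.0,), 2.0), ((2.0,), 3.0), ((10.0,), 100.0)]
    print(knn_regress(train, (1.0,), k=3))
    print(stratified_folds(["a", "a", "a", "a", "b", "b"], n_folds=2))
```

In k-fold cross-validation, each fold serves once as the validation set while the remaining folds form the training set, and the k validation scores are averaged for each candidate model. Building the folds per class, as above, is one way to keep class proportions stable across the splits.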
