Question: (a) What issues are to be considered while selecting a model for applying machine learning to a given problem? (4 Marks)

(b)(i) Clearly differentiate between feature selection and feature extraction.
(ii) Describe the forward selection algorithm for implementing the subset selection procedure for dimensionality reduction and specify any requirement(s) and/or limitation(s).
(iii) Given the data in the following table, use PCA to reduce the dimension from 2 to 1:

Feature   Example 1   Example 2   Example 3   Example 4
x1        4           8           13          7
x2        11          4           5           14

(4+6+11 Marks)
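
As a hedged sketch of how part (b)(iii) could be worked, the block below applies the usual PCA steps to the table: centre the data, form the covariance matrix, take the eigenvector of the largest eigenvalue, and project. The divisor n - 1 for the covariance and the closed-form 2x2 eigen-solution are assumptions of this sketch, not something the question prescribes, and all variable names are illustrative.

```python
import math

# Data from the table: one (x1, x2) pair per example.
x1 = [4, 8, 13, 7]
x2 = [11, 4, 5, 14]
n = len(x1)

# Step 1: centre each feature on its mean.
m1, m2 = sum(x1) / n, sum(x2) / n
d1 = [v - m1 for v in x1]
d2 = [v - m2 for v in x2]

# Step 2: sample covariance matrix [[s11, s12], [s12, s22]].
# Assumption: divisor n - 1 (some texts divide by n instead).
s11 = sum(a * a for a in d1) / (n - 1)
s22 = sum(b * b for b in d2) / (n - 1)
s12 = sum(a * b for a, b in zip(d1, d2)) / (n - 1)

# Step 3: largest eigenvalue of the symmetric 2x2 matrix (closed form).
trace, det = s11 + s22, s11 * s22 - s12 * s12
lam1 = (trace + math.sqrt(trace * trace - 4 * det)) / 2

# Step 4: unit eigenvector for lam1, from (S - lam1 * I) e = 0.
# This form is valid here because s12 is non-zero for this data.
ex, ey = s12, lam1 - s11
norm = math.hypot(ex, ey)
e1 = (ex / norm, ey / norm)

# Step 5: project the centred data onto e1 to get the 1-D representation.
projected = [a * e1[0] + b * e1[1] for a, b in zip(d1, d2)]

print("covariance:", [[s11, s12], [s12, s22]])
print("largest eigenvalue:", round(lam1, 4))
print("first principal component:", [round(c, 4) for c in e1])
print("1-D coordinates:", [round(p, 4) for p in projected])
```

The sign of an eigenvector is arbitrary, so the projected coordinates may appear with all signs flipped relative to a hand-worked answer; the magnitudes should agree.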
Question 2: (25 Marks)
(a) Consider the problem of finding a rule for determining days on which one can enjoy water sport. The rule is to depend on a few attributes like "temp", "humidity", etc. Suppose we have the following data to help us devise the rule. In the data, a value of "1" for "enjoy" means "yes" and a value of "0" indicates "no".
Table 1: Attributes Data.
Example   Sky     Temp   Humidity   Wind     Water   Forecast   Enjoy
1         Sunny   Warm   Normal     Strong   Warm    Same       1
2         Sunny   Warm   High       Strong   Warm    Same       1
3         Rainy   Cold   High       Strong   Warm    Change     0
4         Sunny   Warm   High       Strong   Cool    Change     1
Find the hypothesis space and the version space for the problem.
(6 Marks)
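
One way to check a hand-derived version space for part (a) is brute-force enumeration. The sketch below assumes hypotheses are conjunctions of attribute constraints, where each constraint is either a specific value or "?" (any value), and it assumes each attribute's domain is limited to the values that appear in Table 1. Both assumptions, and the helper names `matches` and `more_general`, are illustrative; a textbook treatment may use larger domains, which changes the size of the hypothesis space but not the version space found here.

```python
from itertools import product

# Training examples from Table 1: (Sky, Temp, Humidity, Wind, Water, Forecast), Enjoy.
examples = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), 1),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), 1),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), 0),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), 1),
]

def matches(h, x):
    """True if hypothesis h classifies instance x as positive."""
    return all(hc == "?" or hc == xc for hc, xc in zip(h, x))

def more_general(g, s):
    """True if g is more general than or equal to s."""
    return all(gc == "?" or gc == sc for gc, sc in zip(g, s))

# Assumption: each attribute's domain is only the values seen in the table.
domains = [sorted({x[i] for x, _ in examples}) for i in range(6)]

# Hypothesis space under these assumptions: every combination of value-or-"?".
hypotheses = list(product(*[d + ["?"] for d in domains]))

# Version space: hypotheses consistent with every training example.
version_space = [
    h for h in hypotheses
    if all(matches(h, x) == bool(y) for x, y in examples)
]

# Boundary sets: S has no strictly more specific member, G no strictly more general one.
S = [h for h in version_space
     if not any(o != h and more_general(h, o) for o in version_space)]
G = [h for h in version_space
     if not any(o != h and more_general(o, h) for o in version_space)]

print("hypotheses enumerated:", len(hypotheses))
for h in version_space:
    print(h)
print("S =", S)
print("G =", G)
```

Enumeration is only practical for small attribute sets like this one; the candidate-elimination algorithm reaches the same S and G boundaries without listing every hypothesis.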
(b) Explain cross-validation in machine learning, and describe the different types of cross-validation along with their limitations, if any.
(c) Given the following data in Table 2, construct the ROC curve of the data. Compute the AUC.
Table 2: Data.
Threshold   TP   TN   FP   FN
1           0    25   0    29
2           7    25   0    22
3           18   24   1    11
4           26   20   5    3
5           29   11   14   0
6           29   0    25   0
7           29   0    25   0

(6+3 Marks)
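
For part (c), a minimal sketch of the computation: each threshold gives one ROC point with FPR = FP / (FP + TN) on the x-axis and TPR = TP / (TP + FN) on the y-axis, and the AUC is estimated here with the trapezoidal rule. The trapezoidal rule is an assumption of this sketch; the question does not say which integration method is expected.

```python
# Rows of Table 2 as (TP, TN, FP, FN), one per threshold.
rows = [
    (0, 25, 0, 29),
    (7, 25, 0, 22),
    (18, 24, 1, 11),
    (26, 20, 5, 3),
    (29, 11, 14, 0),
    (29, 0, 25, 0),
    (29, 0, 25, 0),
]

# ROC points: x = FPR = FP / (FP + TN), y = TPR = TP / (TP + FN).
# The set removes the repeated (1, 1) point; sorting orders points by increasing FPR.
points = sorted({(fp / (fp + tn), tp / (tp + fn)) for tp, tn, fp, fn in rows})

# AUC by the trapezoidal rule between consecutive ROC points.
auc = sum(
    (x1 - x0) * (y0 + y1) / 2
    for (x0, y0), (x1, y1) in zip(points, points[1:])
)

for fpr, tpr in points:
    print(f"FPR = {fpr:.4f}, TPR = {tpr:.4f}")
print(f"AUC = {auc:.4f}")
```

Plotting the printed points in order and joining them with straight lines gives the ROC curve; with this data the trapezoidal AUC comes to 0.92.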
Question 3: (25 Marks)
(a) Write down the naive Bayes algorithm.
(b) How do we use numeric features with naive Bayes?
(6 Marks)
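
One common answer to part (b) is to model each numeric feature, within each class, as normally distributed and to use the normal density in place of a frequency count (discretising the feature into bins is another option). The sketch below illustrates the first approach only. The feature values and class labels are hypothetical and do not come from the question paper, and `gaussian_pdf` and `likelihood` are illustrative helper names.

```python
import math
from statistics import mean, pvariance

def gaussian_pdf(x, mu, var):
    """Normal density with mean mu and variance var, evaluated at x."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Hypothetical training values of one numeric feature, grouped by class label.
temps = {
    "yes": [21.0, 24.0, 23.0, 26.0],
    "no": [12.0, 15.0, 11.0, 14.0],
}

# Per-class mean and variance; pvariance divides by n, i.e. the ML estimate.
params = {label: (mean(v), pvariance(v)) for label, v in temps.items()}

def likelihood(x, label):
    """Class-conditional density p(x | label) used in the naive Bayes product."""
    mu, var = params[label]
    return gaussian_pdf(x, mu, var)

print(likelihood(22.0, "yes"), likelihood(22.0, "no"))
```

In a full classifier, this density would be multiplied by the class prior and by the likelihoods of the other features, exactly as a frequency-based probability would be for a categorical feature.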
(c) Find the ML estimate for the parameters representing the mean and that representing the variance in the normal probability function.
(7 Marks)
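
A sketch of the standard derivation for part (c), assuming an independent sample x_1, ..., x_n from a normal distribution with mean \mu and variance \sigma^2 (the sample notation is an assumption; the question does not fix symbols):

```latex
\begin{align*}
\ell(\mu, \sigma^2)
  &= \sum_{i=1}^{n} \log\!\left[\frac{1}{\sqrt{2\pi\sigma^2}}
     \exp\!\left(-\frac{(x_i-\mu)^2}{2\sigma^2}\right)\right]
   = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i-\mu)^2 \\[4pt]
\frac{\partial \ell}{\partial \mu}
  &= \frac{1}{\sigma^2}\sum_{i=1}^{n}(x_i-\mu) = 0
  \;\Longrightarrow\; \hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i \\[4pt]
\frac{\partial \ell}{\partial \sigma^2}
  &= -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{i=1}^{n}(x_i-\mu)^2 = 0
  \;\Longrightarrow\; \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\hat{\mu})^2
\end{align*}
```

Note that the ML estimate of the variance divides by n, not n - 1, so it is a biased estimator.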
(d) Let S be a set of examples, A a feature having c different values and let the set of values of A be denoted by Values(A). Define Gain(S,A), Split Information (S,A) and Gain Ratio (S,A).
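
For reference, the definitions asked for in part (d) are usually written as follows, in the form used for decision-tree learning. Here S_v denotes the subset of S for which A takes the value v, and p_k the proportion of examples in S belonging to class k; both symbols are notation introduced for this sketch.

```latex
\begin{align*}
\mathrm{Entropy}(S) &= -\sum_{k} p_k \log_2 p_k \\[4pt]
\mathrm{Gain}(S, A) &= \mathrm{Entropy}(S)
  - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\,\mathrm{Entropy}(S_v) \\[4pt]
\mathrm{SplitInformation}(S, A) &= -\sum_{v \in \mathrm{Values}(A)}
  \frac{|S_v|}{|S|} \log_2 \frac{|S_v|}{|S|} \\[4pt]
\mathrm{GainRatio}(S, A) &= \frac{\mathrm{Gain}(S, A)}{\mathrm{SplitInformation}(S, A)}
\end{align*}
```

Since A has c different values, each sum over Values(A) has c terms.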