Exam Problems F11, Problem 3 UG - Choosing attributes in DT [35]{6/7}
You are an underwriting data analyst at a life insurance company. You are asked to build a model to predict risk levels of life insurance applicants based on a combination of indicators. Given the data table below, use entropy as the measure and construct the first level of the decision tree (DT). Consider only multi-outcome splits for nominal attributes and only binary splits for interval attributes.
Customer | Number of pre-existing conditions | Income ($1000) | Gender     | Risk
1        | 3-5                               | 75             | Male       | High
2        | 0-2                               | 50             | Non-binary | Low
3        | 6+                                | 95             | Female     | Low
4        | 3-5                               | 54             | Female     | Low
5        | 0-2                               | 100            | Female     | Low
6        | 6+                                | 45             | Male       | High
7        | 0-2                               | 65             | Male       | Low
8        | 3-5                               | 75             | Female     | High
Parent measure [3]
For each good attribute/split alternative, show the following:
a. Measure for each child [6]
b. Combined children measure [2]
c. Gain [2]
Winning attribute/split combination and its gain [2]
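The requested quantities for Problem 3 (parent entropy, per-child entropy, combined children entropy, gain) can be checked with a short script. The following is a minimal sketch, not an official solution. It assumes that the number of pre-existing conditions and gender are both treated as nominal attributes (one child per value), that income is the only interval attribute, and that candidate income thresholds are the midpoints between consecutive distinct values. The names `ROWS`, `entropy`, `split_gain`, and `candidate_splits` are illustrative only.

```python
from collections import Counter
from math import log2

# Rows from the table: (conditions, income in $1000, gender, risk)
ROWS = [
    ("3-5", 75, "Male", "High"),
    ("0-2", 50, "Non-binary", "Low"),
    ("6+", 95, "Female", "Low"),
    ("3-5", 54, "Female", "Low"),
    ("0-2", 100, "Female", "Low"),
    ("6+", 45, "Male", "High"),
    ("0-2", 65, "Male", "Low"),
    ("3-5", 75, "Female", "High"),
]


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def split_gain(rows, key):
    """Return (gain, combined child entropy, per-child entropies).

    `key` maps a row to the name of the child node it falls into.
    """
    children = {}
    for row in rows:
        children.setdefault(key(row), []).append(row[-1])
    parent = entropy([row[-1] for row in rows])
    per_child = {name: entropy(labels) for name, labels in children.items()}
    combined = sum(
        len(labels) / len(rows) * per_child[name]
        for name, labels in children.items()
    )
    return parent - combined, combined, per_child


def candidate_splits(rows):
    """Yield (description, key function) for every split considered."""
    # Assumption: both categorical columns are nominal, so multiway splits.
    yield "Conditions (multiway)", lambda r: r[0]
    yield "Gender (multiway)", lambda r: r[2]
    # Assumption: binary income thresholds at midpoints of distinct values.
    incomes = sorted({r[1] for r in rows})
    for lo, hi in zip(incomes, incomes[1:]):
        t = (lo + hi) / 2
        yield f"Income <= {t}", lambda r, t=t: r[1] <= t


parent_entropy = entropy([row[-1] for row in ROWS])
results = {name: split_gain(ROWS, key) for name, key in candidate_splits(ROWS)}
best = max(results, key=lambda name: results[name][0])

print(f"Parent entropy: {parent_entropy:.4f}")
for name, (gain, combined, per_child) in results.items():
    print(f"{name}: children={per_child} combined={combined:.4f} gain={gain:.4f}")
print("Winner:", best)
```

Under these assumptions the script picks the multiway split on the number of pre-existing conditions, with a gain of roughly 0.36 against a parent entropy of roughly 0.954. A different convention for income thresholds changes the income candidates but not the way each measure is computed.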
Problem 2 UG - GINI Index [25]{15}
You are a finance analyst at Company ABC. Your team is working on a binary decision tree model to predict the central bank's decision on the interest rate given today's economic environment. Your team has collectively decided to use the GINI index as the quality measure. The current node of the DT contains the records in the table below. Your manager asked you to determine which BINARY SPLIT of the values of attribute "Inflation" into groups is the best with respect to GINI.
Record ID | Inflation | Interest Rate Decision
1         | High      | Raise rates
2         | Medium    | Lower rates
3         | Medium    | Lower rates
4         | Low       | Lower rates
5         | Medium    | Raise rates
6         | Low       | Lower rates
7         | Medium    | Raise rates
8         | High      | Raise rates
9         | Low       | Lower rates
10        | Low       | Raise rates
11        | Medium    | Lower rates
12        | Medium    | Raise rates
13        | High      | Raise rates
14        | Medium    | Raise rates
The results for the following intermediate steps must be given:
Parent measure [4]
For each choice of a split:
a. Measure for each child [3]
b. Combined children measure [3]
c. Gain [1]
Final result and conclusion [1]
Hint: think about how many alternative binary splits make sense if we assume that inflation is a meaningful attribute for the decision. Is inflation a nominal, ordinal, or interval attribute here?
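Following the hint, one reading is that inflation is ordinal (Low < Medium < High), which leaves only the two order-preserving binary splits: {Low} vs. {Medium, High} and {Low, Medium} vs. {High}. The sketch below computes the GINI measures for those two splits under that assumption. It is an illustration rather than an official solution, and the names `ORDER`, `RECORDS`, `gini`, and `evaluate` are hypothetical.

```python
from collections import Counter

# Assumption: inflation is ordinal with this ordering.
ORDER = ["Low", "Medium", "High"]

# (inflation, decision) for records 1..14, copied from the table.
RECORDS = [
    ("High", "Raise"), ("Medium", "Lower"), ("Medium", "Lower"),
    ("Low", "Lower"), ("Medium", "Raise"), ("Low", "Lower"),
    ("Medium", "Raise"), ("High", "Raise"), ("Low", "Lower"),
    ("Low", "Raise"), ("Medium", "Lower"), ("Medium", "Raise"),
    ("High", "Raise"), ("Medium", "Raise"),
]


def gini(labels):
    """GINI index of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def evaluate(left_values):
    """GINI of both children, their weighted combination, and the gain."""
    left = [d for v, d in RECORDS if v in left_values]
    right = [d for v, d in RECORDS if v not in left_values]
    n = len(RECORDS)
    combined = len(left) / n * gini(left) + len(right) / n * gini(right)
    parent = gini([d for _, d in RECORDS])
    return {
        "left": gini(left),
        "right": gini(right),
        "combined": combined,
        "gain": parent - combined,
    }


parent_gini = gini([d for _, d in RECORDS])

# Order-preserving splits: the left group is a prefix of ORDER.
splits = {tuple(ORDER[:i]): evaluate(set(ORDER[:i])) for i in range(1, len(ORDER))}
best = max(splits, key=lambda s: splits[s]["gain"])

print(f"Parent GINI: {parent_gini:.4f}")
for left_group, m in splits.items():
    print(left_group, {k: round(v, 4) for k, v in m.items()})
print("Best left group:", best)
```

With the ordinal assumption, the sketch favors {Low, Medium} vs. {High} (gain of about 0.100) over {Low} vs. {Medium, High} (gain of about 0.083). If inflation were treated as nominal instead, a third split, {Medium} vs. {Low, High}, would also need to be evaluated.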
Problem 1 UG - PCA [30]{1/s}
Given 5 data instances each with 3 attributes:
The results for the following must show intermediate steps:
Multivariate mean (i.e., the vector of means, one per attribute) [2]
Centered data matrix [3]
Covariance matrix of centered data (use unbiased)[5]
Eigenvalues of covariance matrix [7]
Principal components and transformation matrix [7]
a. Show correspondence between eigenvalues and principal components
Compute PCA transformation of the original data matrix [2]
How many principal components are there in total? How much variance is explained by the first principal component? How many principal components should we keep
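The data matrix for the PCA problem did not survive in the text above, so the steps can only be illustrated on made-up numbers. The sketch below uses NumPy and a HYPOTHETICAL 5 x 3 matrix `X` (not the exam data) to walk through the listed steps: mean vector, centered matrix, unbiased covariance, eigen-decomposition, transformation matrix, projected data, and explained variance.

```python
import numpy as np

# HYPOTHETICAL data: 5 instances x 3 attributes. These are NOT the exam values,
# which are missing from the problem text.
X = np.array([
    [2.0, 0.0, 1.0],
    [4.0, 1.0, 3.0],
    [6.0, 1.0, 2.0],
    [8.0, 3.0, 5.0],
    [10.0, 5.0, 4.0],
])

# Multivariate mean: one mean per attribute (column).
mean = X.mean(axis=0)

# Centered data matrix.
centered = X - mean

# Unbiased covariance matrix: divide by n - 1.
cov = centered.T @ centered / (X.shape[0] - 1)

# Eigen-decomposition of the symmetric covariance matrix.
# eigh returns eigenvalues in ascending order, so reorder to descending.
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
W = eigvecs[:, order]  # transformation matrix; column i pairs with eigvals[i]

# PCA transformation: project the centered data onto the principal components.
scores = centered @ W

# Share of total variance explained by each component.
explained = eigvals / eigvals.sum()

print("mean:", mean)
print("covariance:\n", cov)
print("eigenvalues:", eigvals)
print("transformation matrix:\n", W)
print("explained variance ratio:", explained)
```

With 3 attributes there are 3 principal components in total. Column `i` of `W` is the component belonging to `eigvals[i]`, and the sign of each column is arbitrary, so a hand calculation may differ from the printed vectors by a factor of -1.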