Question: Please code in R


For full points please make sure to include the following items in the pdf file:
(a) the lines of code that you took to arrive at your answers
(b) the answer to each of the questions
(c) a screenshot of any requested plots.

1. Decision Trees: A survey was sent to the employees of a large company to ask them the following questions:
- Do you work in the data analytics department? (Y or N)
- Are you above the age of 30? (Y or N)
- Have you spent more than 5 years in this company? (Y or N)
- Is your current gross income more than USD 50,000 per year? (Y or N)

The following table summarizes the responses to the survey. For each entry, "Number of Instances" represents the number of respondents having the corresponding values for the attributes Analytics Department, Age > 30, and Tenure > 5.

Given the data above, answer the following questions:
(a) Find support and confidence for the rule: if Analytics Department = Y then Income > 50K.
(b) Find support and confidence for the rule: if Analytics Department = Y and Tenure > 5 then Income > 50K.
(c) Using the 1-rule method discussed in class, find the relevant sets of classification rules for the target variable by testing each of the input attributes Analytics Department, Age > 30, and Tenure > 5. Which of these three sets of rules has the lowest misclassification rate?
(d) Considering Income > 50K as the target variable, which of the attributes would you select as the root in a decision tree that is constructed using the information gain impurity measure?
(e) Use the Gini index impurity measure and construct the full decision tree for this data set.
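The quantities asked for in (a) through (e) can be sketched in base R as follows. This is only an illustration under stated assumptions: the survey table is not reproduced in the question text, so the counts in `survey` are invented placeholders and must be replaced with the real "Number of Instances" values. The column names (`analytics`, `age30`, `tenure5`, `income50k`, `n`) and the helper functions are likewise hypothetical. Support is taken here as the fraction of all respondents that satisfy both the "if" and the "then" part of the rule; if the class uses a different definition, adjust `rule_stats` accordingly.

```r
# HYPOTHETICAL placeholder counts: replace with the table from the assignment.
survey <- data.frame(
  analytics = c("Y", "Y", "Y", "Y", "N", "N", "N", "N"),
  age30     = c("Y", "Y", "N", "N", "Y", "Y", "N", "N"),
  tenure5   = c("Y", "N", "Y", "N", "Y", "N", "Y", "N"),
  income50k = c("Y", "Y", "Y", "N", "Y", "N", "N", "N"),
  n         = c(10, 5, 8, 7, 6, 9, 3, 12)
)

# (a), (b): support and confidence of "if antecedent then income50k = Y".
# `antecedent` is a logical vector marking the rows that satisfy the "if" part.
rule_stats <- function(df, antecedent) {
  hit <- antecedent & df$income50k == "Y"
  c(support    = sum(df$n[hit]) / sum(df$n),
    confidence = sum(df$n[hit]) / sum(df$n[antecedent]))
}
rule_a <- rule_stats(survey, survey$analytics == "Y")
rule_b <- rule_stats(survey, survey$analytics == "Y" & survey$tenure5 == "Y")

# Count-weighted cross-tabulation of one attribute against the target.
weighted_table <- function(df, attr) {
  xtabs(reformulate(c(attr, "income50k"), response = "n"), data = df)
}

# (c): 1-rule error. Each attribute value predicts its majority class,
# so the minority counts in each row are the misclassified respondents.
one_r_error <- function(df, attr) {
  tab <- weighted_table(df, attr)
  sum(apply(tab, 1, function(row) sum(row) - max(row))) / sum(tab)
}

# (d), (e): impurity measures on a vector of class counts.
entropy <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log2(p))
}
gini <- function(counts) 1 - sum((counts / sum(counts))^2)

# Impurity of the parent node minus the weighted impurity of the children
# obtained by splitting on `attr`.
impurity_reduction <- function(df, attr, impurity) {
  tab      <- weighted_table(df, attr)
  parent   <- impurity(colSums(tab))
  children <- sum(rowSums(tab) / sum(tab) * apply(tab, 1, impurity))
  parent - children
}

attrs     <- c("analytics", "age30", "tenure5")
one_r     <- sapply(attrs, one_r_error, df = survey)
info_gain <- sapply(attrs, impurity_reduction, df = survey, impurity = entropy)
gini_gain <- sapply(attrs, impurity_reduction, df = survey, impurity = gini)

print(rule_a); print(rule_b)
print(one_r); print(info_gain); print(gini_gain)
```

With the real counts in place, the attribute with the smallest `one_r` value answers (c) and the attribute with the largest `info_gain` is the root candidate for (d). For (e), one possible route is to repeat the `gini_gain` step on each subset of `survey` below the chosen root until every leaf is pure or no attributes remain; the `rpart` package with `parms = list(split = "gini")` is a common alternative if the course allows it.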
