Question: Please code in R

For full points, please make sure to include the following items in the PDF file:
(a) the lines of code that you took to arrive at your answers
(b) the answer to each of the questions
(c) a screenshot of any requested plots

1. Decision Trees: A survey was sent to the employees of a large company to ask them the following questions:
- Do you work in the data analytics department? (Y or N)
- Are you above the age of 30? (Y or N)
- Have you spent more than 5 years in this company? (Y or N)
- Is your current gross income more than USD 50,000 per year? (Y or N)

The following table summarizes the responses to the survey. For each entry, "Number of Instances" represents the number of respondents having the corresponding values for the attributes Analytics Department, Age >30, and Tenure >5.

Given the data above, answer the following questions:
(a) Find support and confidence for the rule: if Analytics Department = Y then Income >50K.
(b) Find support and confidence for the rule: if Analytics Department = Y and Tenure >5 then Income >50K.
(c) Using the 1-rule method discussed in class, find the relevant sets of classification rules for the target variable by testing each of the input attributes Analytics Department, Age >30, and Tenure >5. Which of these three sets of rules has the lowest misclassification rate?
(d) Considering Income >50K as the target variable, which of the attributes would you select as the root in a decision tree that is constructed using the information gain impurity measure?
(e) Use the Gini index impurity measure and construct the full decision tree for this data set.
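The quantities asked for in parts (a) through (e) can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not a worked answer: the survey table is not reproduced on this page, so the counts in `survey` are hypothetical placeholders that must be replaced with the real "Number of Instances" values. The column names (`analytics`, `age30`, `tenure5`, `high`, `low`) and all helper function names are likewise illustrative. Support is taken here to be the fraction of all respondents who satisfy both the condition and the conclusion of the rule, which is a common convention; the definition used in class may differ.

```r
# HYPOTHETICAL placeholder counts: these are NOT the survey's numbers.
# One row per attribute combination; `high` / `low` count respondents
# with Income > 50K = Y / N. Replace with the real table before use.
survey <- data.frame(
  analytics = c("Y", "Y", "Y", "Y", "N", "N", "N", "N"),
  age30     = c("Y", "Y", "N", "N", "Y", "Y", "N", "N"),
  tenure5   = c("Y", "N", "Y", "N", "Y", "N", "Y", "N"),
  high      = c(10, 5, 8, 2, 4, 3, 2, 1),
  low       = c(2, 5, 2, 8, 6, 7, 8, 9)
)

# Parts (a), (b): support and confidence of "IF antecedent THEN Income > 50K".
# `antecedent` is a logical vector marking the rows that satisfy the IF part.
rule_stats <- function(d, antecedent) {
  n_total <- sum(d$high) + sum(d$low)
  n_both  <- sum(d$high[antecedent])
  n_ante  <- n_both + sum(d$low[antecedent])
  c(support = n_both / n_total, confidence = n_both / n_ante)
}

# Class counts per attribute value: rows high/low, one column per value.
class_counts <- function(d, attr) {
  sapply(split(d[c("high", "low")], d[[attr]]), colSums)
}

# Part (c): 1-rule error. Each attribute value predicts its majority class,
# so the minority count in each column is misclassified.
one_rule_error <- function(d, attr) {
  m <- class_counts(d, attr)
  sum(apply(m, 2, min)) / sum(m)
}

# Impurity measures on a vector of class counts.
entropy <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log2(p))
}
gini <- function(counts) 1 - sum((counts / sum(counts))^2)

# Parts (d), (e): impurity reduction from splitting on `attr`.
impurity_gain <- function(d, attr, impurity) {
  m <- class_counts(d, attr)
  weights <- colSums(m) / sum(m)
  impurity(rowSums(m)) - sum(weights * apply(m, 2, impurity))
}

attrs <- c("analytics", "age30", "tenure5")
print(rule_stats(survey, survey$analytics == "Y"))
print(rule_stats(survey, survey$analytics == "Y" & survey$tenure5 == "Y"))
print(sapply(attrs, function(a) one_rule_error(survey, a)))
print(sapply(attrs, function(a) impurity_gain(survey, a, entropy)))
print(sapply(attrs, function(a) impurity_gain(survey, a, gini)))
```

For part (e), the attribute with the largest Gini reduction would become the root, and the same comparison can then be repeated on the rows that reach each child node, for example `impurity_gain(survey[survey$analytics == "Y", ], "age30", gini)`, until the nodes are pure or no attributes remain.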
