Question:

various problems. Questions with the mark "" are required for graduate students and bonus for undergraduate students.

Learning problems

Datasets of three problems are provided to you. Please download the data.zip file from the Canvas system. Each problem will have the following files:
- A "problem.mat" file, which contains the examples and attributes. One attribute is an "id" attribute, which is useful for identifying examples but which you will not use when learning.
- A "problem.info" file, which gives additional information about the problem, such as how the data was generated. This is for your information only and does not affect the implementation in any way.

Programming Requirements

You do not have to implement the algorithms, but you are expected to understand them and know how to use them. You have been provided with initial Python code to read in the data.
- Please use Python3, not Python2.
- Define a function for each question, such as def_I_Ia a0.
- Use sklearn as the machine learning library and NumPy to process data. If not mentioned in the question, use default settings.
- Use the last four digits of your student ID as the random state seed for both the data split and the method initialization. (10 points are deducted if this requirement is not followed.)

1. Decision Tree Learner (60 points)

1) Split each of the three datasets randomly into training and testing subsets by the ratio 80/20.
   a. On each dataset, train a decision tree classifier with entropy as the node selection criterion. What is the prediction accuracy of each classifier? What are the height and the number of leaves of each tree?
   b. For voting, what is the prediction accuracy of the classifier with gini as the node selection criterion? Which feature provides the highest gini value during the first node selection process?
   c. For spam, train the decision tree classifier with entropy as the node selection criterion but with different depths. Plot the accuracy as the depth of the tree is increased from 1 to 50 (the x-axis is the depth of the tree and the y-axis is the accuracy of the model). What do you observe from the graph?
2) Split the volcanoes dataset into training and testing subsets by the ratios 90/10, 70/30, 60/40 and 40/60, and report the accuracies. Which partition shows the highest precision, and why?
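The workflow the questions describe (seeded split, entropy or gini tree, accuracy, height, leaf count, depth sweep) could be sketched roughly as follows. This is only a minimal illustration, not the course's starter code: the names `fit_tree`, `depth_sweep` and `SEED` are hypothetical, the seed value 1234 is a placeholder for the last four digits of a student ID, and the data is a synthetic stand-in from `make_classification` because the "problem.mat" files are not available here.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

SEED = 1234  # placeholder (assumption): use the last four digits of your student ID


def fit_tree(X, y, criterion="entropy", test_size=0.2, max_depth=None, seed=SEED):
    """Split the data, fit one decision tree, and summarize the result."""
    # The same seed drives both the data split and the tree initialization.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed
    )
    clf = DecisionTreeClassifier(
        criterion=criterion, max_depth=max_depth, random_state=seed
    )
    clf.fit(X_train, y_train)
    return {
        "accuracy": accuracy_score(y_test, clf.predict(X_test)),
        "height": clf.get_depth(),
        "leaves": clf.get_n_leaves(),
        # Column index of the feature used at the root split
        # (scikit-learn stores -2 here if the tree is a single leaf).
        "root_feature": int(clf.tree_.feature[0]),
    }


def depth_sweep(X, y, depths=range(1, 51), seed=SEED):
    """Test accuracy of an entropy tree for each maximum depth."""
    return [fit_tree(X, y, max_depth=d, seed=seed)["accuracy"] for d in depths]


# Synthetic stand-in for one dataset (assumption: the real features and labels
# would come from the provided "problem.mat" file, with the id column dropped).
X, y = make_classification(n_samples=300, n_features=10, random_state=SEED)

summary = fit_tree(X, y)      # 80/20 split, entropy criterion
accuracies = depth_sweep(X, y)  # one accuracy per depth from 1 to 50
print(summary)
```

Under the same assumptions, `fit_tree(X, y, criterion="gini")` would cover the gini variant, and calling `fit_tree` with `test_size` set to 0.1, 0.3, 0.4 and 0.6 would cover the four volcanoes partitions. The list returned by `depth_sweep` can be plotted against depths 1 to 50 with whatever plotting library the course allows.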
