Write a Java program to create a decision tree using ID3 algorithm. Use entropy and information...
Fantastic news! We've Found the answer you've been seeking!
Question:
Transcribed Image Text:
Write a Java program to create a decision tree using ID3 algorithm. Use entropy and information gain calculation to create subtrees. To test your program, use the following dataset table. The dataset is used to predict a student's grade in first year computer science programming course based on their records in math, statistics, science and English in high school. Math Statistics Science English (A+) grade in programming? A B A- A No A B A- No A+ B A- Yes B B+ A- Yes B A Yes B A No A+ A Yes A B+ A A B B+ A B+ A+ B+ B B+ A+ B A A A A- A A A A- A A- A+ A A A A+ A+ A A A A+ A+ A A+ No Yes Yes Yes Yes Yes No The program should read the dataset from a text file (CSV format) and prints out the tree structure. Marking Criteria for programming question 4 Marks Description 1 1 1 6 1 One mark for accepting CSV file and reading the training set correctly. One mark for writing the code that calculates the entropy and information gain correctly. One mark for writing the loop condition that checks if a solution is found Six marks for writing the core of the ID3 search, including recursive functions to build the subtrees and exclude visited nodes. One marks for writing the code that prints out the solution correctly. (Submit both the input data (CSV) file and the screenshots showing the outputs of your program) Write a Java program to create a decision tree using ID3 algorithm. Use entropy and information gain calculation to create subtrees. To test your program, use the following dataset table. The dataset is used to predict a student's grade in first year computer science programming course based on their records in math, statistics, science and English in high school. Math Statistics Science English (A+) grade in programming? A B A- A No A B A- No A+ B A- Yes B B+ A- Yes B A Yes B A No A+ A Yes A B+ A A B B+ A B+ A+ B+ B B+ A+ B A A A A- A A A A- A A- A+ A A A A+ A+ A A A A+ A+ A A+ No Yes Yes Yes Yes Yes No The program should read the dataset from a text file (CSV format) and prints out the tree structure. Marking Criteria for programming question 4 Marks Description 1 1 1 6 1 One mark for accepting CSV file and reading the training set correctly. One mark for writing the code that calculates the entropy and information gain correctly. One mark for writing the loop condition that checks if a solution is found Six marks for writing the core of the ID3 search, including recursive functions to build the subtrees and exclude visited nodes. One marks for writing the code that prints out the solution correctly. (Submit both the input data (CSV) file and the screenshots showing the outputs of your program)
Expert Answer:
Answer rating: 100% (QA)
Node class implementation public class Node Parent node null if this is the root node private Node parent Map of child nodes key attribute value value child node private Map children public Node thisp... View the full answer
Related Book For
Business Analytics Communicating With Numbers
ISBN: 9781260785005
1st Edition
Authors: Sanjiv Jaggia, Alison Kelly, Kevin Lertwachara, Leida Chen
Posted Date:
Students also viewed these programming questions
-
today you borrowed 100,000 at 10% APR compounded monthly to buy a house. you will repay the mortgage with equal monthly payment over 10 years. First payment is due in a month. a) what is the...
-
Choose a topic that can be applied to one of the following economic concepts. (Possible curves: Lorenz curve, marginal product of labor curve, labor demand and supply curves, utility function and...
-
Comprehensive comparison between Linux Kernel and windows Kernel? Comprehensive comparison between Linux System and windows System? Comprehensive comparison between Linux Distributions and windows...
-
Define pricing practices of tesla INC as well as market structure.
-
Determine the moment of the force F about the Oa axis. Express the result as a Cartesian vector. Given: F = (50 -20 20) N a = 6 m b = 2 m c = 1 m d = 3 m e = 4 m
-
Rewrite as a logarithmic equation. 1 0 ^ y = 3
-
What are the disadvantages of nonprobability sampling techniques?
-
Raintree Cosmetic Company sells its products to customers on a credit basis. An adjusting entry for bad debt expense is recorded only at December 31, the company's fiscal year-end. The 2017 balance...
-
Write a speech on this Thesis. Eating organic foods can have numerous benefits for our health, environment, and economy, and it should be an essential part of our daily diet. Describe in 350 words
-
Base your answers to the following questions on the financial statements for Leons Furniture imited/Meubles Lon Lte in Exhibits 1.27A to 1.27D. In the questions below, the year 2016 refers to Leons...
-
Tina started working as a legal secretary for Harvey, a small-town lawyer, in 1986. In 2011, Tina began suffering stress and took two months off on doctors orders. When she returned, she felt she was...
-
Use the scientific method: (1) Based on your observations of your environment, (2) develop a question,(3) hypothesize the answer, (4) predict the consequences if your hypothesis is correct, (5) test...
-
According to a study, 73% of all males between the ages of 18 and 24 live at home. (Unmarried college students living in a dorm are counted as living at home.) Suppose that a survey is administered...
-
The probability that the number of people with blood type O-negative is between 18 and 27. A discrete random variable is given. Assume the probability of the random variable will be approximated...
-
The probability that at most 39 households have a gas stove. A discrete random variable is given. Assume the probability of the random variable will be approximated using the normal distribution....
-
The probability that there are exactly 15 defective parts in a shipment. A discrete random variable is given. Assume the probability of the random variable will be approximated using the normal...
-
Research is much concerned with proper fact finding, analysis and evaluation. Do you agree with this statement? Give reasons in support of your answer.
-
What is the difference between direct materials and indirect materials?
-
The accompanying data set contains two predictor variables, x 1 and x 2 , and one numerical target variable, y. A regression tree will be constructed using the data set. a. Which split on x 1 will...
-
The accompanying data set contains two variables, Date1 and Date2. a. Create a new variable called DifferenceInYear that contains the difference between Date1 and Date2 in year for each observation....
-
The accompanying data file contains 19 observations with two variables, x 1 and x 2 . a. Using the original values, compute the Euclidean distance for all possible pairs of the first three...
-
Who was Phar-Mors flamboyant Chief Executive Officer?
-
Which of the following generally is not considered something of value? 1. Cash, money or checks 2. Airline miles or hotel credits associated with frequent activity (e.g., frequent flier miles) 3. An...
-
Which of the following is not one of the five major categories of fraudulent disbursements? 1. Payroll schemes 2. Expense reimbursement schemes 3. Shell company schemes 4. Billing schemes
Study smarter with the SolutionInn App