Question: Please complete the code below in Python.

The decision tree should work for four cases:
i) discrete features, discrete output;
ii) discrete features, real output;
iii) real features, discrete output;
iv) real features, real output.

The decision tree should be able to use Gini index or information gain as the criterion for splitting. The code should also be able to plot/display the decision tree.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from .utils import entropy, information_gain, gini_index

np.random.seed(42)

class DecisionTree():
    def __init__(self, criterion, max_depth):
        """
        Put all information needed to initialize your tree here.
        Inputs:
        > criterion : {"information_gain", "gini_index"}  # criterion won't be used for regression
        > max_depth : the maximum depth the tree can grow to
        """
        pass

    def fit(self, X, y):
        """
        Function to train and construct the decision tree
        Inputs:
        X: pd.DataFrame with rows as samples and columns as features
           (shape of X is N x P, where N is the number of samples and P is the number of features)
        y: pd.Series with rows corresponding to the output variable (shape of y is N)
        """
        pass

    def predict(self, X):
        """
        Function to run the decision tree on data points
        Input:
        X: pd.DataFrame with rows as samples and columns as features
        Output:
        y: pd.Series with rows corresponding to the output variable. The value in a row
           is the prediction for the sample in the corresponding row of X.
        """
        pass

    def plot(self):
        """
        Function to plot the tree

        Output Example:
        ?(X1 > 4)
            Y: ?(X2 > 7)
                Y: Class A
                N: Class B
            N: Class C
        Where Y => Yes and N => No
        """
        pass
utils - which needs to be completed too:
def entropy(Y):
    """
    Function to calculate the entropy
    Inputs:
    > Y: pd.Series of labels
    Outputs:
    > Returns the entropy as a float
    """
    pass

def gini_index(Y):
    """
    Function to calculate the Gini index
    Inputs:
    > Y: pd.Series of labels
    Outputs:
    > Returns the Gini index as a float
    """
    pass

def information_gain(Y, attr):
    """
    Function to calculate the information gain
    Inputs:
    > Y: pd.Series of labels
    > attr: pd.Series of the attribute at which the gain should be calculated
    Outputs:
    > Returns the information gain as a float
    """
    pass
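One possible way to fill in the three utility stubs above is sketched below. It assumes `Y` and `attr` are pandas Series as the docstrings state; `information_gain` here uses entropy as the base impurity (the weighted-average form `H(Y) - Σ_v p(v) · H(Y | attr = v)`), and the analogous "Gini gain" is obtained by swapping `entropy` for `gini_index`.

```python
import numpy as np
import pandas as pd

def entropy(Y):
    # Shannon entropy (in bits) over the empirical label distribution
    probs = Y.value_counts(normalize=True)
    return float(-(probs * np.log2(probs)).sum())

def gini_index(Y):
    # Gini impurity: 1 - sum of squared class probabilities
    probs = Y.value_counts(normalize=True)
    return float(1.0 - (probs ** 2).sum())

def information_gain(Y, attr):
    # IG = H(Y) - sum over values v of attr of p(v) * H(Y | attr == v)
    weighted = sum(
        (attr == v).mean() * entropy(Y[attr == v])
        for v in attr.unique()
    )
    return entropy(Y) - weighted
```

For example, with labels `[0, 0, 1, 1]` the entropy is 1 bit and the Gini index is 0.5; an attribute that perfectly separates the two classes has an information gain of 1 bit.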
| """ | |
| The current code given is for the Assignment 1. | |
| You will be expected to use this to make trees for: | |
| > discrete input, discrete output | |
| > real input, real output | |
| > real input, discrete output | |
| > discrete input, real output | |
| """ | |
| import numpy as np | |
| import pandas as pd | |
| import matplotlib.pyplot as plt | |
| from tree.base import DecisionTree | |
| from metrics import * | |
| np.random.seed(42) | |
| # Test case 1 | |
| # Real Input and Real Output | |
| N = 30 | |
| P = 5 | |
| X = pd.DataFrame(np.random.randn(N, P)) | |
| y = pd.Series(np.random.randn(N)) | |
| for criteria in ['information_gain', 'gini_index']: | |
| tree = DecisionTree(criterion=criteria) #Split based on Inf. Gain | |
| tree.fit(X, y) | |
| y_hat = tree.predict(X) | |
| tree.plot() | |
| print('Criteria :', criteria) | |
| print('RMSE: ', rmse(y_hat, y)) | |
| print('MAE: ', mae(y_hat, y)) | |
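The driver above does `from metrics import *` but the `metrics` module is not shown; a minimal sketch of the metrics it calls is below. The exact function set is an assumption: `rmse` and `mae` are used by this test case, and an `accuracy` helper is included for the discrete-output cases. Inputs are assumed to be index-aligned `pd.Series`.

```python
import numpy as np
import pandas as pd

def rmse(y_hat, y):
    # Root mean squared error for real-valued outputs
    return float(np.sqrt(((y_hat - y) ** 2).mean()))

def mae(y_hat, y):
    # Mean absolute error for real-valued outputs
    return float((y_hat - y).abs().mean())

def accuracy(y_hat, y):
    # Fraction of matching predictions, for discrete outputs
    return float((y_hat == y).mean())
```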
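A minimal sketch of the `DecisionTree` class is given below, under these assumptions: discrete features are taken to be label-encoded numbers so that binary threshold splits (`feature <= t`) cover all four input/output cases (a multiway split per category is the other common design); real output is detected by a float dtype on `y` and handled with variance reduction, so `criterion` is only consulted for classification, as the skeleton's docstring says; and `plot()` prints a text tree in the `?(X.. > ..) / Y: / N:` style rather than using matplotlib.

```python
import numpy as np
import pandas as pd

class DecisionTree:
    def __init__(self, criterion="information_gain", max_depth=5):
        self.criterion = criterion
        self.max_depth = max_depth
        self.tree = None  # nested dicts for internal nodes, scalars for leaves

    def _impurity(self, y):
        if y.dtype.kind == "f":                     # real output: variance reduction
            return float(y.var()) if len(y) > 1 else 0.0
        p = y.value_counts(normalize=True)
        if self.criterion == "gini_index":
            return float(1.0 - (p ** 2).sum())
        return float(-(p * np.log2(p)).sum())       # entropy for information_gain

    def _best_split(self, X, y):
        best = (None, None, -np.inf)                # (feature, threshold, gain)
        base = self._impurity(y)
        for col in X.columns:
            for t in X[col].unique():
                left, right = y[X[col] <= t], y[X[col] > t]
                if len(left) == 0 or len(right) == 0:
                    continue
                w = len(left) / len(y)
                gain = base - w * self._impurity(left) - (1 - w) * self._impurity(right)
                if gain > best[2]:
                    best = (col, t, gain)
        return best

    def _leaf(self, y):
        # Mean for real output, majority class for discrete output
        return float(y.mean()) if y.dtype.kind == "f" else y.mode()[0]

    def _grow(self, X, y, depth):
        if depth >= self.max_depth or y.nunique() == 1:
            return self._leaf(y)
        col, t, gain = self._best_split(X, y)
        if col is None or gain <= 0:
            return self._leaf(y)
        mask = X[col] <= t
        return {"feature": col, "threshold": t,
                "left": self._grow(X[mask], y[mask], depth + 1),
                "right": self._grow(X[~mask], y[~mask], depth + 1)}

    def fit(self, X, y):
        self.tree = self._grow(X, y, 0)

    def _predict_one(self, row, node):
        while isinstance(node, dict):
            node = node["left"] if row[node["feature"]] <= node["threshold"] else node["right"]
        return node

    def predict(self, X):
        return pd.Series([self._predict_one(row, self.tree) for _, row in X.iterrows()],
                         index=X.index)

    def plot(self, node=None, indent=""):
        node = self.tree if node is None else node
        if not isinstance(node, dict):
            print(indent + "Predict:", node)
            return
        print(f"{indent}?(X{node['feature']} > {node['threshold']})")
        print(indent + "Y:")
        self.plot(node["right"], indent + "    ")
        print(indent + "N:")
        self.plot(node["left"], indent + "    ")
```

With this sketch the driver above runs as written (the constructor defaults `max_depth`, which the driver's `DecisionTree(criterion=criteria)` call omits). It is a plain greedy CART-style grower with no pruning, so treat it as a starting point rather than a complete solution.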