
Part Three: Implement cross_validation [Graded]
Use grid_search to implement the cross_validation function, which takes in the training set xTr, yTr, a list of depth candidates depths and performs k-Fold Cross Validation on the training set.
We will use generate_kFold to generate the k training/validation splits and pass the indices for the splits into the cross_validation function. Therefore, for each (training_indices, validation_indices) element in indices, you need to perform grid search to find the training and validation loss at each depth for that fold. Finally, take the average training and validation loss across folds to get the "average" loss. Your implementation should return these two loss vectors and the depth with the minimum average validation loss.
I've already written generate_kFold and grid_search, but I'm having trouble piecing them together in code for cross_validation. Here's what I have for the first two functions:
import numpy as np

def generate_kFold(n, k):
    """
    Generates [(training_indices, validation_indices), ...] for k-fold validation.
    Input:
        n: number of training examples
        k: number of folds
    Output:
        kfold_indices: a list of length k. Each entry takes the form
        (training_indices, validation_indices)
    """
    assert k >= 2
    kfold_indices = []
    # YOUR CODE HERE
    indices = np.arange(n)
    # split the n indices into k (nearly) equal parts
    parts = np.array_split(indices, k)
    for i in range(k):
        # part i is the validation fold; the remaining parts form the training set
        validation_indices = parts[i]
        training_indices = np.concatenate(parts[:i] + parts[i + 1:])
        kfold_indices.append((list(training_indices), list(validation_indices)))
    return kfold_indices
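As a quick sanity check on generate_kFold, each index should appear in exactly one validation fold, and every fold's training and validation indices together should cover all n points. The function is repeated here so the snippet runs on its own; the toy values n=6, k=3 are just for illustration:

```python
import numpy as np

def generate_kFold(n, k):
    assert k >= 2
    kfold_indices = []
    indices = np.arange(n)
    parts = np.array_split(indices, k)
    for i in range(k):
        validation_indices = parts[i]
        training_indices = np.concatenate(parts[:i] + parts[i + 1:])
        kfold_indices.append((list(training_indices), list(validation_indices)))
    return kfold_indices

# 6 examples, 3 folds: validation folds are [0,1], [2,3], [4,5]
folds = generate_kFold(6, 3)
# first fold: training indices [2, 3, 4, 5], validation indices [0, 1]
```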
def grid_search(xTr, yTr, xVal, yVal, depths):
    """
    Calculates the training and validation loss for trees trained on (xTr, yTr)
    and validated on (xVal, yVal) over a range of depths.
    Input:
        xTr: nxd training data matrix
        yTr: n-dimensional vector of training labels
        xVal: mxd validation data matrix
        yVal: m-dimensional vector of validation labels
        depths: a list of len k of depths
    Output:
        best_depth, training_losses, validation_losses
        best_depth: the depth that yields the lowest validation loss
        training_losses: a list of len k. The i-th entry corresponds to the training loss of the tree of depth=depths[i]
        validation_losses: a list of len k. The i-th entry corresponds to the validation loss of the tree of depth=depths[i]
    """
    training_losses = []
    validation_losses = []
    best_depth = None
    # YOUR CODE HERE
    best_loss = float('inf')
    for depth in depths:
        # train a tree at this depth and record its squared loss on both sets
        tree = RegressionTree(depth=depth)
        tree.fit(xTr, yTr)
        training_loss = square_loss(yTr, tree.predict(xTr))
        validation_loss = square_loss(yVal, tree.predict(xVal))
        training_losses.append(training_loss)
        validation_losses.append(validation_loss)
        # keep the depth with the lowest validation loss seen so far
        if validation_loss < best_loss:
            best_loss = validation_loss
            best_depth = depth
    return best_depth, training_losses, validation_losses
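Here is one way to piece the two functions together, as the question asks: for each fold, index xTr and yTr with the fold's training and validation indices, call grid_search on that split, accumulate the per-depth loss vectors, and average over folds at the end. This is a sketch, not the official solution; RegressionTree and square_loss live in your notebook, so to keep the snippet runnable on its own a toy stand-in grid_search (clearly marked) is included — delete it and use your real grid_search instead:

```python
import numpy as np

# Hypothetical stand-in for your real grid_search, only so this sketch runs
# standalone: training loss shrinks with depth, validation loss pretends
# depth 3 generalizes best. Remove it in your notebook.
def grid_search(xTr, yTr, xVal, yVal, depths):
    training_losses = [1.0 / d for d in depths]
    validation_losses = [abs(d - 3) + 1.0 for d in depths]
    best_depth = depths[int(np.argmin(validation_losses))]
    return best_depth, training_losses, validation_losses

def cross_validation(xTr, yTr, depths, indices):
    """
    Input:
        xTr: nxd training data matrix
        yTr: n-dimensional vector of training labels
        depths: a list of depth candidates
        indices: output of generate_kFold, i.e.
                 [(training_indices, validation_indices), ...]
    Output:
        best_depth, training_losses, validation_losses
        (losses averaged across the k folds, one entry per depth)
    """
    training_losses = np.zeros(len(depths))
    validation_losses = np.zeros(len(depths))
    for train_idx, val_idx in indices:
        # grid search on this fold's train/validation split
        _, fold_tr, fold_val = grid_search(xTr[train_idx], yTr[train_idx],
                                           xTr[val_idx], yTr[val_idx], depths)
        training_losses += np.asarray(fold_tr)
        validation_losses += np.asarray(fold_val)
    # average the per-depth losses over the k folds
    training_losses /= len(indices)
    validation_losses /= len(indices)
    best_depth = depths[int(np.argmin(validation_losses))]
    return best_depth, list(training_losses), list(validation_losses)
```

Note that best_depth is chosen from the *averaged* validation losses, not from any single fold's best depth — that is the point of cross-validation.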
