CV "by hand"

We'll start with a simulated example. The code chunk below loads data that is non-linear and shows increasing variance as the predictor increases. I like to use this setting because "model complexity" is easiest for me to understand when I can see it. However, "model complexity" is also an issue when you're dealing with lots of predictors: you can't "see" overfitting as easily, but it definitely happens.

    library(tidyverse)
    library(SemiPar)  # assumed source of the lidar dataset

    data("lidar")

    # add a row id so the train/test split can be tracked
    lidar_df =
      lidar |>
      as_tibble() |>
      mutate(id = row_number())

    lidar_df |>
      ggplot(aes(x = range, y = logratio)) +
      geom_point()

I'll split this data into training and test sets (using anti_join!!), and replot showing the split. Our goal will be to use the training data (in black) to build candidate models, and then see how well those models predict in the testing data (in red).

    # 80% of rows go to training; the rest (matched by id) form the test set
    train_df = sample_frac(lidar_df, size = .8)
    test_df = anti_join(lidar_df, train_df, by = "id")

    ggplot(train_df, aes(x = range, y = logratio)) +
      geom_point() +
      geom_point(data = test_df, color = "red")
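To make the stated goal concrete, here is a minimal sketch of one way to build candidate models on the training data and compare their prediction error on the test data. The three polynomial fits, the rmse() helper, and the seed are illustrative assumptions, not the models used in the original material. The sketch also assumes the lidar data come from the SemiPar package.

```r
library(tidyverse)
library(SemiPar)  # assumption: lidar is loaded from SemiPar

set.seed(1)  # arbitrary seed so the split is reproducible

data("lidar")

lidar_df =
  lidar |>
  as_tibble() |>
  mutate(id = row_number())

train_df = sample_frac(lidar_df, size = .8)
test_df = anti_join(lidar_df, train_df, by = "id")

# Hypothetical candidate models of increasing complexity,
# all fit on the training data only
candidate_mods = list(
  linear = lm(logratio ~ range, data = train_df),
  cubic  = lm(logratio ~ poly(range, 3), data = train_df),
  wiggly = lm(logratio ~ poly(range, 15), data = train_df)
)

# Hypothetical helper: root mean squared prediction error on a data frame
rmse = function(model, data) {
  sqrt(mean((data$logratio - predict(model, newdata = data))^2))
}

# Error on data used for fitting vs. held-out data
train_rmse = sapply(candidate_mods, rmse, data = train_df)
test_rmse = sapply(candidate_mods, rmse, data = test_df)

train_rmse
test_rmse
```

Training error can only go down as the polynomial degree goes up, so it is the test-set values that indicate which level of complexity predicts well. A single split is also noisy; repeating it gives a more stable comparison.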
