Question: R programming
1) Start by casting column 2 so that it is interpreted by R as a factor instead of an integer value.
2) Partition the set of 200 instances/observations from the previous step into a new test set comprised of 1/3 of the rows and a new training set comprised of 2/3 of the rows. We want to randomize the selection of rows into each partition. Call the test set test_set, and the training set train_set. Use the following steps to accomplish this:
a) Use set.seed(555) to set the seed for the random number generator. If you do not do this you will be massively penalized since your result will be hard to verify.
b) Create a vector called index with indices for all observations. Since there are 200 rows, your vector index should contain the values 1 to 200.
c) Next, create a vector of the indices for the test_set by randomly sampling 1/3 of the indices in index using the function sample(). Here it is called with two arguments. The first argument is the vector index and the second argument is the number of samples you want to randomly select. You should be selecting 1/3, i.e., 66.
d) Using the vector of test_set indices you created in step 2c, extract the test set, assigning it to test_set.
e) Select the remaining 134 rows as the train_set.
I can't figure out how to extract the test set from the dataset and then use every row that is not in test_set for train_set.
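One possible way to do this, as a minimal sketch: row-index the data frame with the sampled indices to get test_set, then use negative indexing with the same vector to keep everything else as train_set. The original dataset is not shown in the question, so the data frame name `dat`, its column names, and its contents are hypothetical stand-ins; `test_index` is also just an illustrative name for the vector from step 2c.

```r
# Hypothetical stand-in for the 200-row dataset from the previous step.
# Replace `dat` with your own data frame; only the shape matters here.
dat <- data.frame(
  x1 = seq_len(200),
  x2 = rep(1:4, times = 50),    # integer column that should be a factor
  x3 = seq(0, 1, length.out = 200)
)

# 1) Cast column 2 so R treats it as a factor rather than an integer.
dat[, 2] <- as.factor(dat[, 2])

# 2a) Fix the seed so the random split is reproducible.
set.seed(555)

# 2b) Indices of all observations: 1 to 200.
index <- seq_len(nrow(dat))

# 2c) Randomly sample 66 of the indices (about 1/3) for the test set.
#     sample() draws without replacement by default.
test_index <- sample(index, 66)

# 2d) The rows at the sampled indices form the test set.
test_set <- dat[test_index, ]

# 2e) A negative index drops those rows, leaving the other 134.
train_set <- dat[-test_index, ]

nrow(test_set)    # 66
nrow(train_set)   # 134
```

The key point is the comma in `dat[test_index, ]`: the part before the comma selects rows, and leaving the part after it empty keeps all columns. An equivalent way to write step 2e is `dat[setdiff(index, test_index), ]`, which may read more clearly if negative indexing is unfamiliar.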
