Question: Part I: Data Preprocessing

1. Provide a summary of the variables in expediatrain.sas7bdat (you can use StatExplore for this purpose).
2. Explore the statistical properties of the variables in the input data set. The results generated in this step will give you an idea of which variables are most useful in predicting the target response.
3. Check the Class Variable Summary Statistics and the Interval Variable Summary Statistics sections of the output.
   (a) Are there any missing values for any of the variables? Use imputation to fill in all missing data (describe how you did imputation in the report).
   (b) Are there any variables with high variances? If yes, you should plot the data and explore transformations that can reduce the variances of these variables. Describe such activities in the report, if any.
4. Partition dataset expediatrain.sas7bdat into training (55%) and validation (45%), i.e., 0% for testing.
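The assignment expects these steps to be carried out with SAS Enterprise Miner nodes such as StatExplore. As a rough illustration of the logic only, the steps can be sketched in plain Python (standard library only). Everything below is an assumption made for the sketch: the toy values stand in for expediatrain.sas7bdat, the column names `price_usd` and `booked` are hypothetical, and median imputation and a log transform are just one reasonable choice, not something the assignment prescribes.

```python
import math
import random
import statistics


def make_toy_rows():
    """Hypothetical stand-in data; NOT the real expediatrain.sas7bdat."""
    prices = [80.0, 95.0, 120.0, None, 150.0, 60.0, 15000.0, 110.0, 99.0, 130.0,
              None, 70.0, 210.0, 88.0, 105.0, 9000.0, 140.0, 75.0, 160.0, 115.0]
    return [{"price_usd": p, "booked": i % 2} for i, p in enumerate(prices)]


def summarize(rows, col):
    """Steps 1-3: row count, missing count, mean and sample variance of a column."""
    observed = [r[col] for r in rows if r[col] is not None]
    return {
        "n": len(rows),
        "missing": len(rows) - len(observed),
        "mean": statistics.mean(observed),
        "variance": statistics.variance(observed),
    }


def impute_median(rows, col):
    """Step 3(a): fill missing values with the median of the observed values."""
    median = statistics.median([r[col] for r in rows if r[col] is not None])
    return [{**r, col: median if r[col] is None else r[col]} for r in rows]


def log_transform(rows, col):
    """Step 3(b): log(1 + x) to compress a right-skewed, high-variance column."""
    return [{**r, col: math.log1p(r[col])} for r in rows]


def partition(rows, train_frac=0.55, seed=42):
    """Step 4: shuffle, then split into training and validation (no test set)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(train_frac * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


rows = make_toy_rows()
before = summarize(rows, "price_usd")

imputed = impute_median(rows, "price_usd")
after_impute = summarize(imputed, "price_usd")

transformed = log_transform(imputed, "price_usd")
after_log = summarize(transformed, "price_usd")

train, valid = partition(transformed)

print("missing before / after imputation:", before["missing"], after_impute["missing"])
print("variance before / after log:", after_impute["variance"], after_log["variance"])
print("train / validation sizes:", len(train), len(valid))
```

The median is used here instead of the mean because it is less sensitive to the extreme values that also cause the high variance. In the report, the corresponding Enterprise Miner settings (imputation method, transformation, partition percentages) would be described instead of code.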
