Question: Please provide the necessary code.
We will be using the Forest Fires dataset. The CSV file, forestfires.csv, is available in the Google Drive datasets folder. Information about the dataset is available on the UCI Machine Learning Repository, where you can see what each of the columns represents.
Exercise 1:
Perform some exploratory data analysis on the dataset. Which variables are the most important when determining the target variable, area? Are there any data points you would remove as outliers?
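One possible starting point for the exploratory analysis is sketched below. It ranks the numeric columns by the absolute value of their correlation with area and flags unusually large fires with the common 1.5 × IQR rule. The function name summarize_area and the IQR threshold are assumptions made for illustration, not something the assignment prescribes.

```python
import pandas as pd


def summarize_area(df, target="area"):
    """Rank numeric columns by |correlation| with the target and flag high outliers.

    The 1.5 * IQR rule used here is one common convention; other cut-offs are
    equally defensible for a heavily skewed target such as burned area.
    """
    numeric = df.select_dtypes(include="number")
    corr = numeric.corr()[target].drop(target)
    # Order by absolute correlation, strongest first.
    corr = corr.reindex(corr.abs().sort_values(ascending=False).index)

    q1, q3 = df[target].quantile([0.25, 0.75])
    upper = q3 + 1.5 * (q3 - q1)
    outliers = df[df[target] > upper]
    return corr, outliers
```

With the real data this might be called as corr, outliers = summarize_area(pd.read_csv("forestfires.csv")), where the path is an assumption about where the file is stored. Linear correlation is only a first look; histograms and scatter plots of each variable against area would complement it.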
Exercise 2:
Does there appear to be any relationship between month or day of week and area? Should you keep those variables? Keep in mind that these categorical variables must become binary.
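A minimal sketch for this exercise, assuming the column names month, day, and area from the UCI description: the first helper summarizes area per category level, and the second converts the categorical columns into binary indicator columns with pandas.get_dummies. The helper names are hypothetical.

```python
import pandas as pd


def area_by_category(df, column, target="area"):
    """Count, mean, and median of the target for each level of a categorical column."""
    return df.groupby(column)[target].agg(["count", "mean", "median"])


def one_hot(df, columns=("month", "day")):
    """Replace categorical columns with binary indicator columns.

    drop_first=True removes one level per variable so the indicators are not
    perfectly collinear with the intercept of a linear model.
    """
    return pd.get_dummies(df, columns=list(columns), drop_first=True)
```

Comparing area_by_category(df, "month") with area_by_category(df, "day") gives a quick sense of whether either variable separates the target. Levels with very few rows deserve caution, since their means can be driven by a single large fire.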
Exercise 3:
For the rest of the assignment, drop the X, Y, month, and day columns from your dataset. Create a linear regression model with all remaining variables. How does it perform?
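The step above could be sketched roughly as follows with scikit-learn. The 75/25 train/test split, the random seed, and the choice of R² and RMSE as metrics are assumptions for illustration; the assignment does not say how performance should be measured.

```python
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split


def fit_linear_regression(df, target="area", drop=("X", "Y", "month", "day"), seed=0):
    """Fit ordinary least squares on every column except the dropped ones and the target."""
    features = df.drop(columns=list(drop) + [target], errors="ignore")
    X_train, X_test, y_train, y_test = train_test_split(
        features, df[target], test_size=0.25, random_state=seed
    )
    model = LinearRegression().fit(X_train, y_train)
    pred = model.predict(X_test)
    metrics = {
        "r2": r2_score(y_test, pred),
        "rmse": mean_squared_error(y_test, pred) ** 0.5,
    }
    return model, metrics
```

Calling model, metrics = fit_linear_regression(df) on the loaded DataFrame returns the fitted model and its held-out scores. An R² near or below zero on the test set would indicate that the linear model explains little of the variation in area.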
Exercise 4:
Based upon your analysis earlier, create a linear regression model on only the most salient features.
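Which features count as most salient depends on the earlier analysis, so the sketch below takes the feature list as a parameter rather than fixing one. It scores a linear regression on any chosen subset with 5-fold cross-validated R², which makes it easy to compare the reduced model against the full one. The function name and the fold count are assumptions.

```python
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score


def cv_r2_for_features(df, features, target="area", folds=5):
    """Mean cross-validated R^2 of a linear regression restricted to `features`."""
    scores = cross_val_score(
        LinearRegression(), df[list(features)], df[target], cv=folds, scoring="r2"
    )
    return scores.mean()
```

For example, cv_r2_for_features(df, ["temp", "RH", "wind"]) would score a three-variable model; that particular list is only a placeholder, not a claim about which variables matter most in this dataset.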
Exercise 5:
Create a LASSO regression model using all relevant variables. Choose an alpha such that at most [...] features have nonzero coefficients. How do these features compare to those found for exercise [...]? How does the model perform?
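One way to pick alpha is to scan a grid from small to large and stop at the first value for which the number of nonzero coefficients is within the cap. The sketch below does that; the cap is passed in as max_features because the exact number is not given here, and the alpha grid and the standardization step are assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler


def lasso_with_feature_cap(X, y, max_features, alphas=None):
    """Return the smallest alpha on the grid that keeps at most `max_features` coefficients.

    Features are standardized first so that the L1 penalty treats them on a
    comparable scale.
    """
    X_scaled = StandardScaler().fit_transform(X)
    if alphas is None:
        alphas = np.logspace(-3, 2, 50)  # assumed search grid
    for alpha in sorted(alphas):
        model = Lasso(alpha=alpha, max_iter=10000).fit(X_scaled, y)
        if np.count_nonzero(model.coef_) <= max_features:
            return alpha, model
    raise ValueError("No alpha on the grid satisfies the feature cap.")
```

The surviving features are those with nonzero entries in model.coef_, in the same column order as X. Note that the returned model was fitted on standardized inputs, so new data should be scaled the same way before calling predict.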
Exercise 6:
Since the vast majority of the data has an area of 0, we may turn this into a classification problem. Turn the target variable into a binary variable, such that it is either 0 or more than 0. Create at least two different linear classification models. Which one performs better? How does their performance compare with the regression models you created above?
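A minimal sketch of the classification step, assuming logistic regression and a linear support vector classifier as the two linear models (the assignment does not name specific ones). The target is binarized as area > 0 and both models are compared by 5-fold cross-validated accuracy.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC


def compare_linear_classifiers(X, area, folds=5):
    """Cross-validated accuracy of two linear classifiers on the target area > 0."""
    y = (area > 0).astype(int)  # 1 if any area burned, else 0
    models = {
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "linear_svc": make_pipeline(StandardScaler(), LinearSVC(dual=False)),
    }
    return {
        name: cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean()
        for name, model in models.items()
    }
```

Accuracy and R² measure different things, so the comparison with the regression models is qualitative. It also helps to compare each accuracy against the share of the majority class, since a classifier that always predicts the more common label sets the baseline.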
