Question:

This is a real data set that was used in Leo Breiman and Jerome H. Friedman (1985), "Estimating optimal transformations for multiple regression and correlation," JASA, 80, pp. 580-598. The problem is to predict the daily maximum one-hour-average ozone reading in Los Angeles. The original data set is available in R in the mlbench package under the name "Ozone". There are 12 predictor variables in the original data set. I have removed the categorical variables and the entries with missing information, which leaves you with 9 predictor variables and 203 observations.

Y: Daily maximum one-hour-average ozone reading
X1: 500 millibar pressure height (m) measured at Vandenberg AFB
X2: Wind speed (mph) at Los Angeles International Airport (LAX)
X3: Humidity (%) at LAX
X4: Temperature (degrees F) measured at Sandburg, CA
X5: Temperature (degrees F) measured at El Monte, CA
X6: Inversion base height (feet) at LAX
X7: Pressure gradient (mm Hg) from LAX to Daggett, CA
X8: Inversion base temperature (degrees F) at LAX
X9: Visibility (miles) measured at LAX

You are to come up with the best model for predicting the daily maximum one-hour-average ozone reading in Los Angeles, using any combination of the tools you have learned in this class so far. Consider the use of quadratic terms, but nothing of higher order. It is natural that different people may come up with different, almost equally useful models. You will be assessed not on an ultimate truth but on how you approach the problem, how correctly you implement the covered methods, and how you justify your choices.
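One possible way to approach the comparison between a purely linear model and one with quadratic terms is sketched below in plain Python (the same idea carries over to R). This is only an illustration under stated assumptions: the data are synthetic placeholders with the same shape as the reduced data set (203 rows, 9 predictors), not the real Ozone readings, and the helper names `design`, `ols`, and `cv_mse` are hypothetical. Cross-validated mean squared error is used here as one reasonable selection criterion; the course may have covered others.

```python
import random

def design(rows, squares=False, interactions=False):
    """Build design-matrix rows: intercept, linear terms, optional quadratic terms."""
    out = []
    for r in rows:
        feats = [1.0] + list(r)
        if squares:
            feats += [v * v for v in r]
        if interactions:
            feats += [r[i] * r[j] for i in range(len(r)) for j in range(i + 1, len(r))]
        out.append(feats)
    return out

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (A assumed nonsingular)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ols(X, y):
    """Ordinary least squares via the normal equations X'X b = X'y."""
    p = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)

def predict(X, beta):
    return [sum(b * v for b, v in zip(beta, row)) for row in X]

def cv_mse(X, y, k=5, seed=0):
    """k-fold cross-validated mean squared prediction error."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    sse = 0.0
    for fold in (idx[i::k] for i in range(k)):
        held = set(fold)
        train = [i for i in idx if i not in held]
        beta = ols([X[i] for i in train], [y[i] for i in train])
        preds = predict([X[i] for i in fold], beta)
        sse += sum((y[i] - p) ** 2 for i, p in zip(fold, preds))
    return sse / len(y)

# Synthetic placeholder data (NOT the Ozone data): 203 rows, 9 predictors,
# with a response that truly depends on a squared term.
rng = random.Random(1)
rows = [[rng.gauss(0, 1) for _ in range(9)] for _ in range(203)]
y = [1 + 2 * r[0] + 3 * r[0] ** 2 - r[3] + rng.gauss(0, 0.5) for r in rows]

mse_lin = cv_mse(design(rows), y)
mse_quad = cv_mse(design(rows, squares=True), y)
print("CV MSE, linear terms only:     ", round(mse_lin, 3))
print("CV MSE, linear + squared terms:", round(mse_quad, 3))
```

With 9 predictors, adding the squares gives 19 columns including the intercept, and adding all pairwise products as well gives 55. Against 203 observations, the full quadratic model is large enough that some form of variable selection or validation is worth considering before settling on it.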
