Question: Lab 9B: Multiple Regression

Lab 9B: Multiple Regression

1. Start by importing the data in the file 'Lab9UStemp.csv' into Rcmdr. These data comprise the latitude, longitude, and January temperature for 56 U.S. cities. To make life simple, the variable LONG is the longitude rescaled to lie between 0 (Portland OR, the westernmost of the cities) and 1 (Portland ME, the easternmost). Similarly, LAT is the latitude rescaled from 0 (Key West FL) to 1 (Seattle WA). Thus a scatterplot of LAT versus LONG lets you visualize the geographic locations of these cities. Try this, and show the plot to the TA, who will initial your answer sheet.

2. The JANTEMP variable is defined as an average of the minimum daily temperatures occurring in January over the course of 30 consecutive years, i.e., an average of the 30 × 31 daily minimum temperatures. Fit a multiple linear regression model with JANTEMP as the response variable and LAT and LONG as the explanatory variables (Statistics → Fit models → Linear regression...). On the hand-in sheet, record the estimated coefficient for LAT, and give an interpretation for this estimate in the context of the physical problem being considered.

3. Now, let's focus on the residuals of our regression fit. The residual for the i-th city is the difference between the city's actual outcome and what the regression model predicts, i.e., $e_i = \mathrm{JANTEMP}_i - (\hat\beta_0 + \hat\beta_1 \mathrm{LAT}_i + \hat\beta_2 \mathrm{LONG}_i)$. If you like, the residuals $e_1, \ldots, e_n$ are estimates of the error terms, so that if the regression model is a good description of what is going on, the residuals should behave like 'random noise,' and we shouldn't see systematic patterns in the relationship between the residuals and aspects of the explanatory variables. You can get some feeling for the residuals via the default residual plots (Models → Graphs → Basic diagnostic plots), but don't worry too much about these. What I'd like you to focus on are two scatterplots: the residuals versus LAT and the residuals versus LONG. To get these in Rcmdr, you first need to add the residuals to the dataset (Models → Add observation statistics to data...). Then you can use the regular scatterplot procedure. On the hand-in sheet, identify something from one of these two scatterplots that suggests this regression model is not a good description of what is going on, again making reference to the physical geography involved in the problem.

4. Now, please fit a different regression model for JANTEMP, this time with LAT, LONG, LONG^2 and LONG^3 as explanatory variables. To do this, you will have to add the squared and cubed longitude as variables in the dataset (e.g., for the squared term, Data → Manage variables in active data set → Compute new variable, and use LONG^2 as the 'expression to compute'). For this new model, again look at scatterplots of the residuals against LAT and LONG. On the hand-in sheet, comment briefly on whether/how these plots suggest a better-fitting model than before, and again whether this seems to make sense geographically.

5. As a final task, if we locate Vancouver BC on our coordinate system, we find ourselves sitting at (LAT=1.050, LONG=-0.0016). Similarly, Victoria BC sits at (LAT=1.014, LONG=-0.0026). On the hand-in sheet, give an estimate of typically how much milder Victoria is than Vancouver, in terms of minimum daily temperature in January. (In doing this, you are extrapolating beyond the range of the observed data, which can be dangerous. Fortunately, though, in this case you aren't extrapolating very far.)
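For reference, the two models in tasks 2 and 4 and the comparison in task 5 can be written out as below. This is only a sketch: the coefficient symbols $\beta_3$ and $\beta_4$ are notation introduced here, and using the task-4 model for task 5 is an assumption, since the lab sheet does not say which fitted model to use. Because both cities share the same intercept, it cancels in the difference.

```latex
% Task 2: linear in latitude and longitude
\mathrm{JANTEMP}_i = \beta_0 + \beta_1 \mathrm{LAT}_i + \beta_2 \mathrm{LONG}_i + \varepsilon_i

% Task 4: cubic polynomial in longitude
\mathrm{JANTEMP}_i = \beta_0 + \beta_1 \mathrm{LAT}_i + \beta_2 \mathrm{LONG}_i
                   + \beta_3 \mathrm{LONG}_i^2 + \beta_4 \mathrm{LONG}_i^3 + \varepsilon_i

% Task 5: predicted Victoria minus predicted Vancouver (intercept cancels)
\widehat{\mathrm{JANTEMP}}_{\mathrm{Vic}} - \widehat{\mathrm{JANTEMP}}_{\mathrm{Van}}
  = \hat\beta_1 (1.014 - 1.050)
  + \hat\beta_2 \bigl(-0.0026 - (-0.0016)\bigr)
  + \hat\beta_3 \bigl((-0.0026)^2 - (-0.0016)^2\bigr)
  + \hat\beta_4 \bigl((-0.0026)^3 - (-0.0016)^3\bigr)
  = -0.036\,\hat\beta_1 - 0.0010\,\hat\beta_2
  + 4.2 \times 10^{-6}\,\hat\beta_3 - 1.348 \times 10^{-8}\,\hat\beta_4
```

If the task-2 model were used instead, only the first two terms would remain. Either way the LAT term carries almost all of the difference, because the two cities are nearly at the same rescaled longitude.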
CITY            JANTEMP  LAT   LONG
MobileAL          6.7    0.27  0.66
MontgomeryAL      3.3    0.34  0.69
PhoenixAZ         1.7    0.37  0.2
LittleRockAR     -0.6    0.45  0.58
LosAngelesCA      8.3    0.4   0.09
SanFranciscoCA    5.6    0.58  0
DenverCO         -9.4    0.68  0.34
NewHavenCT       -5.6    0.72  0.94
WilmingtonDE     -3.3    0.67  0.89
WashingtonDC     -1.1    0.64  0.87
JacksonvilleFL    7.2    0.26  0.78
KeyWestFL        18.3    0     0.78
MiamiFL          14.4    0.06  0.81
AtlantaGA         2.8    0.39  0.72
BoiseID          -5.6    0.81  0.12
ChicagoIL        -7.2    0.75  0.67
IndianapolisIN   -6.1    0.64  0.69
DesMoinesIA     -11.7    0.73  0.56
WichitaKS        -5.6    0.57  0.49
LouisvilleKY     -2.8    0.61  0.7
NewOrleansLA      7.2    0.25  0.63
PortlandME      -11.1    0.83  1
BaltimoreMD      -3.9    0.64  0.87
BostonMA         -5      0.77  0.98
DetroitMI        -6.1    0.78  0.75
MinneapolisMN   -16.7    0.9   0.56
StLouisMO        -4.4    0.62  0.62
HelenaMT        -13.3    0.96  0.2
OmahaNE         -10.6    0.73  0.51
ConcordNH       -11.7    0.8   0.97
AtlanticCityNJ   -2.8    0.64  0.91
AlbuquerqueNM    -4.4    0.44  0.31
AlbanyNY        -10      0.76  0.94
NewYorkNY        -2.8    0.68  0.92
CharlotteNC       1.1    0.47  0.79
RaleighNC        -0.6    0.49  0.84
BismarckND      -17.8    0.96  0.42
CincinnatiOH     -3.3    0.61  0.72
ClevelandOH      -6.1    0.75  0.77
OklahomaCityOK   -2.2    0.47  0.49
PortlandOR        0.6    0.89  0
HarrisburgPA     -4.4    0.69  0.86
PhiladelphiaPA   -4.4    0.69  0.91
CharlestonSC      3.3    0.36  0.8
NashvilleTN      -0.6    0.51  0.68
AmarilloTX       -4.4    0.46  0.4
GalvestonTX       9.4    0.19  0.53
HoustonTX         6.7    0.22  0.52
SaltLakeCityUT   -7.8    0.7   0.21
BurlingtonVT    -13.9    0.87  0.94
NorfolkVA         0      0.52  0.88
SeattleWA         0.6    1     0.01
SpokaneWA        -7.2    1     0.1
MadisonWI       -12.8    0.8   0.63
MilwaukeeWI     -10.6    0.79  0.67
CheyenneWY      -10      0.7   0.35
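Outside Rcmdr, the fit in task 2 and the residuals in task 3 could be sketched in plain Python as follows. This is an illustration under stated assumptions, not the lab's method: it uses only ten of the 56 cities (copied from the table), so its coefficients will not match what Rcmdr reports on the full file, and the names `cities`, `solve`, and `ols` are hypothetical helpers introduced here. The fit solves the normal equations $(X^\top X)\,b = X^\top y$ directly.

```python
# Ten rows (JANTEMP, LAT, LONG) copied from the table; a subset for illustration only.
cities = {
    "KeyWestFL": (18.3, 0.00, 0.78),
    "MiamiFL": (14.4, 0.06, 0.81),
    "HoustonTX": (6.7, 0.22, 0.52),
    "LosAngelesCA": (8.3, 0.40, 0.09),
    "DenverCO": (-9.4, 0.68, 0.34),
    "NewYorkNY": (-2.8, 0.68, 0.92),
    "ChicagoIL": (-7.2, 0.75, 0.67),
    "PortlandME": (-11.1, 0.83, 1.00),
    "BismarckND": (-17.8, 0.96, 0.42),
    "SeattleWA": (0.6, 1.00, 0.01),
}


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)


y = [v[0] for v in cities.values()]
X = [[1.0, v[1], v[2]] for v in cities.values()]  # columns: intercept, LAT, LONG
beta = ols(X, y)

# Residual = observed JANTEMP minus the value predicted by the fitted model.
fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

print("coefficients (intercept, LAT, LONG):", [round(b, 2) for b in beta])
for name, e in zip(cities, residuals):
    print(f"{name:13s} residual = {e:7.2f}")
```

For the task-4 model, the same `ols` call would work with two extra columns, `LONG**2` and `LONG**3`, appended to each row of `X`, although a five-coefficient fit would need the full 56-city file to be meaningful. Plotting `residuals` against the LONG column then gives the scatterplot the lab asks about.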
