Question 5: Regression and Time Series (25 marks)

In this question you will be working with the dataset longley, which can be viewed in R by running the code data(longley). The objective is to predict the number of people employed from economic variables. You'll need to conduct most of your analysis in R and produce screenshots of the output together with suitable comments. The dataset has several numerical columns: GNP.deflator, GNP, Unemployed, Armed.Forces, Population, Year, Employed. The response variable is Employed.

(i) Write down the formula for the correlation coefficient of two random variables X and Y, and use R to find the pairwise correlations between the variables of longley. (5 marks)

(ii) Create a multiple regression model regressing Employed on all available variables of the dataset. Produce a summary and ANOVA of the regression and comment on the significance of the parameter estimates. (5 marks)

(iii) Hence, using the above, produce a reduced model by selecting the most relevant explanatory variables, and write the equation relating Employed to the relevant explanatory variables. (5 marks)

(iv) State the assumptions underlying linear regression models. Produce the diagnostic plots for your reduced model found in (iii) and comment on the validity of the model. (5 marks)

(v) Produce a time series plot of Employed against years and produce the autocorrelation plot or values
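
As a starting point for the R work in parts (i) to (v), here is a minimal sketch rather than a model answer. For part (i), the usual (Pearson) correlation coefficient is rho(X, Y) = Cov(X, Y) / (sd(X) * sd(Y)), which is what cor() computes by default. The object names (cor_matrix, full_model, reduced_model, employed_ts) are illustrative, and the three predictors kept in the reduced model are an assumption for demonstration only: choose yours from the p-values in your own summary output.

```r
# Load the built-in Longley data set (16 yearly observations, 7 columns).
data(longley)

# (i) Pairwise Pearson correlations between all columns.
cor_matrix <- cor(longley)
print(round(cor_matrix, 3))

# (ii) Full model: Employed regressed on every other column.
full_model <- lm(Employed ~ ., data = longley)
print(summary(full_model))  # coefficient estimates, t-tests, R-squared
print(anova(full_model))    # sequential (Type I) ANOVA table

# (iii) Reduced model. ASSUMPTION: these three predictors are kept purely
# for illustration; pick yours from the significance tests in step (ii).
reduced_model <- lm(Employed ~ Unemployed + Armed.Forces + Year,
                    data = longley)
print(coef(reduced_model))  # coefficients for writing out the equation

# (iv) The four standard diagnostic plots on a 2 x 2 grid.
par(mfrow = c(2, 2))
plot(reduced_model)
par(mfrow = c(1, 1))

# (v) Time series plot of Employed by year, then the autocorrelations.
employed_ts <- ts(longley$Employed, start = min(longley$Year), frequency = 1)
plot(employed_ts, xlab = "Year", ylab = "Employed")
acf(employed_ts)                      # autocorrelation plot
print(acf(employed_ts, plot = FALSE)) # autocorrelation values
```

The fitted equation for part (iii) takes the form Employed = b0 + b1 * Unemployed + b2 * Armed.Forces + b3 * Year, with the b values read from coef(reduced_model). Note that anova() on a single lm fit gives sequential sums of squares, so the table depends on the order of the terms.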
