Question:
Q1: (30 points) Consider the dataset 'stackloss', which is embedded in R. You may load it by typing 'stackloss' in the R console. The dataset has 4 variables:

1. Air Flow - Flow of cooling air
2. Water Temp - Cooling water inlet temperature
3. Acid Conc. - Concentration of acid [per 1000, minus 500]
4. stack.loss - Stack loss

The variables are described in the R help pages as follows: The data were obtained from 21 days of operation of a plant for the oxidation of ammonia (NH3) to nitric acid (HNO3). The nitric oxides produced are absorbed in a countercurrent absorption tower. Air Flow represents the rate of operation of the plant. Water Temp is the temperature of cooling water circulated through coils in the absorption tower. Acid Conc. is the concentration of the acid circulating, minus 50, times 10: that is, 89 corresponds to 58.9 per cent acid. stack.loss (the dependent variable) is 10 times the percentage of the ingoing ammonia to the plant that escapes from the absorption column unabsorbed; that is, an (inverse) measure of the overall efficiency of the plant.

Based on this dataset, perform an analysis in a statistical programming language of your choice and complete the following tasks:

1. Derive the basic descriptive statistics of the independent and dependent variables: the mean, median, quantiles, and variance. Comment on the results.
2. Perform a linear regression (OLS) on the dataset with stack.loss as the dependent variable and the other three as independent variables. Analyse the results by commenting on the magnitude of the parameters, their sign, p-values, the F-test and R2.
3. In the OLS, would you include an intercept? Why or why not?
Step by Step Solution
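One possible way to approach tasks 1 and 2 is sketched below in Python, using only the standard library. This is an illustrative sketch, not the official solution. It rests on several assumptions: the data values are transcribed from R's built-in `stackloss` data frame (they do not appear in the question text), the quartiles use the "inclusive" method (which should correspond to R's default quantile type), and the helper names `describe`, `invert` and `ols` are made up for this example. The regression is solved through the normal equations, which is adequate for a dataset this small but is not what a production statistics package would typically do.

```python
import statistics as st

# ASSUMPTION: values transcribed from R's built-in `stackloss` data frame
# (21 daily observations); they are not listed in the question itself.
data = {
    "Air.Flow":   [80, 80, 75, 62, 62, 62, 62, 62, 58, 58, 58, 58, 58, 58, 50, 50, 50, 50, 50, 56, 70],
    "Water.Temp": [27, 27, 25, 24, 22, 23, 24, 24, 23, 18, 18, 17, 18, 19, 18, 18, 19, 19, 20, 20, 20],
    "Acid.Conc.": [89, 88, 90, 87, 87, 87, 93, 93, 87, 80, 89, 88, 82, 93, 89, 86, 72, 79, 80, 82, 91],
    "stack.loss": [42, 37, 37, 28, 18, 18, 19, 20, 15, 14, 14, 13, 11, 12, 8, 7, 8, 8, 9, 15, 15],
}


def describe(values):
    """Mean, median, quartiles and sample variance of one variable."""
    q1, _, q3 = st.quantiles(values, n=4, method="inclusive")
    return {"mean": st.mean(values), "median": st.median(values),
            "q1": q1, "q3": q3, "variance": st.variance(values)}


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols(predictors, y):
    """OLS with an intercept, solved via the normal equations (X'X) b = X'y."""
    rows = [[1.0, *map(float, obs)] for obs in zip(*predictors)]
    n, k = len(rows), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    inv = invert(xtx)
    coef = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    fitted = [sum(b * v for b, v in zip(coef, r)) for r in rows]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # residual sum of squares
    y_bar = st.mean(y)
    tss = sum((yi - y_bar) ** 2 for yi in y)                # total sum of squares
    sigma2 = rss / (n - k)                                  # residual variance
    std_err = [(sigma2 * inv[i][i]) ** 0.5 for i in range(k)]
    return {
        "coef": coef,
        "std_err": std_err,
        "t": [b / s for b, s in zip(coef, std_err)],
        "r2": 1 - rss / tss,
        "f_stat": ((tss - rss) / (k - 1)) / sigma2,
        "df": (k - 1, n - k),
    }


# Task 1: descriptive statistics for every variable.
summary = {name: describe(values) for name, values in data.items()}
for name, stats in summary.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})

# Task 2: regress stack.loss on the three other variables.
names = ["Air.Flow", "Water.Temp", "Acid.Conc."]
fit = ols([data[name] for name in names], data["stack.loss"])
for label, b, se, t in zip(["(Intercept)"] + names, fit["coef"], fit["std_err"], fit["t"]):
    print(f"{label:12s} coef={b:9.4f}  se={se:8.4f}  t={t:7.3f}")
print(f"R2={fit['r2']:.4f}  F={fit['f_stat']:.2f} on {fit['df']} degrees of freedom")
```

The sketch reports t-statistics and the F-statistic but not p-values, because the Student t and F distribution functions are not part of the Python standard library. As a rough guide, with 17 residual degrees of freedom a coefficient whose absolute t-statistic exceeds about 2.11 is significant at the 5% level (two-sided). The t-statistic of the intercept is also one piece of evidence to weigh when answering task 3.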
