Question: please answer all the question shows below. The file softdrinkdata.csv contains three columns. The first, time, is how long it took a person to refill

please answer all the question shows below.

please answer all the question shows below. The file softdrinkdata.csv contains three

The file softdrinkdata.csv contains three columns. The first, time, is how long it took a person to refill a vending machine, in minutes. The second column, cases, is the number of cases of soft drinks, and finally, distance is the distance in feet that the worker had to walk to carry the cases from the truck to the vending machine. We might expect that lots of cases and a large distance will imply a large time. Let's fit a model of the form (time) = a (cases) +b x (distance) +c+ (noise) (1) to the data? (a) [2 marks] Load the data using read.csv() to obtain a data frame called data. For simplicity, extract its individual columns as vectors: time = data$time cases = data$cases distance = data$distance (b) [3 marks) Write a function called resid() that takes potential values for the unknown quantities a, b, and c passed to function as a single named vector. The function should compute and return the sum of squared residuals. Within the scope of the function, treat the data vectors as global variables. The sum of squared residuals is: S=[(time); [a(cases)i + b(distance)i + c]]?. (2) (c) [5 marks] Use R's optim() function to find the values of a, b, and c that minimise S. Write a sentence or two describing the results and their interpretation in plain English. (d) [2 marks] Make a different function, called absresid(), that uses the sum of absolute residuals instead: S= |(time); [a(cases)i + b(distance); +c| (3) i=1 (e) [3 marks] Use R's optim() function to find the values of a, b, and c that minimise S'. In practical terms, is there much difference between this result and the result in (b)? N The file softdrinkdata.csv contains three columns. The first, time, is how long it took a person to refill a vending machine, in minutes. The second column, cases, is the number of cases of soft drinks, and finally, distance is the distance in feet that the worker had to walk to carry the cases from the truck to the vending machine. We might expect that lots of cases and a large distance will imply a large time. Let's fit a model of the form (time) = a (cases) +b x (distance) +c+ (noise) (1) to the data? (a) [2 marks] Load the data using read.csv() to obtain a data frame called data. For simplicity, extract its individual columns as vectors: time = data$time cases = data$cases distance = data$distance (b) [3 marks) Write a function called resid() that takes potential values for the unknown quantities a, b, and c passed to function as a single named vector. The function should compute and return the sum of squared residuals. Within the scope of the function, treat the data vectors as global variables. The sum of squared residuals is: S=[(time); [a(cases)i + b(distance)i + c]]?. (2) (c) [5 marks] Use R's optim() function to find the values of a, b, and c that minimise S. Write a sentence or two describing the results and their interpretation in plain English. (d) [2 marks] Make a different function, called absresid(), that uses the sum of absolute residuals instead: S= |(time); [a(cases)i + b(distance); +c| (3) i=1 (e) [3 marks] Use R's optim() function to find the values of a, b, and c that minimise S'. In practical terms, is there much difference between this result and the result in (b)? N

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!