Question: Given a dataset D= {(Ii, yi), i = 1, 2, ...,N} where both X; and yi are scalars. We define the line as y =

Given a dataset D= {(Ii, yi), i = 1, 2, ...,N}

where both X; and yi are scalars. We define the line as

Given a dataset D= {(Ii, yi), i = 1, 2, ...,N} where both X; and yi are scalars. We define the line as y = Bo + B,x. Thus, in order to find the best 'line', we only need to find the best Bo and B, that explain our data well. To calculate the best B1, which is denoted by B1, we use Bi 2 (zi - X)(1 ) (1) -X) where X = avg(X) = and Y = avg(Y) = N =1 Yi (2) N To calculate the best Bo, which is denoted by Bo, we use Bo = Y - X (3) Simple dataset: D = [(5, 43.1), (7.1, 32.1), (34.5, 40.3), (13, 39.3), (1.5, 47.7)] Xnew = [1, 2, 3, 4, 5) 1. Write a function called train to find the best Bo and B, by using the simple dataset D above . (15 pts) 2. Write a function called predict to make predictions on the Xnew above. (10 pts) 3. Write a main function to call the two functions defined above. Print out the parameters Bo and B, as well as the predictions (a vector new) for Xnew. (15 pts) 4. Remember to use the 'three-file structure to arrange all your codes. (10 pts) The second part of the code is as follows. Now, you are given a file 'data.csv'. (20 pts) 1. You should not change your predefined two functions. 2. You just need to update your main function to read data from file. 3. Then, you need to write the outputs (model parameters o and B1, as well as new) to a file output.txt' or 'output.csv'. The new is the predictions for the r in the test.csv' file. 4. More about the data.csv' file: the first column is the r and the second column is y. 5. More about the 'test.csv' file: there is only one column, which is the x values that needed to be predicted. Given a dataset D= {(Ii, yi), i = 1, 2, ...,N} where both X; and yi are scalars. We define the line as y = Bo + B,x. Thus, in order to find the best 'line', we only need to find the best Bo and B, that explain our data well. To calculate the best B1, which is denoted by B1, we use Bi 2 (zi - X)(1 ) (1) -X) where X = avg(X) = and Y = avg(Y) = N =1 Yi (2) N To calculate the best Bo, which is denoted by Bo, we use Bo = Y - X (3) Simple dataset: D = [(5, 43.1), (7.1, 32.1), (34.5, 40.3), (13, 39.3), (1.5, 47.7)] Xnew = [1, 2, 3, 4, 5) 1. Write a function called train to find the best Bo and B, by using the simple dataset D above . (15 pts) 2. Write a function called predict to make predictions on the Xnew above. (10 pts) 3. Write a main function to call the two functions defined above. Print out the parameters Bo and B, as well as the predictions (a vector new) for Xnew. (15 pts) 4. Remember to use the 'three-file structure to arrange all your codes. (10 pts) The second part of the code is as follows. Now, you are given a file 'data.csv'. (20 pts) 1. You should not change your predefined two functions. 2. You just need to update your main function to read data from file. 3. Then, you need to write the outputs (model parameters o and B1, as well as new) to a file output.txt' or 'output.csv'. The new is the predictions for the r in the test.csv' file. 4. More about the data.csv' file: the first column is the r and the second column is y. 5. More about the 'test.csv' file: there is only one column, which is the x values that needed to be predicted

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer

Step: 1 Unlock blur-text-image

Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock

Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!

Linear regression coding problem. USE C++ Given a dataset D= {(Ii, yi), i = 1, 2, ...,N} where both X; and yi are scalars. We define the line as y = Bo + B,x. Thus, in order to find the best 'line',...

PLEASE SOLVE THIS LINEAR REGRESSION IN C++ This is a class on algorithm, and hence we should not bother with deriving the equations. Here is a short summary on how to find the 'best' line given a...

Scatterplot of x and y N . + 2.0 2.5 3.0 3.5 4.0 4.5 5.0 X The scatterplot of two numerical variables x and y shows a generally linear relationship, on average, with a positive slope. A simple linear...

Hello! I need help solving the following question. I'll include a screenshot at the end and the text here in case it doesn't upload. 1.Consider fitting a simple linear regression model, Yi = beta0 +...

Please see attached mathematical statistics question below.All available and sufficient information is provided below: Please help solve part(e) How to find c1, c2 and obtain confidence interval (CI)...

Submitted to Management Science manuscript MS-0001-1922.65 Authors are encouraged to submit new papers to INFORMS journals by means of a style file template, which includes the journal title....

Q9some type is wrong, can refer to the attach file. In a study of the relationship between an explanatory variable Z and a response variable Y , n independent observations are taken in the form of...

,.kindly solve please 1. Consider the simple regression model: V/i = Po+ Piri +ui, for i = 1, ..., n, with E(ur,) 7 0 and let z be a dummy instrumental variable for a, such that we can write: with...

Part II: Production/Cost Function Consider a population of firms belonging to a certain industry. A firm with capital K, labor L, and fuel F produces output Y given by the Cobb-Douglas production...

The Harter-Heighway Dragon Curve is a particularly interesting curve drawn in the x-y plane using an iterative generation approach. In this assignment, you will write a Python program that will draw...

Your Store Manager has had a stressful month. It is the end of the financial year and sales revenue in the Timber and Plumbing & Electrical departments is down on targets, almost certainly due to...

Determine the maximum 30-year fixed-rate mortgage amount for which a couple could qualify if the rate is 4.5 percent. Assume they have other debt payments totaling $500 per month and a combined...

Requirement Determine the weighted - average number of common shares outstanding for the year.Number of Shares OutstandingWeight by Number of Months Shares Are OutstandingWeighted - Average Shares...

Ortiz Co, had the following account balances: Sales revenue $ 440.000 Cost of goods sold 220,000 Salaries and wages expense 30,000 Prepaid insurance expense 60,000 Dividend revenue 12,000 Utilities...

=+1 Who has responsibility for this situation? What is the nature of various stakeholders responsibilities? Headquarters HR? Japan HR? Fred Bailey? Jennifer Bailey? Dave Steiner, the company managing...

=+5 Does this case provide an example of the future for IHRM?

=+4 How did it affect HR?