1. Assume we have a training set and a test set as follows: Training set Index...
Fantastic news! We've Found the answer you've been seeking!
Question:
Transcribed Image Text:
1. Assume we have a training set and a test set as follows: Training set Index y Test set I Index I y 1 5.86 0.74 1 5.80 0.93 2 1.34 1.18 2 0.57 1.87 3 3.65 0.51 3 4.30 -0.06 4 4.69 -0.48 4 6.55 1.60 5 4.13 -0.07 5 0.82 1.22 6 5.87 0.37 6 3.72 0.90 7 7.91 1.35 7 5.80 0.93 8 5.57 0.30 8 3.26 1.53 9 7.30 1.64 9 6.75 1.73 10 7.89 1.75 10 4.77 -0.51 Let's try to find some linear models to fit the training data. We use the RSS objective function J(0) = ||f(x) - Y||- (a) Calculate the analytic solution of the linear model f(x) = 0 + 0₁x. Then, plot the line of your obtained linear model together with the data points in training set. (5%) (b) Suppose we want to increase the model complexity, by considering y as a linear function of both x and x²: f(x) = 0 + 0₁x+02x². In this case, calculate the analytic solution of the model and plot the curve of the model, together with the data points in training set. (5%) x(1) (hint: in this case, X = : x(1)2 θο and =₁ 1 x(10) x(10) 2 Page 1 of 5 (c) Let's further increase the model complexity, by assuming y is related to higher-order forms of x, ie., fe(x)=80+0₁×+ 0²x² +03x² + 84x4. Again, calculate the analytic solution of the model, plot the curve of the function, together with the data points in training set. (5%) 1 (hint: in this case, X = : x(1) (2 x(1)² X(1)³ X(1)⭑ 100\ 01 1 x (10) x(10) 2 x(10)4 and = 82) x(10) 83 04 (d) Observe the above three functions, please point out which could be faced with underfitting, which could be faced with overfitting, and which one is relatively a good one? Then, you can calculate the values of prediction error on the test data to verify your thoughts. (10%) 1. Assume we have a training set and a test set as follows: Training set Index y Test set I Index I y 1 5.86 0.74 1 5.80 0.93 2 1.34 1.18 2 0.57 1.87 3 3.65 0.51 3 4.30 -0.06 4 4.69 -0.48 4 6.55 1.60 5 4.13 -0.07 5 0.82 1.22 6 5.87 0.37 6 3.72 0.90 7 7.91 1.35 7 5.80 0.93 8 5.57 0.30 8 3.26 1.53 9 7.30 1.64 9 6.75 1.73 10 7.89 1.75 10 4.77 -0.51 Let's try to find some linear models to fit the training data. We use the RSS objective function J(0) = ||f(x) - Y||- (a) Calculate the analytic solution of the linear model f(x) = 0 + 0₁x. Then, plot the line of your obtained linear model together with the data points in training set. (5%) (b) Suppose we want to increase the model complexity, by considering y as a linear function of both x and x²: f(x) = 0 + 0₁x+02x². In this case, calculate the analytic solution of the model and plot the curve of the model, together with the data points in training set. (5%) x(1) (hint: in this case, X = : x(1)2 θο and =₁ 1 x(10) x(10) 2 Page 1 of 5 (c) Let's further increase the model complexity, by assuming y is related to higher-order forms of x, ie., fe(x)=80+0₁×+ 0²x² +03x² + 84x4. Again, calculate the analytic solution of the model, plot the curve of the function, together with the data points in training set. (5%) 1 (hint: in this case, X = : x(1) (2 x(1)² X(1)³ X(1)⭑ 100\ 01 1 x (10) x(10) 2 x(10)4 and = 82) x(10) 83 04 (d) Observe the above three functions, please point out which could be faced with underfitting, which could be faced with overfitting, and which one is relatively a good one? Then, you can calculate the values of prediction error on the test data to verify your thoughts. (10%)
Expert Answer:
Answer rating: 100% (QA)
import numpy as np Test set data Xtest nparray580 057 430 655 082 372 580 326 675 477 Ytest ... View the full answer
Related Book For
Operating Systems Internals and Design Principles
ISBN: 978-0133805918
8th edition
Authors: William Stallings
Posted Date:
Students also viewed these programming questions
-
You are asked to develop a Floppy Disk program that allows users to access a floppy disk locally mounted on a computer. You are expected to use C programming language. In your program, all file I/O...
-
A survey of information systems managers was used to predict the yearly salary of beginning programmer/analysts in a metropolitan area. Managers specified their standard salary for a beginning...
-
How can we use these theories to analyze factors which influence the longevity and adaptability of these organizations in changing landscapes?
-
Consider an organization you were/are currently involved with. After providing a short overview of your organization, reflect on the Seven Levels of an Ethical Organization, found in the Gebler...
-
Suppose that the input piston of a hydraulic jack has a diameter of 2 cm and the load piston a diameter of 25 cm. The jack is being used to lift a car with a mass of 1400 kg. a. What are the areas of...
-
In July 2017, Latrice Merritt entered a residential lease with Doran 610 Apartments, LLC. Under the terms of the lease agreement, Merritt was prohibited from installing a private security system in...
-
On December 31, 2016, Akron, Inc. purchased 5 Percent of Zip Company's common shares on the open market in exchange for $16,000. On December 31, 2017, Akron, Inc., acquires an additional 25 percent...
-
Is discord supported by a relational database, hierarchical database, NoSQL database, or something else?Explain
-
A developer has a rectangular 5-acre site with a commercial zoning designation allowing mini-storage, office, retail, or apartments. The land is under contract for purchase for $3.35 million and the...
-
Press Inc. reported the following information in its annual report for 2010. Cash flows from operating activities $295,000 Capital Expenditures 110,000 Proceeds from disposals of property, plant and...
-
Javier Martinez's (63) filing status is Single. His Form SSA-1099 is shown. In addition he received the following income: Taxable pension $19,217, wages from a part-time job $4,773, for a total of...
-
A rectangular box is h = 19.5 inches in high, w = 29.5 cm wide, and L = 1.3 yards long. What is the volume of the box in cubic feet? V=
-
Account Balance as of December 31 220210 Production Machinery, Equip & Fixtures (Dir.Post) 915,000 220310 Accumulated Depreciation-Machinery (Direct Post) 305,000 CSI Barcode system and installation...
-
Organizations often have two portfolios in the heavy construction machinery and mining equipment industry: an explore portfolio and an exploit portfolio. The explore portfolio focuses on innovative...
-
As the engineering manager, what steps would you take to collect relevant financial data and prepare the operational budget for your division? Explain how you would ensure accuracy and completeness ?
-
The National Health Insurance Scheme (NHIS) is financed primarily by tax revenue, and claims make up the bulk of its expenditures, The NHI levy provides seventy four percent (74%) of NHIS revenue,...
-
Based on the scenario described below, generate all possible association rules with values for confidence, support (for dependent), and lift. Submit your solutions in a Word document (name it...
-
List and briefly define four different clustering methods.
-
Why does Figure have two blocked states? ew Activate Dispatch Read Release Read Runnin Exit Sus Suspend Time-out Activate oc Blocked sus Suspend
-
Consider the following sequence of page references (each element in the sequence represents a page number): Define the mean working set size after the kth reference as And define the missing page...
-
When marginal cost equals the minimum point of average variable cost, what is true about the average productivity and marginal productivity of workers?
-
What determines the distance between the average total cost and the average variable cost?
-
If average productivity falls, will marginal cost necessarily rise? How about average cost?
Study smarter with the SolutionInn App