Question: A data set containing wages and other information for a group of 3000 male workers in the Mid-Atlantic region is provided in the input file wages.csv

Perform the following operations on this data using Python:

1. Load the data set from the input file wages.csv

2. Generate polynomial models of the age column from this data set up to degree 4 (i.e. create 4 polynomial models of degrees 1 to 4)

Hint: Use PolynomialFeatures().fit_transform to create these models

3. Perform linear regression as follows:

  • Perform linear regression using all four of these models
  • Fit each model on the wage column of the data set
  • Use cross-validation with cv=5 to compute the scores for the fit of each of these models
  • Note: There will be 5 scores (since cv=5) for the fit of each model
  • Compute the mean score for each model
  • Print the 4 mean scores to a file named output.csv
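
Steps 2 and 3 can be sketched roughly as follows. This is one possible reading of the task, not a reference solution: the helper name mean_cv_scores is made up for illustration, the data below is synthetic rather than the real wages.csv, and the score is whatever cross_val_score reports by default for a regressor (R^2).

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import PolynomialFeatures


def mean_cv_scores(age, wage, max_degree=4, cv=5):
    """Mean cross-validation score for polynomial degrees 1..max_degree.

    Hypothetical helper, not part of the original task statement.
    """
    # scikit-learn expects a 2-D feature matrix, so age becomes one column
    X = np.asarray(age, dtype=float).reshape(-1, 1)
    y = np.asarray(wage, dtype=float)
    means = []
    for degree in range(1, max_degree + 1):
        # Expand age into the columns [1, age, age^2, ..., age^degree]
        X_poly = PolynomialFeatures(degree=degree).fit_transform(X)
        # With cv=5 this returns 5 scores, one per fold
        scores = cross_val_score(LinearRegression(), X_poly, y, cv=cv)
        means.append(scores.mean())
    return means


# Synthetic stand-in data (NOT the real wages.csv), only to exercise the helper
rng = np.random.default_rng(0)
demo_age = rng.uniform(18, 80, size=200)
demo_wage = 50 + 3 * demo_age - 0.03 * demo_age**2 + rng.normal(0, 10, size=200)
demo_means = mean_cv_scores(demo_age, demo_wage)
print([round(m, 4) for m in demo_means])
```

With the real data, the age and wage columns would be passed in place of the synthetic arrays.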

Input Format:

Read data from a file named wages.csv present at the location res/wages.csv
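
A minimal loading sketch, assuming the file has a header row with columns named age and wage (the column names come from the task text; the rest of the file layout is an assumption). The function name load_age_and_wage is hypothetical, and the demo reads a tiny inline CSV instead of the real file.

```python
import io

import pandas as pd


def load_age_and_wage(source="res/wages.csv"):
    """Read the CSV and return (X, y): age as a 2-D array, wage as a 1-D array."""
    df = pd.read_csv(source)
    X = df[["age"]].to_numpy()  # double brackets keep a 2-D shape of (n, 1)
    y = df["wage"].to_numpy()
    return X, y


# Tiny inline CSV standing in for res/wages.csv (values are made up)
demo_csv = io.StringIO("age,wage\n25,70.5\n40,110.0\n58,95.2\n")
X_demo, y_demo = load_age_and_wage(demo_csv)
print(X_demo.shape, y_demo.shape)  # (3, 1) (3,)
```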

Output Format:

  • You have to write a file named output.csv at the location output/output.csv
  • This file should contain the mean scores of fitting the 4 models on 4 separate rows
  • The values of the mean scores need to be rounded to 4 decimal places and then printed, such as 0.2345
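
Writing the output could look something like the sketch below. The helper name write_mean_scores is hypothetical, the scores in the demo are made-up numbers, and the demo writes into a temporary directory rather than output/output.csv so it can run anywhere.

```python
import os
import tempfile


def write_mean_scores(mean_scores, path="output/output.csv"):
    """Write each mean score, rounded to 4 decimal places, on its own row."""
    directory = os.path.dirname(path)
    if directory:
        # Create the output folder if it does not exist yet
        os.makedirs(directory, exist_ok=True)
    with open(path, "w", newline="") as f:
        for score in mean_scores:
            f.write(f"{round(score, 4)}\n")


# Demo with made-up scores, written to a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "output", "output.csv")
    write_mean_scores([0.23451, 0.31337, 0.12349, 0.40016], demo_path)
    with open(demo_path) as f:
        demo_lines = f.read().splitlines()
print(demo_lines)  # ['0.2345', '0.3134', '0.1235', '0.4002']
```

Note that round() drops trailing zeros (0.5 is written as 0.5, not 0.5000). If a fixed four-digit format is expected, f"{score:.4f}" would be the alternative; the task text does not say which one the grader wants.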
