
Can you please explain the code for each step in detail? Thank you.

You will start by examining the data in the dataset.

To get the most out of this lab, read the instructions and code before you run the cells. Take time to experiment!

Start by importing the pandas package and setting some default display options.

import pandas as pd
pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

Next, load the dataset into a pandas DataFrame.

The data doesn't contain a header, so you will define the column names in a variable named col_names, set to the attributes listed in the dataset description.

url = "imports-85.csv"
col_names = ['symboling', 'normalized-losses', 'fuel-type', 'aspiration', 'num-of-doors', 'body-style', 'drive-wheels', 'engine-location', 'wheel-base',
             'length', 'width', 'height', 'curb-weight', 'engine-type', 'num-of-cylinders', 'engine-size',
             'fuel-system', 'bore', 'stroke', 'compression-ratio', 'horsepower', 'peak-rpm', 'city-mpg', 'highway-mpg', 'price']
df_car = pd.read_csv(url, sep=',', names=col_names, na_values="?", header=None)
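To make the read_csv arguments concrete, here is a minimal sketch that runs without the real file. The three-row sample and the names sample_csv, sample_cols, and df_sample are made up for illustration; only the read_csv arguments mirror the lab's call.

```python
import io
import pandas as pd

# Hypothetical three-row sample shaped like a headerless CSV;
# "?" marks a missing value, as in the lab's na_values argument.
sample_csv = io.StringIO(
    "3,?,gas,std,two\n"
    "1,164,gas,turbo,four\n"
    "2,?,diesel,std,?\n"
)
sample_cols = ["symboling", "normalized-losses", "fuel-type", "aspiration", "num-of-doors"]

df_sample = pd.read_csv(
    sample_csv,
    sep=",",             # fields are comma-separated
    names=sample_cols,   # supply the column names ourselves
    na_values="?",       # treat "?" as missing (NaN)
    header=None,         # the first line is data, not a header
)

print(df_sample.shape)
print(df_sample.isna().sum())
```

Without header=None, pandas would try to use the first data row as the header; without na_values="?", the question marks would stay as text and keep otherwise numeric columns from being parsed as numbers.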

First, to see the number of rows (instances) and columns (features), you will use shape.

df_car.shape

Next, examine the data by using the head method.

df_car.head(5)

There are 25 columns. Some of the columns have numerical values, but many of them contain text.

To display information about the columns, use the info method.

df_car.info()
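The three inspection calls above can be tried on a tiny stand-in frame. This is only a sketch: df_demo and its values are invented for illustration and are not part of the lab's dataset.

```python
import pandas as pd

# Hypothetical stand-in for the car data: one text column, one numeric column.
df_demo = pd.DataFrame({
    "aspiration": ["std", "turbo", "std"],
    "price": [13495.0, None, 16500.0],
})

n_rows, n_cols = df_demo.shape   # shape is an attribute (no parentheses)
print(n_rows, n_cols)

print(df_demo.head(2))           # first two rows only

df_demo.info()                   # column dtypes and non-null counts
```

In the info output, a non-null count lower than the number of rows is a quick sign that a column contains missing values.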

To make it easier to view the dataset when you start encoding, drop the columns that you won't use.

df_car.columns

df_car = df_car[['aspiration', 'num-of-doors', 'drive-wheels', 'num-of-cylinders']].copy()
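The double brackets and the copy() call are worth a closer look. The sketch below uses a made-up frame (df_full, df_subset) to show the pattern: the inner list names the columns to keep, and copy() gives an independent DataFrame, so columns added later do not touch the original. In some pandas versions, skipping copy() can also lead to a SettingWithCopyWarning when you assign new columns.

```python
import pandas as pd

# Hypothetical frame with one column we do not need.
df_full = pd.DataFrame({
    "aspiration": ["std", "turbo"],
    "num-of-doors": ["two", "four"],
    "price": [13495, 16500],
})

# Outer brackets index the DataFrame; the inner list names the columns to keep.
df_subset = df_full[["aspiration", "num-of-doors"]].copy()

# Adding a column to the copy leaves the original frame untouched.
df_subset["doors"] = [2, 4]

print(list(df_subset.columns))
print(list(df_full.columns))
```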

You now have four columns. These columns all contain text values.

df_car.head()

Most machine learning algorithms require inputs that are numerical values.

The num-of-cylinders and num-of-doors features have an ordinal value. You could convert the values of these features into their numerical counterparts.

However, aspiration and drive-wheels don't have an ordinal value. These features must be converted differently.

You will explore the ordinal features first.

In this step

Start by getting the new column types from the DataFrame:

df_car.info()

First, determine what values the ordinal columns contain.

Starting with the num-of-doors feature, you can use value_counts to discover the values.

df_car['num-of-doors'].value_counts()

This feature only has two values: four and two. You can create a simple mapper that contains a dictionary:

door_mapper = {"two": 2, "four": 4}

You can then use the replace method from pandas to generate a new numerical column based on the num-of-doors column.

df_car['doors'] = df_car["num-of-doors"].replace(door_mapper)

When you display the DataFrame, you should see the new column on the right. It contains a numerical representation of the number of doors.

df_car.head()
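As a side note, the same dictionary also works with Series.map, which is an alternative to replace and not what the lab uses. The sketch below uses a made-up frame (df_doors) with one missing value. The practical difference: replace leaves values that are not in the dictionary unchanged, while map turns them into NaN, which makes unmapped or missing entries easy to spot.

```python
import pandas as pd

# Hypothetical column with a missing entry, similar to the "?" rows in the data.
df_doors = pd.DataFrame({"num-of-doors": ["two", "four", None, "four"]})
door_mapper = {"two": 2, "four": 4}

# map() looks each value up in the dictionary; values without a key become NaN.
df_doors["doors"] = df_doors["num-of-doors"].map(door_mapper)

print(df_doors)
print(df_doors["doors"].isna().sum())   # number of rows that could not be mapped
```

Because of the NaN, the new column is stored as floating point rather than integer.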

Repeat the process with the num-of-cylinders column.

First, get the values.

df_car['num-of-cylinders'].value_counts()

Next, create the mapper.

cylinder_mapper = {"two": 2, "three": 3, "four": 4, "five": 5, "six": 6, "eight": 8, "twelve": 12}

Apply the mapper by using the replace method.

df_car['cylinders'] = df_car['num-of-cylinders'].replace(cylinder_mapper)
df_car.head()

For more information about the replace method, see pandas.DataFrame.replace in the pandas documentation.

In this step, you will encode non-ordinal data by using the get_dummies method from pandas.

The two remaining features are not ordinal.

According to the attribute description, the following values are possible:

aspiration: std, turbo.

drive-wheels: 4wd, fwd, rwd.

You might think that the correct strategy is to convert these values into numerical values. For example, consider the drive-wheels feature. You could use 4wd = 1, fwd = 2, and rwd = 3.

However, fwd isn't less than rwd.
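A small sketch of why the integer codes are a problem: once the categories are numbers, a model can compare and subtract them. The codes dictionary below is the hypothetical 1/2/3 assignment from the paragraph above, not something the lab actually creates.

```python
# Hypothetical integer codes for the three drive-wheels categories.
codes = {"4wd": 1, "fwd": 2, "rwd": 3}

# Distances implied by the codes; the categories themselves have no such distances.
dist_4wd_fwd = abs(codes["4wd"] - codes["fwd"])
dist_4wd_rwd = abs(codes["4wd"] - codes["rwd"])

print(dist_4wd_fwd, dist_4wd_rwd)  # prints "1 2"
```

With these codes, 4wd looks twice as far from rwd as from fwd, which is an artifact of the numbering rather than a property of the cars.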

These values don't have an order, but you just introduced an order to them by assigning these numerical values.

The correct strategy is to convert these values into binary features for each value in the original feature. This process is often called one-hot encoding in machine learning, or dummying in statistics.

pandas provides a get_dummies method, which converts the data into binary features. For more information, see pandas.get_dummies in the pandas documentation.

According to the attribute description, drive-wheels has three possible values.

df_car['drive-wheels'].value_counts()

Use the get_dummies method to add new binary features to the DataFrame.

df_car = pd.get_dummies(df_car, columns=['drive-wheels'])
df_car.head()

When you examine the dataset, you should see three new columns on the right:

drive-wheels_4wd
drive-wheels_fwd
drive-wheels_rwd
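To see what get_dummies produces, here is a minimal sketch on a made-up three-row frame (df_wheels). The dtype=int argument is an addition for readability and is not in the lab's call: without it, recent pandas versions return True/False columns instead of 1/0.

```python
import pandas as pd

# Hypothetical three-row frame with one row per drive-wheels category.
df_wheels = pd.DataFrame({
    "aspiration": ["std", "turbo", "std"],
    "drive-wheels": ["rwd", "fwd", "4wd"],
})

# columns=[...] limits the encoding to the named column and drops the original;
# each category becomes a new column named <column>_<value>.
df_encoded = pd.get_dummies(df_wheels, columns=["drive-wheels"], dtype=int)

print(list(df_encoded.columns))
print(df_encoded)
```

Each row has exactly one 1 across the three new columns, which is where the name one-hot comes from. The aspiration column is left as text because it was not listed in columns.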
