Question 1: Import data

The url for the data is: https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-ST0151EN-SkillsNetwork/labs/boston_housing.csv

1. Import the data and load it into a dataframe (5 points)
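One possible way to approach Question 1 is sketched below, assuming pandas is available. The helper name `load_housing` is hypothetical and introduced only for illustration; `pd.read_csv` accepts a URL, a local path, or a file-like object.

```python
import pandas as pd

# URL given in the question
URL = (
    "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/"
    "IBMDeveloperSkillsNetwork-ST0151EN-SkillsNetwork/labs/boston_housing.csv"
)


def load_housing(source=URL):
    """Read the CSV at `source` (URL, path, or file-like object) into a DataFrame.

    `load_housing` is a hypothetical helper name, not part of the assignment.
    """
    return pd.read_csv(source)
```

Calling `df = load_housing()` would download the file and return a DataFrame. If the first header cell in the CSV is empty, pandas names that column `Unnamed: 0`, which is the column Question 2 asks to drop.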

Question 2: View the data

1. Show the first 10 and last 10 rows of the data (5 points)

2. Give the summary table of the data (5 points)

3. Find the number of rows and columns of the dataframe (5 points)

4. Drop the Unnamed: 0 column (5 points)

5. Sort the table by crime rate in ascending order (5 points)

6. Find the number of houses near the Charles River (5 points)

7. Extract per capita crime rate by town and pupil-teacher ratio by town and construct a new dataframe (5 points)
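The seven parts of Question 2 could be sketched roughly as follows, assuming `df` is the DataFrame read from the CSV and that the column names match the dataset labels (`CRIM`, `CHAS`, `PTRATIO`). The function name `explore` and the dictionary keys are hypothetical, chosen only to keep the example compact. Each row is counted as one "house" for part 6, which is an interpretation of the question.

```python
import pandas as pd


def explore(df: pd.DataFrame) -> dict:
    """Collect the results asked for in Question 2 (hypothetical helper)."""
    out = {}
    out["head"] = df.head(10)        # part 1: first 10 rows
    out["tail"] = df.tail(10)        # part 1: last 10 rows
    out["summary"] = df.describe()   # part 2: summary table
    out["shape"] = df.shape          # part 3: (rows, columns)

    # part 4: drop the index-like column; ignore it if already absent
    cleaned = df.drop(columns=["Unnamed: 0"], errors="ignore")

    # part 5: sort by per capita crime rate, ascending
    out["sorted_by_crime"] = cleaned.sort_values("CRIM", ascending=True)

    # part 6: rows flagged as near the Charles River (CHAS == 1)
    out["near_river"] = int((cleaned["CHAS"] == 1).sum())

    # part 7: new dataframe with crime rate and pupil-teacher ratio
    out["crime_ptratio"] = cleaned[["CRIM", "PTRATIO"]].copy()
    return out
```

For example, `explore(df)["shape"]` gives the row and column counts before the `Unnamed: 0` column is dropped.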

Question 3: Generate Descriptive Statistics and Visualizations

1. Find the interquartile range of houses' weighted distances to five Boston employment centres (5 points)

2. For the 'Median value of owner-occupied homes' provide a boxplot (5 points)

3. Provide a histogram for the index of accessibility to radial highways variable and state what the histogram indicates (10 points)

4. Provide a scatter plot to show the relationship between Nitric oxide concentrations and the proportion of non-retail business acres per town. What can you say about the relationship? (10 points)

5. Discretize the age variable into three groups (35 years and younger, between 35 and 70 years, 70 years and older) to create a new column "Age_Group", then sort the counts of each group (10 points)

6. Create a pie chart for the Age_Group to show the distribution (5 points)

7. Split the data into two sets with fractions of 0.8 and 0.2, respectively, and show the number of rows in each set (5 points)

8. Plot the scatter matrix between average number of rooms per dwelling, weighted distances to five Boston employment centres and the median market value of the owner-occupied houses (5 points)
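A minimal sketch of Question 3 follows, assuming pandas and matplotlib are installed and `df` uses the column names from the dataset labels. All function names are hypothetical. Two choices here are assumptions rather than requirements of the question: values of exactly 35 fall in the youngest group and values of exactly 70 in the oldest, and the split uses a fixed `random_state` so the result is repeatable. The sketch also assumes `AGE` has no missing values and that the index is unique.

```python
import matplotlib.pyplot as plt
import pandas as pd

LABELS = ["35 years and younger", "between 35 and 70 years", "70 years and older"]


def iqr(series: pd.Series) -> float:
    """Part 1: interquartile range, 75th percentile minus 25th percentile."""
    return float(series.quantile(0.75) - series.quantile(0.25))


def add_age_group(df: pd.DataFrame) -> pd.DataFrame:
    """Part 5: add an Age_Group column (boundary handling is an assumption)."""
    out = df.copy()
    out["Age_Group"] = LABELS[1]                       # default: middle group
    out.loc[out["AGE"] <= 35, "Age_Group"] = LABELS[0]
    out.loc[out["AGE"] >= 70, "Age_Group"] = LABELS[2]
    return out


def age_group_counts(df: pd.DataFrame) -> pd.Series:
    """Part 5: number of rows per group, sorted ascending."""
    return df["Age_Group"].value_counts().sort_values()


def split_data(df: pd.DataFrame, frac: float = 0.8, seed: int = 0):
    """Part 7: random split; the remaining rows form the second set."""
    first = df.sample(frac=frac, random_state=seed)
    second = df.drop(first.index)
    return first, second


def make_plots(df: pd.DataFrame):
    """Parts 2, 3, 4, 6: boxplot, histogram, scatter plot, pie chart."""
    fig, axes = plt.subplots(2, 2, figsize=(10, 8))
    df["MEDV"].plot.box(ax=axes[0, 0], title="MEDV")
    df["RAD"].plot.hist(ax=axes[0, 1], title="RAD")
    df.plot.scatter(x="INDUS", y="NOX", ax=axes[1, 0], title="NOX vs INDUS")
    age_group_counts(df).plot.pie(ax=axes[1, 1], autopct="%1.1f%%")
    fig.tight_layout()
    return fig


def scatter_matrix_plot(df: pd.DataFrame):
    """Part 8: scatter matrix of RM, DIS and MEDV."""
    return pd.plotting.scatter_matrix(df[["RM", "DIS", "MEDV"]], figsize=(8, 8))
```

A typical sequence would be `df = add_age_group(df)`, then `iqr(df["DIS"])`, `age_group_counts(df)`, `make_plots(df)` and `scatter_matrix_plot(df)`, followed by `plt.show()` outside a notebook. For part 7, `len(first)` and `len(second)` give the size of each set. The interpretation of the histogram and scatter plot still has to come from looking at the actual figures.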

Reference: labels for the dataset

The following describes the dataset variables:

CRIM - per capita crime rate by town

ZN - proportion of residential land zoned for lots over 25,000 sq.ft.

INDUS - proportion of non-retail business acres per town.

CHAS - Charles River dummy variable (1 if near the river; 0 otherwise)

NOX - nitric oxides concentration (parts per 10 million)

RM - average number of rooms per dwelling

AGE - proportion of owner-occupied units built prior to 1940

DIS - weighted distances to five Boston employment centres

RAD - index of accessibility to radial highways (higher index indicates easier access to radial highways)

TAX - full-value property-tax rate per $10,000

PTRATIO - pupil-teacher ratio by town

LSTAT - % lower status of the population

MEDV - Median market value of owner-occupied houses in $1,000s
