Question 1: Import data
The URL for the data is: https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-ST0151EN-SkillsNetwork/labs/boston_housing.csv
1. Please import the data and load it into a dataframe (5 points)
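A minimal sketch of the import step, assuming pandas is available. The helper name load_boston is made up for this illustration; pd.read_csv accepts a URL directly, so no separate download step is needed.

```python
import pandas as pd

# URL taken verbatim from the question text
URL = (
    "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/"
    "IBMDeveloperSkillsNetwork-ST0151EN-SkillsNetwork/labs/boston_housing.csv"
)


def load_boston(source=URL):
    """Read the CSV from a URL, local path, or file-like object into a DataFrame.

    load_boston is a hypothetical helper name, not part of the assignment.
    """
    return pd.read_csv(source)


# Typical use (downloads the file, so it needs network access):
# boston_df = load_boston()
# print(type(boston_df))
```

Because the first header cell of the file is blank, pandas names that column "Unnamed: 0", which is the column Question 2 asks to drop.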
Question 2: View the data
1. Show the first 10 and last 10 rows of the data (5 points)
2. Give the summary table of the data (5 points)
3. Find the number of rows and columns of the dataframe (5 points)
4. Drop the Unnamed: 0 column (5 points)
5. Sort the table by crime rate in ascending order (5 points)
6. Find the number of houses near the Charles River (5 points)
7. Extract per capita crime rate by town and pupil-teacher ratio by town and construct a new dataframe (5 points)
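One way to approach the Question 2 tasks with pandas is sketched below. It assumes the data is already in a DataFrame and that the columns are named as in the dataset labels (CRIM, CHAS, PTRATIO), plus the stray "Unnamed: 0" index column. The function name explore and the result keys are invented for this illustration.

```python
import pandas as pd


def explore(df: pd.DataFrame) -> dict:
    """Collect the Question 2 results for an already-loaded DataFrame.

    explore and its dictionary keys are hypothetical names.
    """
    results = {}

    # 1. First and last 10 rows
    results["first_10"] = df.head(10)
    results["last_10"] = df.tail(10)

    # 2. Summary table (count, mean, std, min, quartiles, max per numeric column)
    results["summary"] = df.describe()

    # 3. (rows, columns), measured before any column is dropped
    results["shape"] = df.shape

    # 4. Drop the leftover index column; errors="ignore" tolerates its absence
    cleaned = df.drop(columns=["Unnamed: 0"], errors="ignore")
    results["cleaned"] = cleaned

    # 5. Sort by per capita crime rate, lowest first
    results["sorted_by_crime"] = cleaned.sort_values("CRIM", ascending=True)

    # 6. CHAS is 1 for rows near the Charles River, so counting the 1s answers this
    results["near_river"] = int((cleaned["CHAS"] == 1).sum())

    # 7. New DataFrame holding only crime rate and pupil-teacher ratio
    results["crim_ptratio"] = cleaned[["CRIM", "PTRATIO"]].copy()

    return results
```

Returning the pieces in a dictionary keeps the original DataFrame unchanged; in a notebook, each step could just as well be run in its own cell.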
Question 3: Generate Descriptive Statistics and Visualizations
1. Find the interquartile range of houses' weighted distances to five Boston employment centres (5 points)
2. For the 'Median value of owner-occupied homes' provide a boxplot (5 points)
3. Provide a histogram for the index of accessibility to radial highways variable and state what the histogram indicates (10 points)
4. Provide a scatter plot to show the relationship between nitric oxide concentrations and the proportion of non-retail business acres per town. What can you say about the relationship? (10 points)
5. Discretize the age variable into three groups: 35 years and younger, between 35 and 70 years, 70 years and older to create a new column "Age_Group", then sort the counts of each group (10 points)
6. Create a pie chart for the Age_Group to show the distribution (5 points)
7. Split the data into two sets with fractions of 0.8 and 0.2, respectively, and show the number of rows in each set (5 points)
8. Plot the scatter matrix between average number of rooms per dwelling, weighted distances to five Boston employment centres and the median market value of the owner-occupied houses (5 points)
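The Question 3 tasks could be sketched as follows, assuming pandas and matplotlib and the column names from the dataset labels (DIS, MEDV, RAD, NOX, INDUS, AGE, RM). All function names and the seed value are hypothetical. The age boundaries follow the wording of the task literally: 35 counts as "35 years and younger" and 70 as "70 years and older". The code only draws the plots; interpreting them is left to the reader.

```python
import pandas as pd


def dis_iqr(df):
    """Interquartile range of DIS: 75th percentile minus 25th percentile."""
    return df["DIS"].quantile(0.75) - df["DIS"].quantile(0.25)


def add_age_group(df):
    """Return a copy of df with an Age_Group column built from AGE."""
    out = df.copy()
    # Start with the middle group, then overwrite both ends
    out["Age_Group"] = "between 35 and 70 years"
    out.loc[out["AGE"] <= 35, "Age_Group"] = "35 years and younger"
    out.loc[out["AGE"] >= 70, "Age_Group"] = "70 years and older"
    return out


def age_group_counts(df):
    """Rows per age group; value_counts sorts from largest to smallest."""
    return add_age_group(df)["Age_Group"].value_counts()


def split_80_20(df, seed=0):
    """Random 80/20 split. seed is an arbitrary choice for reproducibility.

    Assumes the index is unique, which holds for the default RangeIndex.
    """
    train = df.sample(frac=0.8, random_state=seed)
    holdout = df.drop(train.index)
    return train, holdout


def make_plots(df):
    """Draw the five requested figures (tasks 2, 3, 4, 6 and 8)."""
    # Imported here so the helpers above work without matplotlib installed
    import matplotlib.pyplot as plt
    from pandas.plotting import scatter_matrix

    fig, ax = plt.subplots()
    df.boxplot(column="MEDV", ax=ax)
    ax.set_title("Median value of owner-occupied homes")

    fig, ax = plt.subplots()
    df["RAD"].plot.hist(bins=10, ax=ax)
    ax.set_xlabel("Index of accessibility to radial highways (RAD)")

    fig, ax = plt.subplots()
    df.plot.scatter(x="INDUS", y="NOX", ax=ax)
    ax.set_title("NOX vs. INDUS")

    fig, ax = plt.subplots()
    age_group_counts(df).plot.pie(autopct="%1.1f%%", ax=ax)
    ax.set_ylabel("")

    scatter_matrix(df[["RM", "DIS", "MEDV"]], figsize=(8, 8))
    plt.show()
```

After loading the data, `train, holdout = split_80_20(df)` followed by `print(len(train), len(holdout))` shows the size of each set, and `make_plots(df)` opens the figures.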
Reference: labels for the dataset
The following describes the dataset variables:
CRIM - per capita crime rate by town
ZN - proportion of residential land zoned for lots over 25,000 sq.ft.
INDUS - proportion of non-retail business acres per town.
CHAS - Charles River dummy variable (1 if near the river; 0 otherwise)
NOX - nitric oxides concentration (parts per 10 million)
RM - average number of rooms per dwelling
AGE - proportion of owner-occupied units built prior to 1940
DIS - weighted distances to five Boston employment centres
RAD - index of accessibility to radial highways (higher index indicates easier access to radial highways)
TAX - full-value property-tax rate per $10,000
PTRATIO - pupil-teacher ratio by town
LSTAT - % lower status of the population
MEDV - median market value of owner-occupied houses in $1,000s
