Question:
import pandas as pd
import numpy as np
from scipy.stats import pearsonr
import matplotlib.pyplot as plt
# Read data files
house_prices_df = pd.read_excel('MeanHousePricesClean.xlsx')
crime_df = pd.read_excel('CrimeClean.xlsx')
population_df = pd.read_excel('PopulationClean.xlsx')
area_df = pd.read_excel('SuburbAreas.xlsx')
# Step B: Clean and prepare data
def prepare_data(df, columns):
    df = df.dropna(subset=columns)  # Remove rows with missing values in key columns
    return df
# Rename columns for consistency
house_prices_df = house_prices_df.rename(columns={'Year': 'year'})
crime_df = crime_df.rename(columns={
    'Year': 'year',
    'Crime rate per population': 'crime_rate',
    'Local Government Area': 'local_government_area'
})
population_df = population_df.rename(columns={'Year': 'year'})
area_df = area_df.rename(columns={'Property': 'local_government_area'})
# Clean the area DataFrame to remove non-relevant rows
area_df = area_df[area_df['local_government_area'] != 'Area sq Km']
# Step C: Analysis functions
def analyze_correlation(df, col1, col2):
    df = df.dropna(subset=[col1, col2])
    if len(df) < 2:
        return np.nan
    correlation, _ = pearsonr(df[col1], df[col2])
    return correlation
# Reshape house_prices_df to long format
house_prices_long = house_prices_df.melt(id_vars=['year'], var_name='local_government_area', value_name='mean_house_price')
# Reshape population_df to long format
population_long = population_df.melt(id_vars=['year'], var_name='local_government_area', value_name='population')
# Reshape area_df to long format
area_long = area_df.melt(id_vars=['local_government_area'], var_name='year', value_name='area')
# Merge the datasets on 'year' and 'local_government_area'
merged_df = pd.merge(crime_df, house_prices_long, on=['year', 'local_government_area'], how='inner')
merged_df = pd.merge(merged_df, population_long, on=['year', 'local_government_area'], how='inner')
merged_df = pd.merge(merged_df, area_long, on='local_government_area', how='inner')
# Calculate population density
merged_df['population_density'] = merged_df['population'] / merged_df['area']
# Step D: Prepare the data by cleaning
merged_df = prepare_data(merged_df, ['mean_house_price', 'crime_rate', 'population_density'])
# Step E: Perform correlation analysis
house_price_population_corr = analyze_correlation(merged_df, 'mean_house_price', 'population_density')
crime_house_price_corr = analyze_correlation(merged_df, 'crime_rate', 'mean_house_price')
crime_population_density_corr = analyze_correlation(merged_df, 'crime_rate', 'population_density')
# Step F: Print the results
print(f"Correlation between house prices and population density: {house_price_population_corr}")
print(f"Correlation between crime rate and house prices: {crime_house_price_corr}")
print(f"Correlation between crime rate and population density: {crime_population_density_corr}")
# Plotting for visual analysis
plt.figure(figsize=(10, 6))
plt.scatter(merged_df['population_density'], merged_df['mean_house_price'])
plt.title('House Price vs Population Density')
plt.xlabel('Population Density (people per square km)')
plt.ylabel('Mean House Price')
plt.grid(True)
plt.show()
plt.figure(figsize=(10, 6))
plt.scatter(merged_df['mean_house_price'], merged_df['crime_rate'])
plt.title('Crime Rate vs House Price')
plt.xlabel('Mean House Price')
plt.ylabel('Crime Rate per population')
plt.grid(True)
plt.show()
plt.figure(figsize=(10, 6))
plt.scatter(merged_df['population_density'], merged_df['crime_rate'])
plt.title('Crime Rate vs Population Density')
plt.xlabel('Population Density (people per square km)')
plt.ylabel('Crime Rate per population')
plt.grid(True)
plt.show()
The above Python code was given to me by a Chegg expert. I run it in Google Colab and it finishes without errors, but the plots are empty and every correlation is printed as NaN. Please modify and correct the code, and add some more correlations if that helps me get my output. I have access to all the Excel sheets of data, as attached. Something is wrong, since it keeps producing NaN.
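NaN correlations together with empty scatter plots usually mean the merged DataFrame has no rows left, which can happen when the inner merges find no matching keys or when dropna removes everything. That is only a guess, since the Excel files are not shown here. The sketch below is one way to check it. The helper names normalize_keys, merge_report and safe_pearson are hypothetical, and the column names 'year' and 'local_government_area' are assumed to match the renamed columns in the question.

```python
import numpy as np
import pandas as pd
from scipy.stats import pearsonr


def normalize_keys(df, lga_col="local_government_area", year_col="year"):
    """Return a copy whose merge keys share one text format and one dtype."""
    out = df.copy()
    # Trailing spaces and mixed case are common reasons for keys not matching.
    out[lga_col] = out[lga_col].astype(str).str.strip().str.lower()
    if year_col in out.columns:
        # Assumes years are whole numbers; unparseable values become <NA>.
        out[year_col] = pd.to_numeric(out[year_col], errors="coerce").astype("Int64")
    return out


def merge_report(left, right, on):
    """Outer-merge with indicator=True and count rows by their origin."""
    merged = pd.merge(left, right, on=on, how="outer", indicator=True)
    # Keys are 'left_only', 'right_only' and 'both'.
    return merged["_merge"].value_counts().to_dict()


def safe_pearson(df, col1, col2):
    """Pearson r on numeric, non-missing pairs; NaN when r is undefined."""
    pair = df[[col1, col2]].apply(pd.to_numeric, errors="coerce").dropna()
    # r is undefined with fewer than two rows or with a constant column.
    if len(pair) < 2 or pair[col1].nunique() < 2 or pair[col2].nunique() < 2:
        return np.nan
    r, _ = pearsonr(pair[col1], pair[col2])
    return float(r)
```

Calling merge_report(crime_df, house_prices_long, on=['year', 'local_government_area']) before and after normalize_keys shows whether any rows land in 'both'. If that count is zero, the inner merge returns an empty DataFrame, and the NaN values and blank plots follow from that. It is also worth printing merged_df.shape after each merge and after the dropna step.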
