Question: Analyzing data by group
When you're analyzing a data set, you often want to calculate metrics broken down by each category of another variable.
In the previous homework, you wrote a function generateNumericSummary to summarize a numeric variable by another categorical variable.
By generalizing and extending the previous code, we will design a data object to analyze a variable by another categorical variable.
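As a minimal illustration of "a metric broken down by group" in pandas, the sketch below computes the mean of a numeric column within each level of a categorical column. The data values are made up for illustration and do not come from the Titanic file used later.

```python
import pandas as pd

# Made-up data for illustration only (not taken from titanic.csv).
df = pd.DataFrame({
    "age": [22.0, 38.0, 26.0, 35.0],
    "survived": [0, 1, 1, 0],
})

# Mean of the numeric variable within each level of the categorical variable.
mean_age_by_survived = df["age"].groupby(df["survived"]).mean()
print(mean_age_by_survived)
```

The result is a pandas Series indexed by the levels of the grouping variable, which is the return type the subclasses in this assignment are asked to produce.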
Class dataByGroup
dataByGroup (abstract class)
    Contains two pandas Series: dat and group
numericDataByGroup
    Subclass of dataByGroup
    dat should be numeric.
categoricalDataByGroup
    Subclass of dataByGroup
    dat should be categorical.
[Class diagram: dataByGroup, with subclasses numericDataByGroup and categoricalDataByGroup]
Class dataByGroup
Write a class definition for object dataByGroup with the following specification:
Object attributes
    dat: pandas Series
    group: categorical pandas Series of the same length as dat
    isBinary: Boolean (True/False) indicating whether dat is binary.
Methods
    __init__(self, dat, group): takes dat and group as input data and initializes the dat and group attributes of the dataByGroup object.
    __str__(self): returns dat and group combined together as a pandas DataFrame, in printable form (invoked by print).
    isBinary(self): returns the value of isBinary. Accessor method.
    getNumMissings(self): returns the number of missing values in dat. Accessor method.
Class dataByGroup
Class header
    import pandas as pd
    import numpy as np

    class dataByGroup(object):
        def __init__(self, dat, group):
            # write your code
            # __init__ should initialize three object attributes:
            # dat, group, isBinary
            pass

        def __str__(self):
            # write your code
            pass

        def isBinary(self):
            # write your code
            pass

        def getNumMissings(self):
            # write your code
            pass
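One possible way to fill in the dataByGroup skeleton is sketched below, under two assumptions that are not stated in the assignment. First, the flag is stored as `_isBinary` (a hypothetical name), because an instance attribute literally named `isBinary` would hide the `isBinary()` method on that object. Second, "binary" is taken to mean that every non-missing value is 0 or 1. The sample data at the bottom is made up.

```python
import pandas as pd


class dataByGroup(object):
    def __init__(self, dat, group):
        self.dat = dat
        self.group = group
        # Assumption: stored under the hypothetical name _isBinary so that it
        # does not shadow the isBinary() accessor method.
        # Assumption: "binary" means every non-missing value is 0 or 1.
        self._isBinary = bool(dat.dropna().isin([0, 1]).all())

    def __str__(self):
        # Place dat and group side by side in a DataFrame and render as text.
        return pd.concat([self.dat, self.group], axis=1).to_string()

    def isBinary(self):
        # Accessor for the flag computed in __init__.
        return self._isBinary

    def getNumMissings(self):
        # isnull() marks missing entries; summing the booleans counts them.
        return int(self.dat.isnull().sum())


# Made-up data for illustration only.
example = dataByGroup(
    pd.Series([1, 0, None, 1], name="survived"),
    pd.Series([1, 2, 3, 3], name="pclass"),
)
print(example)
print(example.isBinary(), example.getNumMissings())
```

Counting missing values with `isnull().sum()` avoids an explicit loop over the Series and does not depend on how the Series is indexed.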
SubClass numericDataByGroup
Write a class definition for object numericDataByGroup with the following specification:
Object attributes
    Inherits all attributes of its superclass dataByGroup
Methods
    Inherits all methods of its superclass dataByGroup
    getMeans(self): returns the means of dat across the different levels of group.
        Return data type: pandas Series
    getSTD(self): returns the standard deviations of dat across the different levels of group.
        Return data type: pandas Series
SubClass numericDataByGroup
Class header
    class numericDataByGroup(dataByGroup):
        def __init__(self, dat, group):
            # write your code
            pass

        def getMeans(self):
            # write your code
            pass

        def getSTD(self):
            # write your code
            pass
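A minimal sketch of the numeric subclass follows. To keep it runnable on its own, it includes a stripped-down stand-in for dataByGroup that only stores the two Series; the real base class would also carry the other methods from the specification. The sample data is made up.

```python
import pandas as pd


class dataByGroup(object):
    # Minimal stand-in for the base class, just enough to run this sketch.
    def __init__(self, dat, group):
        self.dat = dat
        self.group = group


class numericDataByGroup(dataByGroup):
    def __init__(self, dat, group):
        # Delegate attribute setup to the superclass.
        super().__init__(dat, group)

    def getMeans(self):
        # One mean per level of group, returned as a Series.
        return self.dat.groupby(self.group).mean()

    def getSTD(self):
        # pandas computes the sample standard deviation (ddof=1) by default.
        return self.dat.groupby(self.group).std()


# Made-up data for illustration only.
ages = pd.Series([20.0, 30.0, 40.0, 60.0], name="age")
survived = pd.Series([0, 0, 1, 1], name="survived")
ageBySurvived = numericDataByGroup(ages, survived)
means = ageBySurvived.getMeans()
stds = ageBySurvived.getSTD()
print(means)
print(stds)
```

Both aggregations skip missing values by default, so a column with gaps (such as age in many Titanic data sets) still yields one number per group.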
SubClass categoricalDataByGroup
Write a class definition for object categoricalDataByGroup with the following specification:
Object attributes
    Inherits all attributes of its superclass dataByGroup
Methods
    Inherits all methods of its superclass dataByGroup
    getTallies(self): returns tabulated counts (tallies) by dat and group
        Return data type: pandas Series
SubClass categoricalDataByGroup
Class header
    class categoricalDataByGroup(dataByGroup):
        def __init__(self, dat, group):
            # write your code
            pass

        def getTallies(self):
            # write your code
            pass
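A minimal sketch of the categorical subclass follows. It assumes that "tallies by dat and group" means a count for every combination of group level and dat value, and it uses a stripped-down stand-in for dataByGroup so the block runs on its own. The sample data is made up.

```python
import pandas as pd


class dataByGroup(object):
    # Minimal stand-in for the base class, just enough to run this sketch.
    def __init__(self, dat, group):
        self.dat = dat
        self.group = group


class categoricalDataByGroup(dataByGroup):
    def __init__(self, dat, group):
        # Delegate attribute setup to the superclass.
        super().__init__(dat, group)

    def getTallies(self):
        # Count each (group level, dat value) pair. The result is a Series
        # whose index has two levels: the group level and the dat value.
        return self.dat.groupby(self.group).value_counts()


# Made-up data for illustration only.
survived = pd.Series([1, 0, 1, 1, 0], name="survived")
pclass = pd.Series([1, 1, 2, 2, 2], name="pclass")
survivedByPclass = categoricalDataByGroup(survived, pclass)
tallies = survivedByPclass.getTallies()
print(tallies)
```

`pd.crosstab` would give the same counts as a two-way table, but it returns a DataFrame, whereas the specification asks for a Series.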
Output
Test case: categorical data
    def main():
        titanic = pd.read_csv('titanic.csv', header=0)
        survivedByPclass = categoricalDataByGroup(titanic['survived'], titanic['pclass'])
        print('Data and Group:')
        print(survivedByPclass)  ## __str__ is invoked
        print('Is the data binary? : ' + str(survivedByPclass.isBinary()))
        print('The number of missing values : ' + str(survivedByPclass.getNumMissings()))
        print('Tallies:')
        print(survivedByPclass.getTallies())
Test case: numerical data
    def main():
        # titanic is loaded the same way as in the categorical test case
        titanic = pd.read_csv('titanic.csv', header=0)
        ageBySurvived = numericDataByGroup(titanic['age'], titanic['survived'])
        print('Data and Group:')
        print(ageBySurvived)  ## __str__ is invoked
        print('Is the data binary? : ' + str(ageBySurvived.isBinary()))
        print('The number of missing values : ' + str(ageBySurvived.getNumMissings()))
        print('Means:')
        print(ageBySurvived.getMeans())
        print('Standard Deviations:')
        print(ageBySurvived.getSTD())
Here is my code:
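    import numpy as np
    import pandas as pd

    class dataByGroup(object):
        def __init__(self, dat, group):
            self.dat = dat
            self.group = group

        def __str__(self):
            dfdat = pd.DataFrame({self.dat.name: self.dat})
            dfgroup = pd.DataFrame({self.group.name: self.group})
            # Combine the two one-column frames side by side
            # (this line is not shown in the post; assumed).
            dfmerged = pd.concat([dfdat, dfgroup], axis=1)
            return dfmerged.to_string()

        def isBinary(self):
            lst = np.array(self.dat)
            # True where the value is 0 or 1
            lstcheck = np.logical_or(lst == 0, lst == 1)
            # True where the value is missing
            lstchecknan = np.isnan(lst)
            # A value passes if it is 0, 1, or missing
            # (this line is not shown in the post; assumed).
            lstfinalcheck = np.logical_or(lstcheck, lstchecknan)
            return (lstfinalcheck == True).all()

        def getNumMissings(self):
            isnull = 0
            checknulldf = pd.isnull(self.dat)
            for i in range(len(self.dat)):
                # Positional lookup, so a non-default index also works
                if checknulldf.iloc[i] == True:
                    isnull += 1
            return isnull

    class numericDataByGroup(dataByGroup):
        def __init__(self, dat, group):
            dataByGroup.__init__(self, dat, group)

        def getMeans(self):
            return self.dat.groupby(self.group).mean()

        def getSTD(self):
            return self.dat.groupby(self.group).std()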
