Question:
Use pandas library only:
Task 1: Using the skeleton code, create a subset of the data which removes 25% of the population. The 75% subset will be called the sample. The 25% subset is named the validation set.
Task 2: Get the proportion of the population that is MALE and has a mass greater than or equal to a given weight in the sample.
NOTE: (given_weight = self.weight)
Task 3: Get the proportion of the population that is MALE and has a mass greater than or equal to a given weight in the validation set. Assume this is the true value for the population and return the percent error of the sample's proportion (note: a percentage, not a proportion).
Task 4: Using any method you deem reasonable, decide whether it is reasonable to use this weight cutoff to predict if a troll is MALE for the supplied data.
NOTE: this returns True or False where True means it is reasonable and False means it is not.
Please explain each line of code.

import pandas as pd
from assignment_1_grader import get_file, get_weight, h_assignment_1_grader, r


class Assignment_1:
    def __init__(self):
        self.name = "INPUT YOUR NAME"
        self.file = get_file()
        self.weight = get_weight()
        # answer 1 (sample - 75%, validation is 25%)
        self.sample_df, self.validation_df = self.get_validation_and_sample()
        # answer 2 (For this I will feed it your sample df)
        a_df = pd.DataFrame([])
        self.get_probability_male_given_weight_greater_than_specified_weight(a_df)
        # answer 3
        self.get_percent_error_of_sample_predicting_validation()
        # answer 4
        self.evaluate_reasonableness_of_weight_as_a_predictor_of_gender_for_given_population_and_weight()
        h_assignment_1_grader(self)

    def get_validation_and_sample(self):
        file = self.file
        validation_df = pd.DataFrame([])
        sample_df = pd.DataFrame([])
        # load a big df
        # split data frame
        # head(int(df.shape[0]*.75))
        # code that divides the file randomly into a sample (75%) and validation (25%)
        # you will be penalized if it is not random
        return sample_df, validation_df

    def get_probability_male_given_weight_greater_than_specified_weight(self, df):
        probability = r.random()
        weight = self.weight
        # code that assigns a value to probability
        # what equation? is this just averages...
        # get only those who are heavier than weight
        return probability

    def get_percent_error_of_sample_predicting_validation(self):
        percent_error = r.random()
        # code that calculates percent error
        # treat validation set as true value
        # percent_error = (tested - true)/true
        return percent_error

    def evaluate_reasonableness_of_weight_as_a_predictor_of_gender_for_given_population_and_weight(self):
        is_reasonable = r.choice([True, False])
        weight = self.weight
        # code that assigns is reasonable True or False
        return is_reasonable
Step by Step Solution
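For Task 1, one possible approach is sketched below. It is not the grader's reference solution. The function name split_sample_and_validation, the toy data, and the column names "gender" and "mass" are assumptions made for illustration; in the skeleton, the data would presumably be loaded first (for example with pd.read_csv(self.file), if the file is a CSV). The sketch relies on the DataFrame index being unique, which holds for a freshly loaded file.

```python
import pandas as pd


def split_sample_and_validation(df, sample_frac=0.75, seed=None):
    """Randomly split df into a sample and a validation set."""
    # Draw sample_frac of the rows at random, without replacement.
    sample_df = df.sample(frac=sample_frac, random_state=seed)
    # Every row whose index label was not drawn goes to the validation set.
    validation_df = df.drop(sample_df.index)
    # Return the two pieces in the same order the skeleton expects.
    return sample_df, validation_df


# Hypothetical toy data; the real file's column names may differ.
toy_df = pd.DataFrame({
    "gender": ["MALE", "FEMALE"] * 10,  # 20 rows, alternating labels
    "mass": range(100, 120),            # 20 distinct masses
})

# seed is fixed here only to make the toy run repeatable.
sample_df, validation_df = split_sample_and_validation(toy_df, seed=0)
print(len(sample_df), len(validation_df))
```

Because df.sample picks rows at random, the split is random rather than a head/tail cut. For grading, leaving seed as None gives a different split on each run.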
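Tasks 2 to 4 can be sketched with pandas only, as below. This is a minimal illustration under stated assumptions, not the grader's expected answer: the column names "gender" and "mass", the helper names, and the toy data are all hypothetical. For Task 4 the task allows any reasonable method; the criterion used here (the cutoff rule must be more accurate than always guessing the majority gender) is just one possible choice.

```python
import pandas as pd


def proportion_male_and_heavy(df, weight):
    # Boolean Series: True where the troll is MALE and mass >= weight.
    mask = (df["gender"] == "MALE") & (df["mass"] >= weight)
    # The mean of a boolean Series is the fraction of True values.
    return float(mask.mean())


def percent_error(tested, true):
    # Formula from the skeleton comment, scaled to a percentage.
    # Raises ZeroDivisionError if the true proportion is exactly 0.
    return (tested - true) / true * 100


def is_cutoff_reasonable(df, weight):
    # Actual labels: True where the troll is MALE.
    is_male = df["gender"] == "MALE"
    # Rule under test: predict MALE when mass >= weight.
    predicted_male = df["mass"] >= weight
    # Fraction of rows where the rule agrees with the actual label.
    accuracy = (predicted_male == is_male).mean()
    # Baseline: accuracy of always guessing the more common gender.
    male_share = is_male.mean()
    baseline = max(male_share, 1 - male_share)
    # Assumed criterion: the rule must beat the baseline.
    return bool(accuracy > baseline)


# Hypothetical toy sample and validation sets.
sample_df = pd.DataFrame({
    "gender": ["MALE", "MALE", "MALE", "FEMALE",
               "FEMALE", "FEMALE", "MALE", "FEMALE"],
    "mass": [200, 180, 150, 120, 160, 110, 190, 100],
})
validation_df = pd.DataFrame({
    "gender": ["MALE", "FEMALE", "MALE", "FEMALE"],
    "mass": [170, 130, 140, 165],
})
weight = 155  # stands in for self.weight

tested = proportion_male_and_heavy(sample_df, weight)    # Task 2
true = proportion_male_and_heavy(validation_df, weight)  # treated as truth
error = percent_error(tested, true)                      # Task 3
reasonable = is_cutoff_reasonable(sample_df, weight)     # Task 4
print(tested, true, error, reasonable)
```

In the skeleton, these helpers would map onto the three remaining methods, with self.weight as the cutoff and self.sample_df and self.validation_df as the inputs.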
