Question: Language- Python, pandas 3e) Sort zipcodes into Geographic Subdivision The Safe Harbor Method applies to Geographic Subdivisions as opposed to each zipcode itself. Geographic Subdivision:

Language- Python, pandas

Language- Python, pandas 3e) Sort zipcodes into "Geographic Subdivision" The Safe HarborMethod applies to "Geographic Subdivisions" as opposed to each zipcode itself. Geographic

3e) Sort zipcodes into "Geographic Subdivision" The Safe Harbor Method applies to "Geographic Subdivisions" as opposed to each zipcode itself. Geographic Subdivision: All areas which share the first 3 digits of a zip code Count the total population for each geographic subdivision, storing the first 3 digits of the zip code and its corresponding population in the dictionary zip_dict . (For example, if there were 20 people whose zip code started with 090, the key-value pair in zip_dict would be {'090' : 20} .) You may be tempted to write a gnarly loop to accomplish this. Avoid that temptation. Instead, you'll want to be savy with a dictionary and groupby from pandas here. To get you started... If you wanted to group by whole zip code, you could use something like this: df_zip.groupby (df_zipl'zip']). But, we don't want to group by the entire zip code. Instead, we want to extract the first 3 digits of a zip code, and group by that. To extract the first three digits, you could so something like the following: df_zip['zip'].str[:3] You'll want to combine these two concepts, such that you store this information in a dictionary zip_dict , which stores the first three digits of the zip code as the key and the population of that 3-digit zip code as the value. (If you're stuck and/or to better understand how dictionaries work and how they apply to this concept, check the section materials, use google, and go to discussion sections!) assert isinstance(zip dict, dict) assert zip_dict['100'] == 1502501 3f) Masking the Zip Codes In this part, you should write a for loop, updating the df_users dataframe. Go through each user, and update their zip code, to Safe Harbor specifications: If the user is from a zip code for the which the "Geographic Subdivision" is less than equal to 20,000, change the zip code to O Otherwise, change the zip code to be only the first 3 numbers of the full zip code Do all this rewritting the zip_code columns of the df_users DataFrame Hints: 1. This will be several lines of code, looping through the DataFrame, getting each zip code, checking the geographic subdivision with the population in zip_dict, and setting the zip_code accordingly. 2. Be very aware of your variable types when working with zip codes here. ] : # YOUR CODE HERE raise NotImplementedError(). assert len(df_users) == 943 assert df_users.loc[671, 'zip'] == '285 3e) Sort zipcodes into "Geographic Subdivision" The Safe Harbor Method applies to "Geographic Subdivisions" as opposed to each zipcode itself. Geographic Subdivision: All areas which share the first 3 digits of a zip code Count the total population for each geographic subdivision, storing the first 3 digits of the zip code and its corresponding population in the dictionary zip_dict . (For example, if there were 20 people whose zip code started with 090, the key-value pair in zip_dict would be {'090' : 20} .) You may be tempted to write a gnarly loop to accomplish this. Avoid that temptation. Instead, you'll want to be savy with a dictionary and groupby from pandas here. To get you started... If you wanted to group by whole zip code, you could use something like this: df_zip.groupby (df_zipl'zip']). But, we don't want to group by the entire zip code. Instead, we want to extract the first 3 digits of a zip code, and group by that. To extract the first three digits, you could so something like the following: df_zip['zip'].str[:3] You'll want to combine these two concepts, such that you store this information in a dictionary zip_dict , which stores the first three digits of the zip code as the key and the population of that 3-digit zip code as the value. (If you're stuck and/or to better understand how dictionaries work and how they apply to this concept, check the section materials, use google, and go to discussion sections!) assert isinstance(zip dict, dict) assert zip_dict['100'] == 1502501 3f) Masking the Zip Codes In this part, you should write a for loop, updating the df_users dataframe. Go through each user, and update their zip code, to Safe Harbor specifications: If the user is from a zip code for the which the "Geographic Subdivision" is less than equal to 20,000, change the zip code to O Otherwise, change the zip code to be only the first 3 numbers of the full zip code Do all this rewritting the zip_code columns of the df_users DataFrame Hints: 1. This will be several lines of code, looping through the DataFrame, getting each zip code, checking the geographic subdivision with the population in zip_dict, and setting the zip_code accordingly. 2. Be very aware of your variable types when working with zip codes here. ] : # YOUR CODE HERE raise NotImplementedError(). assert len(df_users) == 943 assert df_users.loc[671, 'zip'] == '285

Step by Step Solution

There are 3 Steps involved in it

1 Expert Approved Answer
Step: 1 Unlock blur-text-image
Question Has Been Solved by an Expert!

Get step-by-step solutions from verified subject matter experts

Step: 2 Unlock
Step: 3 Unlock

Students Have Also Explored These Related Databases Questions!