Question: dt = pandas.read_csv('./co-est2019-alldata.csv', encoding= ...

```
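import pandas

# Load the census population estimates; the file is Latin-1 encoded.
df = pandas.read_csv('./co-est2019-alldata.csv', encoding='latin-1')
# One dict per row, ready to store as MongoDB documents.
data = df.to_dict('records')
# `db` is assumed to be the database handle created earlier in the notebook.
db.populations.drop()
db.populations.insert_many(data)
print("Done!")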
```
Done!
Using the aggregation pipeline and the $out stage, create a new dataset that maps each state to its total counts. Do this for all three datasets so that you have:
casesdeaths_state = (state, cases, deaths)
populations_state = (state, population)
vaccinations_state = (state, vaccinations)
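As a rough illustration of the $group plus $out pattern, here is a minimal sketch for the populations collection. The column names `SUMLEV`, `STNAME`, and `POPESTIMATE2019`, and the idea that `SUMLEV == 50` marks county rows, are assumptions about the census file and should be checked against a loaded document; `build_populations_state` is a hypothetical helper name.

```python
# Assumed column names; verify them with db.populations.find_one().
populations_pipeline = [
    # Keep county-level rows only so state summary rows are not double counted
    # (assumes SUMLEV == 50 marks counties).
    {"$match": {"SUMLEV": 50}},
    # Sum the county populations within each state.
    {"$group": {"_id": "$STNAME", "population": {"$sum": "$POPESTIMATE2019"}}},
    # Rename _id to state so every *_state collection shares the same key.
    {"$project": {"_id": 0, "state": "$_id", "population": 1}},
    # Write the result to a new collection, replacing it if it exists.
    {"$out": "populations_state"},
]


def build_populations_state(db):
    """Run the pipeline and return how many state documents were written."""
    db.populations.aggregate(populations_pipeline)
    return db.populations_state.count_documents({})
```

Calling `build_populations_state(db)` with the notebook's database handle should leave one document per state in `populations_state`.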
```
# [...] counties have a running sum by date; taking the max of each county, then summing by state, is the correct math)
```
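The "max per county, then sum per state" logic described above can be sketched as two $group stages. The field names `state`, `county`, `cases`, and `deaths`, and the helper name `build_casesdeaths_state`, are assumptions for illustration; the source collection is passed in because its name is not shown here.

```python
casesdeaths_pipeline = [
    # Stage 1: counts are running totals, so the max per county is its latest total.
    {"$group": {
        "_id": {"state": "$state", "county": "$county"},
        "cases": {"$max": "$cases"},
        "deaths": {"$max": "$deaths"},
    }},
    # Stage 2: add the county totals together within each state.
    {"$group": {
        "_id": "$_id.state",
        "cases": {"$sum": "$cases"},
        "deaths": {"$sum": "$deaths"},
    }},
    {"$project": {"_id": 0, "state": "$_id", "cases": 1, "deaths": 1}},
    {"$out": "casesdeaths_state"},
]


def build_casesdeaths_state(collection):
    """Run the pipeline against the county-level cases/deaths collection."""
    collection.aggregate(casesdeaths_pipeline)
```

Summing the raw rows directly would count every date's running total again, which is why the per-county max comes first.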
# Create the vaccinations_state collection (this dataset is by state and date. You don't want the sum of a...
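One possible reading of this hint is to take a single value per state rather than summing across dates. The sketch below assumes the per-date figure is cumulative, so the max per state is the latest total; the source field name `vaccinated_people` and the helper name `build_vaccinations_state` are assumptions.

```python
vaccinations_pipeline = [
    # Assumes the per-date value is cumulative, so max == most recent total.
    {"$group": {"_id": "$state", "vaccinations": {"$max": "$vaccinated_people"}}},
    {"$project": {"_id": 0, "state": "$_id", "vaccinations": 1}},
    {"$out": "vaccinations_state"},
]


def build_vaccinations_state(collection):
    """Run the pipeline against the state-by-date vaccinations collection."""
    collection.aggregate(vaccinations_pipeline)
```

If the dataset reports daily increments instead of running totals, `$sum` would be the right accumulator, so it is worth inspecting a few documents first.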
Use the $lookup stage of the aggregation pipeline to join your three datasets by state. Note that this won't be a perfect join; to find out why, look at the states, or even the count of states, in each set.
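A minimal sketch of the join, assuming the three `*_state` collections each carry a `state` field as laid out above. The helper names `join_states` and `state_mismatches` and the intermediate array names are hypothetical.

```python
lookup_pipeline = [
    {"$lookup": {
        "from": "populations_state",
        "localField": "state",
        "foreignField": "state",
        "as": "population_docs",
    }},
    {"$lookup": {
        "from": "vaccinations_state",
        "localField": "state",
        "foreignField": "state",
        "as": "vaccination_docs",
    }},
    # $unwind drops states whose lookup array is empty, giving inner-join behaviour.
    {"$unwind": "$population_docs"},
    {"$unwind": "$vaccination_docs"},
    {"$project": {
        "_id": 0, "state": 1, "cases": 1, "deaths": 1,
        "population": "$population_docs.population",
        "vaccinations": "$vaccination_docs.vaccinations",
    }},
]


def join_states(db):
    """Return the joined per-state documents as a list of dicts."""
    return list(db.casesdeaths_state.aggregate(lookup_pipeline))


def state_mismatches(db):
    """List the states in each collection that are missing from another one."""
    names = ["casesdeaths_state", "populations_state", "vaccinations_state"]
    states = {n: set(getattr(db, n).distinct("state")) for n in names}
    common = set.intersection(*states.values())
    return {n: s - common for n, s in states.items()}
```

`state_mismatches(db)` is one way to see why the join is imperfect: any state name it returns appears in one collection but not in all three.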
# [...] infection rate (cases/population), death rate (deaths/population), vaccination rate (vaccinated_people/population).
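The three ratios can be computed inside the pipeline with an $addFields stage. This sketch assumes each joined document carries `cases`, `deaths`, `population`, and `vaccinations` fields; those names, the output field names, and the helper `with_rates` are assumptions.

```python
rates_stage = {
    "$addFields": {
        "infection_rate": {"$divide": ["$cases", "$population"]},
        "death_rate": {"$divide": ["$deaths", "$population"]},
        "vaccination_rate": {"$divide": ["$vaccinations", "$population"]},
    }
}


def with_rates(join_pipeline):
    """Return a new pipeline that appends the rate computation to a join pipeline."""
    return list(join_pipeline) + [rates_stage]
```

Appending the stage after the join keeps the raw counts and the derived rates in the same document, which makes the later comparison across states straightforward.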
# Is there a correlation between the infection or death rates and the vaccination rate for each state?
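One way to approach this question is the Pearson correlation coefficient between the per-state rates. The sketch below uses only the standard library and assumes each row is a dict with `infection_rate`, `death_rate`, and `vaccination_rate` keys; those key names and the helper `rate_correlations` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


def rate_correlations(rows):
    """Correlate the vaccination rate with the infection and death rates."""
    vacc = [r["vaccination_rate"] for r in rows]
    return {
        "infection": pearson(vacc, [r["infection_rate"] for r in rows]),
        "death": pearson(vacc, [r["death_rate"] for r in rows]),
    }
```

A value near -1 or 1 suggests a strong linear relationship and a value near 0 suggests none; with only one data point per state, the result describes association, not causation.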