Question: Using numpy and pandas, create a data frame with 5000 rows and 5 columns

Using numpy and pandas

Create a data frame with 5000 rows and 5 columns as follows:

● Cust per day -- random int between 25 and 150

● Site ID -- one of 5 values randomly ['001', '002' 'A02', 'B02', '003', 'B03']

○ first site needs to be 0.25 probability of occurring, last site needs to be 0.3

○ the other sites you should decide probabilities that sum to 1

● Merch Restock -- [0,1]

○ 75% they need to restock (1)

● Fuel Restock -- [0,1]

○ 90% they need to restock (1)

● Daily Revenue -- Random floating point between 500 and 5000
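
One way the five columns above could be generated is sketched below. It rests on several assumptions that are not in the question: all six listed labels are treated as valid sites, the range 25 to 150 is taken as inclusive, the four middle site probabilities (0.15, 0.10, 0.10, 0.10) are an arbitrary choice that makes the total 1, and the seed 42 is only there for reproducibility.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # arbitrary seed, assumed for reproducibility
n_rows = 5000

# Assumption: all six labels listed in the question are valid sites.
sites = ["001", "002", "A02", "B02", "003", "B03"]
# First site 0.25 and last site 0.30 come from the question;
# the four middle values are an arbitrary choice that sums to 1.
site_probs = [0.25, 0.15, 0.10, 0.10, 0.10, 0.30]

df = pd.DataFrame({
    # integers() excludes the upper bound, so 151 makes 150 reachable
    "Cust per day": rng.integers(25, 151, size=n_rows),
    "Site ID": rng.choice(sites, size=n_rows, p=site_probs),
    # p is ordered like [0, 1]: 25% no restock, 75% restock
    "Merch Restock": rng.choice([0, 1], size=n_rows, p=[0.25, 0.75]),
    # 10% no restock, 90% restock
    "Fuel Restock": rng.choice([0, 1], size=n_rows, p=[0.10, 0.90]),
    # uniform floats in [500, 5000)
    "Daily Revenue": rng.uniform(500, 5000, size=n_rows),
})

print(df.head())
```

Passing explicit probabilities to `rng.choice` keeps the restock rates and site weights visible in one place, which makes them easy to adjust if the intended site list turns out to be different.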

Create a state column

● if site 001 or B02 set state to Rhode Island

● if site 002 or A02 set state to Montana

● all remaining set state to Alabama

After adding this column, describe the data set's statistical distributions using the pandas function.
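
The state rules above can be sketched with a dictionary lookup and a default. The small frame here is a hypothetical stand-in for the 5000-row frame, the column name `State` is my own choice, and reading "the pandas function" as `DataFrame.describe` is an assumption.

```python
import pandas as pd

# Toy stand-in for the full frame; only the "Site ID" column is needed here.
df = pd.DataFrame({"Site ID": ["001", "B02", "002", "A02", "003", "B03"]})

# Sites named explicitly in the question.
site_to_state = {
    "001": "Rhode Island",
    "B02": "Rhode Island",
    "002": "Montana",
    "A02": "Montana",
}

# Sites missing from the mapping become NaN after map(), then fall back to Alabama.
df["State"] = df["Site ID"].map(site_to_state).fillna("Alabama")

# include="all" summarizes text columns (count, unique, top, freq)
# as well as numeric ones (mean, std, quartiles) when they are present.
summary = df.describe(include="all")
print(summary)
```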

For daily revenue:

○ Create a sum column that contains the sum for that state on every row. All rows for the same state should have the same sum

○ Create another column for the mean for that state
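
A per-state sum and mean repeated on every row can be sketched with `groupby(...).transform(...)`, which returns a result aligned to the original rows rather than one row per group. The toy data and the column names `State Revenue Sum` and `State Revenue Mean` are assumptions for illustration.

```python
import pandas as pd

# Toy stand-in for the full frame with a state column already added.
df = pd.DataFrame({
    "State": ["Montana", "Montana", "Alabama", "Rhode Island", "Alabama"],
    "Daily Revenue": [1000.0, 3000.0, 500.0, 2500.0, 1500.0],
})

by_state = df.groupby("State")["Daily Revenue"]

# transform() broadcasts each group's aggregate back onto that group's rows,
# so every row of a state carries the same sum and the same mean.
df["State Revenue Sum"] = by_state.transform("sum")
df["State Revenue Mean"] = by_state.transform("mean")

print(df)
```

With this toy data, both Montana rows show a sum of 4000.0 and a mean of 2000.0, and both Alabama rows show 2000.0 and 1000.0.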

Step by Step Solution


NOTE: There is an issue in the question in the description of the site ID column as it says 5 ...
