Question:
To submit your work, go to "File", then "Save and export notebook as...", and export your code, including the output, to HTML.
Download the file shirts.csv. The data set contains information on shirt sales.
In the following questions, write code that would work on a file shirts.csv containing any number of rows.
Use pandas to preprocess the data set following the steps outlined in the notes preprocessing_steps.txt (found in Modules). Print the data set after each step.
feature selection: Delete the column titled 'id'.
ordinal values: Map the size values to numbers.
feature transformation: In the column 'collection', change 2023 to the value 1 (indicating a new collection), and the previous years to 0 (indicating an old collection).
a. categorical values: Replace the column 'fabric' by three columns titled 'fabric_cotton', 'fabric_wool', and 'fabric_polyester'.
b. categorical values: Replace the values in the column 'color' by their frequencies.
feature aggregation: Combine the columns 'price' and 'tax' into one column 'total_price', by adding the price to the tax. For example, in the first row, the values 13 and 1.3 are replaced with 14.3.
| id | size | collection | fabric | color | price | tax |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | L | 2023 | cotton | orange | 13 | 1.3 |
| 1 | L | 2022 | wool | black | 16 | 1.6 |
| 2 | M | 2023 | wool | red | 20 | 2 |
| 3 | S | 2021 | polyester | white | 12 | 1.2 |
| 4 | M | 2021 | polyester | orange | 14 | 1.4 |
| 5 | S | 2023 | cotton | blue | 20 | 2 |
| 6 | S | 2022 | polyester | yellow | 20 | 2 |
| 7 | L | 2021 | polyester | white | 14 | 1.4 |
| 8 | M | 2021 | polyester | pink | 17 | 1.7 |
| 9 | L | 2023 | cotton | white | 15 | 1.5 |
| 10 | L | 2022 | cotton | black | 13 | 1.3 |
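The preprocessing steps listed in the question could be sketched in pandas roughly as follows. This is one possible approach under stated assumptions, not the course's reference solution: the size encoding S=1, M=2, L=3 is a guess (the exercise only says "numbers"), "frequencies" is read as raw counts, and the names `preprocess`, `show`, `SIZE_ORDER`, and `FABRICS` are made up for illustration. The table rows are inlined so the sketch runs without the file; with the real data you would presumably call `pd.read_csv("shirts.csv")` instead.

```python
import io

import pandas as pd

# Rows copied from the table above so the sketch runs without the file.
SAMPLE_CSV = """id,size,collection,fabric,color,price,tax
0,L,2023,cotton,orange,13,1.3
1,L,2022,wool,black,16,1.6
2,M,2023,wool,red,20,2
3,S,2021,polyester,white,12,1.2
4,M,2021,polyester,orange,14,1.4
5,S,2023,cotton,blue,20,2
6,S,2022,polyester,yellow,20,2
7,L,2021,polyester,white,14,1.4
8,M,2021,polyester,pink,17,1.7
9,L,2023,cotton,white,15,1.5
10,L,2022,cotton,black,13,1.3
"""

# Assumed ordinal encoding; sizes missing from this dict would become NaN.
SIZE_ORDER = {"S": 1, "M": 2, "L": 3}
# The three fabric columns named in the exercise.
FABRICS = ["cotton", "wool", "polyester"]


def preprocess(df: pd.DataFrame, verbose: bool = True) -> pd.DataFrame:
    """Apply the listed steps in order, printing the data after each one."""

    def show(step: str, frame: pd.DataFrame) -> None:
        if verbose:
            print(f"--- after {step} ---")
            print(frame, end="\n\n")

    # feature selection: drop the 'id' column (returns a new DataFrame,
    # so the caller's object is left untouched).
    df = df.drop(columns=["id"])
    show("feature selection", df)

    # ordinal values: map size labels to numbers.
    df["size"] = df["size"].map(SIZE_ORDER)
    show("ordinal values", df)

    # feature transformation: 2023 -> 1 (new), any other year -> 0 (old).
    df["collection"] = (df["collection"] == 2023).astype(int)
    show("feature transformation", df)

    # categorical values (a): one 0/1 indicator column per fabric.
    # Explicit comparisons guarantee all three columns exist even when a
    # fabric is absent from the file; pd.get_dummies is an alternative.
    for fabric in FABRICS:
        df[f"fabric_{fabric}"] = (df["fabric"] == fabric).astype(int)
    df = df.drop(columns=["fabric"])
    show("categorical values (a)", df)

    # categorical values (b): replace each color by how often it occurs.
    # Counts are an assumption; value_counts(normalize=True) gives proportions.
    df["color"] = df["color"].map(df["color"].value_counts())
    show("categorical values (b)", df)

    # feature aggregation: total_price = price + tax, then drop the inputs.
    df["total_price"] = df["price"] + df["tax"]
    df = df.drop(columns=["price", "tax"])
    show("feature aggregation", df)

    return df


result = preprocess(pd.read_csv(io.StringIO(SAMPLE_CSV)))
```

Nothing in the function depends on the number of rows, so the same call should work on a longer file with the same columns. If the course notes define the size numbers or the meaning of "frequencies" differently, only `SIZE_ORDER` and the `value_counts` line would need to change.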
