Question: Python Help Needed! Without importing pandas, I need to read 3 different file types and combine them into one standard format for data analysis.
There are 3 data files, named data.csv, data.json, and data.pkl.
data.csv has the following data:
| | Name | Phone | Address | City | Country | Email |
| 0 | Hillary Benton | 1-243-669-7472 | 144-1225 In Road | Navsari | Togo | rutrum.magna.Cr@eud.edu |
| 1 | Morgan Y. Little | 155-3483 | Ap #909-6656 Ac St. | Kitimat | Nauru | pede.sagittis.aug@quis.ca |
data.json has the following data:
{"Name":{"2":"Paul Merrill","3":"Brynne S. Barr"},"Phone":{"2":"1-313-739-3854","3":"939-4818"},"Address":{"2":"916-8087 Vehicula Rd.","3":"878-2231 Suspendisse Rd."},"City":{"2":"Le Mans","3":"Wilhelmshaven"},"Country":{"2":"Somalia","3":"Samoa"},"Email":{"2":"diam.Pellentesque@suscipitest.ca","3":"euismod.et.commodo@nisi.co.uk"}}
data.pkl has the following data:
[{'Phone': {4: '420-1477', 5: '102-2189'}, 'Email': {4: 'ipsum.ac@quam.net', 5: 'Nul.fac.Sus@urnanec.net'}, 'Name': {4: 'Garrison Lindsey', 5: 'Jenna Mercado'}, 'Address': {4: 'P.O. Box 466, 7919 In Av.', 5: 'P.O. Box 484, 9648 Sit Avenue'}, 'Country': {4: 'Zambia', 5: 'Burkina Faso'}, 'City': {4: 'Dunbar', 5: 'Pollena Trocchia'}}]
As you can see, each file format has its peculiarities. Each file contains a portion of the total dataset, which altogether comprises 6 records (indexed 0 to 5), so I need to read in all of the files and combine them into a standard format that I am comfortable with. I am thinking of combining all of the files into JSON format, aiming for something standard where each "row" is represented in the same format. The object that contains the combined data from all three files should be named full_data.
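For illustration, the row-per-record format described above might look like the sketch below: a list of dictionaries, one per record, all with the same keys. The FIELDS tuple and the choice of a list of dicts are assumptions made for this example (the Email column name in particular is taken from the JSON and pickle keys); the two records are copied from data.csv and data.json.

```python
import json

# Assumed set of column names, taken from the keys in data.json and data.pkl.
FIELDS = ("Name", "Phone", "Address", "City", "Country", "Email")

# Two of the records, written in the proposed one-dict-per-row format.
full_data = [
    {
        "Name": "Hillary Benton",
        "Phone": "1-243-669-7472",
        "Address": "144-1225 In Road",
        "City": "Navsari",
        "Country": "Togo",
        "Email": "rutrum.magna.Cr@eud.edu",
    },
    {
        "Name": "Paul Merrill",
        "Phone": "1-313-739-3854",
        "Address": "916-8087 Vehicula Rd.",
        "City": "Le Mans",
        "Country": "Somalia",
        "Email": "diam.Pellentesque@suscipitest.ca",
    },
]

# Every row carries exactly the same keys, whichever file it came from.
assert all(set(row) == set(FIELDS) for row in full_data)

# A list of flat dicts serialises directly to a JSON array of objects.
encoded = json.dumps(full_data, indent=2)
print(encoded)
```

With this shape, a field is always reached the same way, for example full_data[1]["City"], regardless of the source file.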
I currently have the following code, but I am not sure how to reshape the combined object so that every record ends up in the same standard format:
import json
import csv
import pickle

full_data = []

with open("data.json", "rb") as openfile:
    full_data.append(json.load(openfile))

with open("data.pkl", "rb") as openfile:
    full_data.append(pickle.load(openfile))

with open("data.csv", encoding="utf-8") as openfile:
    csvReader = csv.DictReader(openfile)
    # convert each csv row into python dict
    for row in csvReader:
        # add this python dict to json array
        full_data.append(row)

print(full_data)
I am thinking that, between each open call and full_data.append(item), I can reshape the data row by row: pull out specific fields (e.g., name, phone) by index or by dictionary key (row["Name"], ...), rebuild the row in the desired full_data format, and then append it to the master list. Can anyone help, please? I am struggling with this.
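One way the row-by-row idea above could be sketched, using only the standard library, is shown below. The helper names (pivot_columns, read_csv, read_json, read_pickle, build_full_data) and the extra "id" key are hypothetical choices for this example, not part of the original code. It also assumes that the CSV header row has an unnamed first column holding the record index and a column named Email, which the table above suggests but does not confirm.

```python
import csv
import json
import pickle

FIELDS = ("Name", "Phone", "Address", "City", "Country", "Email")


def pivot_columns(columns):
    """Turn {field: {index: value}} into a list of one-dict-per-record rows."""
    # int() puts JSON's string keys ("2") and pickle's int keys (4) on equal footing.
    indices = sorted({int(key) for column in columns.values() for key in column})
    rows = []
    for index in indices:
        row = {"id": index}
        for field in FIELDS:
            column = columns.get(field, {})
            # Try the int key first, then the string form that JSON uses.
            row[field] = column.get(index, column.get(str(index)))
        rows.append(row)
    return rows


def read_json(path):
    with open(path, encoding="utf-8") as handle:
        return pivot_columns(json.load(handle))


def read_pickle(path):
    # Only unpickle files you trust: pickle can execute arbitrary code on load.
    with open(path, "rb") as handle:
        loaded = pickle.load(handle)
    # The sample shows the dict wrapped in a list, so accept either shape.
    chunks = loaded if isinstance(loaded, list) else [loaded]
    rows = []
    for chunk in chunks:
        rows.extend(pivot_columns(chunk))
    return rows


def read_csv(path):
    rows = []
    with open(path, newline="", encoding="utf-8") as handle:
        for record in csv.DictReader(handle):
            # Assumption: the unnamed first column ("" header) is the record index.
            raw_index = (record.get("") or "").strip()
            row = {"id": int(raw_index) if raw_index.isdigit() else None}
            row.update({field: record.get(field) for field in FIELDS})
            rows.append(row)
    return rows


def build_full_data(csv_path="data.csv", json_path="data.json", pkl_path="data.pkl"):
    full_data = read_csv(csv_path) + read_json(json_path) + read_pickle(pkl_path)
    # Order by record index; rows without a usable index sort last.
    full_data.sort(key=lambda row: (row["id"] is None, row["id"] or 0))
    return full_data
```

Run in the directory that holds the three files, full_data = build_full_data() should give a single list in which every element has the same keys, and json.dumps(full_data, indent=2) would write it out as JSON. Keeping the index as "id" is a design choice here; it can be dropped if only the six named fields are wanted.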
