Question: Create a Python script to convert an object detection annotations CSV file into a JSON file in the format of a dictionary with keys "images" and "annotations"

Create a Python script to convert an object detection annotations CSV file into a JSON file in the format of a dictionary with the keys "images" and "annotations". The value for the "images" key should be a list of entries, one for each image, of the form {"file_name": image_name, "height": height, "width": width, "id": image_id}. The value for the "annotations" key should be a list of entries, one for each bounding box, of the form {"image_id": image_id, "bbox": [xmin, ymin, xmax, ymax], "category_id": bbox_label}.

Our current CSV file has six columns: the first holds the file_name, the second the bounding box ("bbox") X value, the third the bounding box ("bbox") Y value, the fourth the image width, the fifth the image height, and the sixth the region id, or "category_id".

Make sure the script uses pandas and/or numpy to read the values in as whole numbers with no decimals at all, and that it verifies no decimals are added or read in from the CSV file. Try the script out with a CSV file with some integers in the columns to make sure it works as described above.
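For illustration, the target layout described above can be written out as a small Python dictionary. This is only a sketch: the file name and all numbers are hypothetical sample values, not taken from the real CSV file.

```python
import json

# Hypothetical sample values, chosen only to illustrate the layout.
example = {
    "images": [
        {"file_name": "img_001.jpg", "height": 480, "width": 640, "id": 0},
    ],
    "annotations": [
        {"image_id": 0, "bbox": [55, 40, 120, 90], "category_id": 3},
    ],
}

# Serialize with the same indentation the conversion script uses.
print(json.dumps(example, indent=4))
```

Every numeric value here is a Python int, so json.dumps writes it without a decimal point.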
Here's the code I have so far. It gave me the error "error converting row 0: invalid literal for int() with base 10: '55.0'",
even though the value in the CSV file is just '55', without the added decimal.
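One way a cell written as 55 can turn into '55.0' is pandas' type inference: if a file is read without dtype=str and an integer column contains an empty cell, pandas stores the column as float64. Whether that is what happened here cannot be told from the error alone, so the following is only a sketch of the effect, using a hypothetical two-row CSV with a made-up column named x. It also shows the nullable "Int64" dtype, which keeps whole numbers whole.

```python
import io

import pandas as pd

# Hypothetical two-row CSV; the second row has an empty "x" cell.
csv_text = "file_name,x\na.jpg,55\nb.jpg,\n"

# Default type inference: the empty cell becomes NaN, which forces float64.
inferred = pd.read_csv(io.StringIO(csv_text))
as_text = str(inferred["x"].iloc[0])

# Nullable integer dtype: whole numbers stay whole, the gap becomes <NA>.
nullable = pd.read_csv(io.StringIO(csv_text), dtype={"x": "Int64"})

print(inferred["x"].dtype, as_text)
print(nullable["x"].dtype, nullable["x"].iloc[0])

# int() only accepts integer literals, so the float-looking text is rejected.
try:
    int(as_text)
except ValueError as error:
    print(error)
```

Under these assumptions the first read yields a float64 column whose first value prints as 55.0, and passing that text to int() raises the same kind of ValueError as the one quoted above.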
Code:
import json
from pathlib import Path

import pandas as pd


def to_whole_number(text):
    """Convert a CSV cell such as '55' or '55.0' to int; reject real fractions."""
    value = float(str(text).strip())
    if not value.is_integer():
        # Also reached for missing cells, which pandas reads as NaN
        raise ValueError(f"{text!r} is not a whole number")
    return int(value)


def convert_csv_to_json(csv_file_path, json_file_path):
    # Ensure the input and output file paths are Path objects
    csv_file_path = Path(csv_file_path)
    json_file_path = Path(json_file_path)

    # Check if the CSV file exists
    if not csv_file_path.is_file():
        print(f"Error: The file {csv_file_path} does not exist!")
        return

    # Initialize lists for images and annotations
    images = []
    annotations = []
    image_id_map = {}  # To track and assign unique IDs to each image
    image_id_counter = 0  # Counter for image IDs

    # Read the CSV file using pandas
    try:
        df = pd.read_csv(csv_file_path, dtype=str)  # Read all columns as strings
    except Exception as e:
        print(f"Error reading CSV file: {e}")
        return

    for index, row in df.iterrows():
        # Ensure that each row has the expected number of columns
        if len(row) < 6:
            print(f"Warning: Row {index} does not have enough columns. Skipping...")
            continue

        # Extract data from the row by position (.iloc), not by column label
        file_name = str(row.iloc[0]).strip()  # File name as string

        # Convert the numeric cells to whole numbers. A plain
        # str.replace('.0', '') is not used here because it would also
        # rewrite values such as '10.05' into '105'.
        try:
            bbox_x = to_whole_number(row.iloc[1])  # Bounding box X
            bbox_y = to_whole_number(row.iloc[2])  # Bounding box Y
            image_width = to_whole_number(row.iloc[3])  # Image width
            image_height = to_whole_number(row.iloc[4])  # Image height
            category_id = to_whole_number(row.iloc[5])  # Category ID
        except ValueError as e:
            print(f"Error converting row {index}: {e}. Assigning default values (0 for bbox).")
            bbox_x = bbox_y = 0
            image_width = image_height = 0
            category_id = -1  # Assign a default category ID

        # Check if the image is already added; if not, add it to the images list
        if file_name not in image_id_map:
            image_entry = {
                "file_name": file_name,
                "height": image_height,
                "width": image_width,
                "id": image_id_counter,
            }
            images.append(image_entry)
            image_id_map[file_name] = image_id_counter
            image_id_counter += 1

        # Get the image_id for this image
        image_id = image_id_map[file_name]

        # Create entry for the bounding box annotation.
        # xmax and ymax are computed as x + width and y + height (columns 4 and 5).
        annotation_entry = {
            "image_id": image_id,
            "bbox": [bbox_x, bbox_y, bbox_x + image_width, bbox_y + image_height],
            "category_id": category_id,
        }
        annotations.append(annotation_entry)

    # Combine images and annotations into the final dictionary
    output_dict = {
        "images": images,
        "annotations": annotations,
    }

    # Write the output dictionary to the JSON file
    with json_file_path.open(mode="w") as f:
        json.dump(output_dict, f, indent=4)
    print(f"Successfully converted {csv_file_path} to {json_file_path}")


if __name__ == "__main__":
    # Specify your input and output file paths here
    csv_path = "C:\\path\\to\\your\\annotations.csv"  # Change this to your CSV file path
    json_path = "C:\\path\\to\\your\\output.json"  # Change this to your desired JSON file path

    # Run the conversion function
    convert_csv_to_json(csv_path, json_path)
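
The question also asks for a trial run on a CSV file with integer columns. A minimal, self-contained sketch of such a check follows. It is a column-wise pandas variant rather than the row loop above, and it rests on several assumptions: the header names (file_name, bbox_x, bbox_y, width, height, category_id) and the sample rows are hypothetical, and xmax and ymax are taken as x + width and y + height, as in the posted code.

```python
import io
import json

import pandas as pd

# Hypothetical sample data; header names are assumptions for this sketch.
SAMPLE_CSV = """file_name,bbox_x,bbox_y,width,height,category_id
img_001.jpg,55,40,100,80,1
img_001.jpg,10,20,30,40,2
img_002.jpg,0,5,64,64,1
"""

NUMERIC_COLUMNS = ["bbox_x", "bbox_y", "width", "height", "category_id"]


def load_whole_numbers(source):
    """Read the CSV and verify that every numeric cell is a whole number."""
    df = pd.read_csv(source)
    for column in NUMERIC_COLUMNS:
        values = pd.to_numeric(df[column], errors="raise")
        if values.isna().any() or (values % 1 != 0).any():
            raise ValueError(f"column {column!r} has a missing or fractional value")
        df[column] = values.astype("int64")
    return df


def build_dict(df):
    """Build the {"images": [...], "annotations": [...]} dictionary."""
    # IDs are assigned in order of first appearance of each file name.
    image_ids = {name: i for i, name in enumerate(df["file_name"].unique())}
    images = [
        {
            "file_name": r.file_name,
            "height": int(r.height),
            "width": int(r.width),
            "id": image_ids[r.file_name],
        }
        for r in df.drop_duplicates("file_name").itertuples(index=False)
    ]
    annotations = [
        {
            "image_id": image_ids[r.file_name],
            # int() turns numpy integers into plain ints that json can write.
            "bbox": [
                int(r.bbox_x),
                int(r.bbox_y),
                int(r.bbox_x + r.width),
                int(r.bbox_y + r.height),
            ],
            "category_id": int(r.category_id),
        }
        for r in df.itertuples(index=False)
    ]
    return {"images": images, "annotations": annotations}


result = build_dict(load_whole_numbers(io.StringIO(SAMPLE_CSV)))
text = json.dumps(result, indent=4)

# Apart from the ".jpg" extensions, the JSON text should contain no dots.
assert "." not in text.replace(".jpg", "")
print(text)
```

A cell such as 55.5 or an empty cell makes load_whole_numbers raise a ValueError instead of silently rounding, which is one way to meet the "no decimals" requirement.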
