How to create a custom COCO dataset for object detection

Previously, we trained an mmdetection model on a custom annotated dataset in Pascal VOC format. You are out of luck if your object detection training pipeline requires the COCO data format, since the labelImg tool we used does not support COCO annotations. If you still want to stick with that tool for annotation and convert your annotations to COCO format afterwards, this post is for you.

We will start with a brief introduction to the two annotation formats, then walk through the script that converts VOC to COCO format, and finally validate the converted result by plotting the bounding boxes and class labels.

Pascal VOC and COCO annotations

Pascal VOC annotations are saved as XML files, one XML file per image. An XML file generated by the labelImg tool contains the path to the image in the <path> element, and each bounding box is stored in an <object> element like the one below.

<object>
	<name>fig</name>
	<pose>Unspecified</pose>
	<truncated>0</truncated>
	<difficult>0</difficult>
	<bndbox>
		<xmin>256</xmin>
		<ymin>27</ymin>
		<xmax>381</xmax>
		<ymax>192</ymax>
	</bndbox>
</object>

As you can see, the bounding box is defined by two points: the upper-left and bottom-right corners.
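To make the structure concrete, here is a minimal sketch that reads the <object> elements from a labelImg-style XML string with Python's standard xml.etree.ElementTree module. The surrounding <annotation>, <filename> and <size> elements and the <path> value are illustrative assumptions, and the helper name read_voc_objects is hypothetical.

```python
import xml.etree.ElementTree as ET

# A trimmed labelImg-style annotation. The <path> value is a made-up example.
VOC_XML = """
<annotation>
    <filename>0.jpg</filename>
    <path>/data/VOC2007/JPEGImages/0.jpg</path>
    <size><width>800</width><height>600</height><depth>3</depth></size>
    <object>
        <name>fig</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>256</xmin>
            <ymin>27</ymin>
            <xmax>381</xmax>
            <ymax>192</ymax>
        </bndbox>
    </object>
</annotation>
"""


def read_voc_objects(xml_text):
    """Return a list of (class_name, xmin, ymin, xmax, ymax) tuples."""
    root = ET.fromstring(xml_text.strip())
    objects = []
    for obj in root.findall("object"):
        name = obj.findtext("name")
        box = obj.find("bndbox")
        # Go through float() first in case a tool wrote "256.0" instead of "256".
        corners = tuple(
            int(float(box.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append((name, *corners))
    return objects


print(read_voc_objects(VOC_XML))  # [('fig', 256, 27, 381, 192)]
```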

The COCO data format, by contrast, uses a single JSON file for all the annotations in a dataset, or one file for each split of the dataset (Train/Val/Test).

The bounding box is expressed as the upper-left corner coordinates followed by the box width and height, like "bbox": [x, y, width, height].
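The difference between the two box conventions comes down to a subtraction. The sketch below applies it to the "fig" box from the XML example earlier. The function names are hypothetical, and it assumes width and height are plain differences of the corner coordinates. Some converters add or subtract one pixel, so check what your pipeline expects.

```python
def voc_box_to_coco(xmin, ymin, xmax, ymax):
    """Convert VOC corner coordinates to a COCO [x, y, width, height] list."""
    width = xmax - xmin
    height = ymax - ymin
    if width <= 0 or height <= 0:
        raise ValueError("box must have xmax > xmin and ymax > ymin")
    return [xmin, ymin, width, height]


def coco_box_to_voc(x, y, width, height):
    """Inverse conversion, back to (xmin, ymin, xmax, ymax)."""
    return (x, y, x + width, y + height)


bbox = voc_box_to_coco(256, 27, 381, 192)
print(bbox)                    # [256, 27, 125, 165]
print(coco_box_to_voc(*bbox))  # (256, 27, 381, 192)
```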

Here is an example of a COCO format JSON file. It contains one image (in the top-level "images" element), 3 unique categories/classes (in the top-level "categories" element), and 2 annotated bounding boxes for that image (in the top-level "annotations" element).

{
  "type": "instances",
  "images": [
    {
      "file_name": "0.jpg",
      "height": 600,
      "width": 800,
      "id": 0
    }
  ],
  "categories": [
    {
      "supercategory": "none",
      "name": "date",
      "id": 0
    },
    {
      "supercategory": "none",
      "name": "hazelnut",
      "id": 2
    },
    {
      "supercategory": "none",
      "name": "fig",
      "id": 1
    }
  ],
  "annotations": [
    {
      "id": 1,
      "bbox": [
        100,
        116,
        140,
        170
      ],
      "image_id": 0,
      "segmentation": [],
      "ignore": 0,
      "area": 23800,
      "iscrowd": 0,
      "category_id": 0
    },
    {
      "id": 2,
      "bbox": [
        321,
        320,
        142,
        102
      ],
      "image_id": 0,
      "segmentation": [],
      "ignore": 0,
      "area": 14484,
      "iscrowd": 0,
      "category_id": 0
    }
  ]
}
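A quick way to catch conversion mistakes is to cross-check the three top-level lists against each other. The sketch below is one possible set of checks, not an official validator, and find_coco_problems is a hypothetical helper. It assumes bounding-box-only annotations, where "area" equals width times height. With segmentation masks, the area would be the mask area instead.

```python
def find_coco_problems(coco):
    """Return a list of human-readable problems found in a COCO-style dict."""
    image_sizes = {img["id"]: (img["width"], img["height"]) for img in coco["images"]}
    category_ids = {cat["id"] for cat in coco["categories"]}
    problems = []
    for ann in coco["annotations"]:
        ann_id = ann["id"]
        x, y, w, h = ann["bbox"]
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann_id}: unknown category_id")
        if ann["image_id"] not in image_sizes:
            problems.append(f"annotation {ann_id}: unknown image_id")
            continue  # cannot check bounds without the image size
        img_w, img_h = image_sizes[ann["image_id"]]
        if x < 0 or y < 0 or x + w > img_w or y + h > img_h:
            problems.append(f"annotation {ann_id}: bbox outside the image")
        if ann["area"] != w * h:
            problems.append(f"annotation {ann_id}: area does not match bbox")
    return problems


# The same values as the JSON example above, reduced to the fields being checked.
example = {
    "images": [{"file_name": "0.jpg", "height": 600, "width": 800, "id": 0}],
    "categories": [{"name": "date", "id": 0}, {"name": "hazelnut", "id": 2},
                   {"name": "fig", "id": 1}],
    "annotations": [
        {"id": 1, "bbox": [100, 116, 140, 170], "image_id": 0,
         "area": 23800, "category_id": 0},
        {"id": 2, "bbox": [321, 320, 142, 102], "image_id": 0,
         "area": 14484, "category_id": 0},
    ],
}
print(find_coco_problems(example))  # []
```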

Convert Pascal VOC to COCO annotation

Once you have some annotated XML and image files, put them in a folder structure similar to the one below.

data
└── VOC2007
    ├── Annotations
    │   ├── 0.xml
    │   ├── ...
    │   └── 9.xml
    └── JPEGImages
        ├── 0.jpg
        ├── ...
        └── 9.jpg

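Then run the voc2coco.py script from my GitHub repo as shown below to generate a COCO formatted JSON file. The first argument is the folder containing the XML annotations and the second is the path of the output JSON file, so adjust both to match your own layout.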

python voc2coco.py ./data/VOC/Annotations ./data/coco/output.json
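To show roughly what such a conversion involves, here is a minimal sketch written for this explanation. It is not the voc2coco.py script itself, and the function names are hypothetical. It assumes every XML file has <filename> and <size> elements (labelImg writes both). It assigns image ids in sorted file-name order and category ids in order of first appearance.

```python
import json
import xml.etree.ElementTree as ET
from pathlib import Path


def voc_dir_to_coco(xml_dir):
    """Build a COCO-style dict from every .xml file directly inside xml_dir."""
    coco = {"type": "instances", "images": [], "categories": [], "annotations": []}
    category_ids = {}  # class name -> category id
    next_annotation_id = 1
    for image_id, xml_path in enumerate(sorted(Path(xml_dir).glob("*.xml"))):
        root = ET.parse(xml_path).getroot()
        size = root.find("size")
        coco["images"].append({
            "file_name": root.findtext("filename"),
            "height": int(size.findtext("height")),
            "width": int(size.findtext("width")),
            "id": image_id,
        })
        for obj in root.findall("object"):
            name = obj.findtext("name")
            if name not in category_ids:
                category_ids[name] = len(category_ids)
                coco["categories"].append(
                    {"supercategory": "none", "name": name, "id": category_ids[name]}
                )
            box = obj.find("bndbox")
            xmin, ymin, xmax, ymax = (
                int(float(box.findtext(tag)))
                for tag in ("xmin", "ymin", "xmax", "ymax")
            )
            # COCO stores the top-left corner plus width and height.
            width, height = xmax - xmin, ymax - ymin
            coco["annotations"].append({
                "id": next_annotation_id,
                "image_id": image_id,
                "category_id": category_ids[name],
                "bbox": [xmin, ymin, width, height],
                "area": width * height,
                "segmentation": [],
                "ignore": 0,
                "iscrowd": 0,
            })
            next_annotation_id += 1
    return coco


def write_coco_json(xml_dir, json_path):
    """Convert xml_dir and write the result to json_path, creating folders as needed."""
    json_path = Path(json_path)
    json_path.parent.mkdir(parents=True, exist_ok=True)
    json_path.write_text(json.dumps(voc_dir_to_coco(xml_dir), indent=2))
```

Calling write_coco_json with the same two paths as the command above would produce a file with the "images", "categories" and "annotations" lists described earlier. The ids may differ from those the real script assigns.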

Once we have the JSON file, we can visualize the COCO annotations by drawing the bounding boxes and class labels as an overlay on the image. Open COCO_Image_Viewer.ipynb in Jupyter Notebook and find the following cell, which calls the display_image method to generate an SVG graphic right inside the notebook.

html = coco_dataset.display_image(0, use_url=False)
IPython.display.HTML(html)

The first argument is the image id. Our demo dataset has 18 images in total, so you can try any value from 0 to 17.
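If you are curious how an SVG overlay like this can be built, the sketch below generates one from a COCO-style dict using only the standard library. It is an illustration under my own assumptions, not the notebook's display_image implementation. The function name annotations_to_svg and the styling choices are hypothetical.

```python
from html import escape


def annotations_to_svg(coco, image_id):
    """Return an SVG string with one rectangle and label per annotation of image_id."""
    image = next(img for img in coco["images"] if img["id"] == image_id)
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    img_w, img_h = image["width"], image["height"]
    href = escape(image["file_name"])
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{img_w}" height="{img_h}">',
        f'<image href="{href}" width="{img_w}" height="{img_h}"/>',
    ]
    for ann in coco["annotations"]:
        if ann["image_id"] != image_id:
            continue
        x, y, w, h = ann["bbox"]  # COCO order: top-left x, top-left y, width, height
        label = escape(names[ann["category_id"]])
        label_y = max(y - 4, 12)  # keep the label inside the image near the top edge
        parts.append(
            f'<rect x="{x}" y="{y}" width="{w}" height="{h}" '
            'fill="none" stroke="red" stroke-width="2"/>'
        )
        parts.append(f'<text x="{x}" y="{label_y}" fill="red" font-size="14">{label}</text>')
    parts.append("</svg>")
    return "\n".join(parts)


demo = {
    "images": [{"file_name": "0.jpg", "height": 600, "width": 800, "id": 0}],
    "categories": [{"supercategory": "none", "name": "date", "id": 0}],
    "annotations": [
        {"id": 1, "bbox": [100, 116, 140, 170], "image_id": 0, "category_id": 0},
        {"id": 2, "bbox": [321, 320, 142, 102], "image_id": 0, "category_id": 0},
    ],
}
svg = annotations_to_svg(demo, 0)
```

In a notebook, the resulting string could be passed to IPython.display.HTML in the same way as the html value in the cell above. The image itself only shows up if the file_name path resolves from the notebook's location.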

Conclusion and further reading

In this quick tutorial, you have learned how to stick with the popular labelImg tool for custom dataset annotation and then convert the Pascal VOC annotations to a COCO dataset, so you can train an object detection pipeline that requires COCO format datasets.

You might find the following links useful:

How to train an object detection model with mmdetection - my previous post about creating custom Pascal VOC annotation files and training an object detection model with the PyTorch mmdetection framework.

COCO data format

Pascal VOC documentation

Download labelImg for the bounding box annotation.

To get the source code for this post, check out my GitHub repo.
