How to create custom COCO data set for instance segmentation



In this post, I will show you how simple it is to create your custom COCO dataset and quickly train an instance segmentation model for free with Google Colab's GPU.

If you just want to know how to create a custom COCO dataset for object detection, check out my previous tutorial.

Instance segmentation annotation is different from object detection annotation since it requires polygons instead of bounding boxes. There are many tools freely available, such as labelme and coco-annotator. labelme is easy to install and runs on all major operating systems; however, it lacks native support for exporting annotations in the COCO data format, which many model training frameworks and pipelines require. coco-annotator, on the other hand, is a web-based application that takes additional effort to get up and running on your machine. So which way takes the least effort?

Here is an overview of how you can make your own COCO dataset for instance segmentation.

  • Download labelme, run the application and annotate polygons on your images.
  • Run my script to convert the labelme annotation files to a COCO dataset JSON file.

Annotate data with labelme

labelme works much like labelimg does for bounding box annotation, so anyone familiar with labelimg should be able to start annotating with labelme in no time.

You can install labelme as shown below, find prebuilt executables in the releases section, or download the latest Windows 64-bit executable I built earlier.

# python3
conda create --name=labelme python=3.6
source activate labelme
# or "activate labelme" on Windows
# conda install -c conda-forge pyside2
# conda install pyqt
pip install pyqt5  # pyqt5 can be installed via pip on python3
pip install labelme

When you open the tool, click the "Open Dir" button and navigate to the folder that contains your image files; then you can start drawing polygons. To finish a polygon, press the "Enter" key, and the tool should connect the first and last points automatically. When you are done annotating an image, press the "D" shortcut key to move to the next image. I annotated 18 images, each containing multiple objects, which took me about 30 minutes.


Once you have annotated all the images, you will find one JSON file per image in your images directory, each sharing its image's base file name. Those are labelme annotation files, which we will convert into a single COCO dataset annotation JSON file in the next step (or two JSON files for a train/test split).
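Before converting, it can help to check what you actually annotated. The snippet below is a minimal sketch, not part of my script: it assumes each labelme JSON file keeps its polygons in a "shapes" list whose entries carry a "label", and the function name summarize_labelme_dir is made up for illustration.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_labelme_dir(images_dir):
    """Count annotated shapes per label across the labelme JSON files in a directory."""
    label_counts = Counter()
    for json_path in sorted(Path(images_dir).glob("*.json")):
        with open(json_path, encoding="utf-8") as f:
            data = json.load(f)
        # Assumption: each labelme file lists its polygons under the "shapes" key.
        for shape in data.get("shapes", []):
            label_counts[shape["label"]] += 1
    return label_counts
```

Calling summarize_labelme_dir("images") returns a Counter of labels, which makes misspelled class names easy to spot before they end up as extra categories in the COCO file.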

Convert labelme annotation files to COCO dataset format

You can find the conversion script on my GitHub. To apply the conversion, you only need to pass in one argument: the path to the images directory.

python images

The script depends on three pip packages: labelme, numpy, and pillow. Go ahead and install any that are missing with pip. After executing the script, you will find a file named trainval.json in the current directory; that is the COCO dataset annotation JSON file.
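To give a feel for what the conversion involves, here is a minimal sketch of turning one labelme polygon into one COCO annotation entry. It is not the code from my script: the function names are hypothetical, and it derives the bounding box and area directly from the polygon points (the area via the shoelace formula) using only plain Python.

```python
def polygon_area(points):
    """Area of a simple polygon given as [[x, y], ...], via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def shape_to_coco_annotation(points, ann_id, image_id, category_id):
    """Build a COCO-style annotation dict from one polygon (hypothetical helper)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        # COCO stores a polygon as one flat list: [x1, y1, x2, y2, ...]
        "segmentation": [[coord for point in points for coord in point]],
        # COCO bounding boxes are [x, y, width, height]
        "bbox": [x_min, y_min, max(xs) - x_min, max(ys) - y_min],
        "area": polygon_area(points),
        "iscrowd": 0,
    }
```

A full COCO file also needs the "images" and "categories" lists that these annotations refer to by id; the sketch covers only the per-polygon part.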

Then, optionally, you can verify the annotations by opening the COCO_Image_Viewer.ipynb Jupyter notebook.

If everything works, it should show something like below.
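As a non-visual complement to the notebook, a quick structural check can catch broken references in the generated file. This is a rough sketch under a few assumptions: the file follows the usual COCO layout with "images", "annotations", and "categories" lists, segmentations are polygon lists, and the helper names check_coco and check_coco_file are invented for this example.

```python
import json


def check_coco(coco):
    """Return a list of problems found in a COCO-style dict (empty if none)."""
    problems = []
    image_ids = {img["id"] for img in coco.get("images", [])}
    category_ids = {cat["id"] for cat in coco.get("categories", [])}
    for ann in coco.get("annotations", []):
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
        segmentation = ann.get("segmentation", [])
        if isinstance(segmentation, list):
            for polygon in segmentation:
                # A polygon is a flat [x1, y1, x2, y2, ...] list with at least 3 points.
                if len(polygon) < 6 or len(polygon) % 2 != 0:
                    problems.append(f"annotation {ann['id']}: malformed polygon")
    return problems


def check_coco_file(path):
    """Load a COCO annotation JSON file and run the checks above."""
    with open(path, encoding="utf-8") as f:
        return check_coco(json.load(f))
```

Running check_coco_file("trainval.json") should return an empty list when every annotation points at an existing image and category.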


Train an instance segmentation model with mmdetection framework

If you are unfamiliar with the mmdetection framework, I suggest giving my previous post a try: "How to train an object detection model with mmdetection". The framework allows you to train many object detection and instance segmentation models with configurable backbone networks through the same pipeline. The only thing you need to modify is the model config Python file, where you define the model type, training epochs, type and path of the dataset, and so on. For instance segmentation models, several options are available; you can do transfer learning with Mask R-CNN or Cascade Mask R-CNN using pre-trained backbone networks. To make it even more beginner-friendly, just run the Google Colab notebook online with a free GPU and download the final trained model. The notebook is quite similar to the previous object detection demo, so I will let you run it and play with it.

Here is the final prediction result after training a Mask R-CNN model for 20 epochs, which took less than 10 minutes.


Feel free to try other model config files or tweak the existing one by increasing the training epochs or changing the batch size, and see how that might improve the results. Also note that, for simplicity and because the demo dataset is small, we skipped the train/test split. You can accomplish that by manually splitting the labelme JSON files into two directories and running the script on each directory to generate two COCO annotation JSON files.
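If you would rather not split the files by hand, something like the following sketch could do it. It is only an illustration: the function name, the 80/20 default ratio, and the choice to copy any file sharing a JSON file's base name (so the image travels with its annotation) are my assumptions here, not part of the conversion script.

```python
import random
import shutil
from pathlib import Path


def split_labelme_dir(images_dir, train_dir, test_dir, test_ratio=0.2, seed=0):
    """Copy labelme JSON files (and same-named images) into train and test directories."""
    images_dir = Path(images_dir)
    json_paths = sorted(images_dir.glob("*.json"))
    random.Random(seed).shuffle(json_paths)  # fixed seed keeps the split reproducible
    n_test = int(round(len(json_paths) * test_ratio))
    groups = [(Path(test_dir), json_paths[:n_test]), (Path(train_dir), json_paths[n_test:])]
    for target_dir, paths in groups:
        target_dir.mkdir(parents=True, exist_ok=True)
        for json_path in paths:
            # Copy the annotation and every file with the same base name, e.g. the image.
            for candidate in images_dir.iterdir():
                if candidate.is_file() and candidate.stem == json_path.stem:
                    shutil.copy(candidate, target_dir / candidate.name)
    return len(json_paths) - n_test, n_test
```

For example, split_labelme_dir("images", "train", "test") leaves the original directory untouched and returns the number of annotation files placed in each split; you can then run the conversion script once per directory.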

Conclusion and further reading

Training an instance segmentation model might look daunting, since doing so might require a significant amount of computing and storage resources. But that did not keep us from creating one with around 20 annotated images and Colab's free GPU.

Resources you might find useful

My GitHub repo for the labelme2coco script, COCO image viewer notebook, and my demo dataset files.

labelme GitHub repo, where you can find more information about the annotation tool.

The notebook you can run to train a mmdetection instance segmentation model on Google Colab.

Go to the mmdetection GitHub repo to learn more about the framework.

My previous post - How to train an object detection model with mmdetection
