In this tutorial, you will learn how to easily train a custom object detection model with the TensorFlow Object Detection API and Google Colab's free GPU.
Annotated images and source code to complete this tutorial are included.
Let's start by creating the annotated dataset.
In my case, I took the photos with my iPhone, and each comes at 4032 x 3024 resolution, which would overwhelm the model if used as direct input. Instead, resize the photos to a uniform (800, 600).
You can use the resize_images.py script for this.
First, save your photos, ideally as jpg files, to ./data/raw, then run:
python resize_images.py --raw-dir ./data/raw --save-dir ./data/images --ext jpg --target-size "(800, 600)"
The resized images will be saved in ./data/images/
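As a quick sanity check on the target size, the sketch below confirms that (800, 600) keeps the 4:3 aspect ratio of the 4032 x 3024 originals and shows how many fewer pixels each image carries afterwards. The helper name resize_summary is hypothetical and is not part of resize_images.py.

```python
from fractions import Fraction


def resize_summary(src_size, dst_size):
    """Return (scale_x, scale_y, pixel_reduction) for a resize.

    Sizes are (width, height) tuples. Exact fractions are used so that
    equal scale factors compare equal without floating-point noise.
    """
    (src_w, src_h), (dst_w, dst_h) = src_size, dst_size
    scale_x = Fraction(src_w, dst_w)
    scale_y = Fraction(src_h, dst_h)
    pixel_reduction = Fraction(src_w * src_h, dst_w * dst_h)
    return scale_x, scale_y, pixel_reduction


scale_x, scale_y, reduction = resize_summary((4032, 3024), (800, 600))
# Equal scale factors mean the aspect ratio is preserved (no distortion).
print(float(scale_x), float(scale_y), float(reduction))
```

Both axes shrink by the same factor, so objects are not stretched, and each resized image holds roughly 25 times fewer pixels than the original.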
Next, we split those files into two directories, ./data/images/train and ./data/images/test
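The split can be done by hand for a small dataset, or with a short script. Here is a minimal sketch, assuming a random split; the function name split_images, the 20% test fraction and the fixed seed are illustrative choices, not taken from the tutorial's repository.

```python
import random
import shutil
from pathlib import Path


def split_images(image_dir, test_fraction=0.2, ext="jpg", seed=0):
    """Move images from image_dir into image_dir/train and image_dir/test.

    Returns the number of files placed in each split.
    """
    image_dir = Path(image_dir)
    # Collect the files before creating the sub-directories.
    files = sorted(image_dir.glob(f"*.{ext}"))
    # A seeded shuffle keeps the split reproducible between runs.
    random.Random(seed).shuffle(files)
    n_test = round(len(files) * test_fraction)
    splits = {"test": files[:n_test], "train": files[n_test:]}
    for name, members in splits.items():
        target = image_dir / name
        target.mkdir(exist_ok=True)
        for image in members:
            shutil.move(str(image), str(target / image.name))
    return {name: len(members) for name, members in splits.items()}
```

Calling split_images("./data/images") once, before annotating, would leave the layout described above.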
Annotate the resized images. The annotation xml files are saved in the ./data/images/train and ./data/images/test directories.
Tips: use shortcuts (w: draw, as well as d and a).
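Convert the annotations to tfrecord files (source included in the Colab notebook). After running this step, you will have two files, train.record and test.record.
There are two steps in doing so: converting the *.xml files of each dataset to a single *.csv file, then converting each *.csv file to a *.record file.
Use the following scripts to generate the tfrecord files as well as the label_map.pbtxt file.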
# Convert train folder annotation xml files to a single csv file,
# generate the `label_map.pbtxt` file to `data/annotations` directory as well.
python xml_to_csv.py -i data/images/train -o data/annotations/train_labels.csv -l data/annotations
# Convert test folder annotation xml files to a single csv.
python xml_to_csv.py -i data/images/test -o data/annotations/test_labels.csv
# Generate `train.record`
python generate_tfrecord.py --csv_input=data/annotations/train_labels.csv --output_path=data/annotations/train.record --img_path=data/images/train --label_map data/annotations/label_map.pbtxt
# Generate `test.record`
python generate_tfrecord.py --csv_input=data/annotations/test_labels.csv --output_path=data/annotations/test.record --img_path=data/images/test --label_map data/annotations/label_map.pbtxt
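To make the first conversion step less of a black box, here is a rough sketch of what an xml-to-csv pass might look like. It is not the xml_to_csv.py from the repository: it assumes the annotations are Pascal VOC style xml files (filename, size and object/bndbox elements), and the column names and helper names are hypothetical.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumed column layout; the real script may use different names.
COLUMNS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]


def xml_dir_to_rows(xml_dir):
    """Collect one row per annotated box from every xml file in xml_dir."""
    rows = []
    for xml_file in sorted(Path(xml_dir).glob("*.xml")):
        root = ET.parse(xml_file).getroot()
        filename = root.findtext("filename")
        width = int(root.findtext("size/width"))
        height = int(root.findtext("size/height"))
        for obj in root.iter("object"):
            box = obj.find("bndbox")
            coords = [int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")]
            rows.append([filename, width, height, obj.findtext("name"), *coords])
    return rows


def write_csv(rows, csv_path):
    """Write the header and all rows to a single csv file."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(COLUMNS)
        writer.writerows(rows)


def label_map_text(class_names):
    """Build label map text; ids start at 1 because 0 is reserved for background."""
    items = []
    for idx, name in enumerate(sorted(set(class_names)), start=1):
        items.append(f"item {{\n  id: {idx}\n  name: '{name}'\n}}\n")
    return "\n".join(items)
```

For the three classes in this tutorial, label_map_text(["date", "fig", "hazelnut"]) would produce three item entries with ids 1 to 3.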
Instead of training the model from scratch, we will do transfer learning from a model pre-trained to detect everyday objects.
Transfer learning requires less training data compared to training from scratch.
But keep in mind that transfer learning assumes your training data is somewhat similar to the data used to train the base model. In our case, the base model is trained on the COCO dataset of common objects, and the 3 target objects we want the model to detect are fruits and nuts, i.e. "date", "fig" and "hazelnut". They are similar to the objects in the COCO dataset. On the other hand, if your target objects are lung nodules in CT images, transfer learning might not work so well, since they are entirely different from the common objects in COCO. In that case, you probably need many more annotations and to train the model from scratch.
To do the transfer learning training, we first download the pre-trained model weights/checkpoints and then edit the corresponding pipeline config file to tell the trainer about our dataset and training settings.
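The pipeline config is a plain-text protobuf file, so one way to fill it in is with a few regular-expression substitutions. The sketch below assumes the field names used by the Object Detection API sample configs (fine_tune_checkpoint, input_path, label_map_path, num_classes, batch_size, num_steps) and assumes the train input reader appears before the eval input reader in the file. set_config_fields is a hypothetical helper, not part of the API.

```python
import re


def set_config_fields(text, *, fine_tune_checkpoint, train_record, test_record,
                      label_map_path, num_classes, batch_size, num_steps):
    """Return a copy of the pipeline config text with our values filled in."""

    def quoted(key, value, s):
        # A function replacement avoids backslashes in paths being read as escapes.
        return re.sub(rf'\b{key}: ".*?"', lambda m: f'{key}: "{value}"', s)

    def number(key, value, s):
        # \b stops "batch_size" from matching inside a longer field name.
        return re.sub(rf'\b{key}: \d+', f'{key}: {value}', s)

    text = quoted("fine_tune_checkpoint", fine_tune_checkpoint, text)
    text = quoted("label_map_path", label_map_path, text)

    # Assumption: the first input_path belongs to the train reader, the second to eval.
    records = iter([train_record, test_record])
    text = re.sub(r'\binput_path: ".*?"',
                  lambda m: f'input_path: "{next(records)}"', text, count=2)

    text = number("num_classes", num_classes, text)
    text = number("batch_size", batch_size, text)
    text = number("num_steps", num_steps, text)
    return text
```

The result can then be written back to the config file that is passed to the training script as --pipeline_config_path.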
After that, we can start the training, where the model_dir is the path of a new directory to store our output model.
!python /content/models/research/object_detection/model_main.py \
--pipeline_config_path={filename} \
--model_dir={model_dir} \
--alsologtostderr \
--num_train_steps={num_steps} \
--num_eval_steps={num_eval_steps}
Inside the Colab notebook, TensorBoard is also configured to help you visualize the training progress and results. Here are two screenshots of TensorBoard showing the predictions on test images and the loss value over the course of training.
Once your training job is complete, you need to export the newly trained model as an inference graph, which will later be used to perform the object detection. The conversion can be done as follows:
!python /content/models/research/object_detection/export_inference_graph.py \
--input_type=image_tensor \
--pipeline_config_path=/content/models/research/object_detection/samples/configs/faster_rcnn_inception_v2_pets.config \
--output_directory=fine_tuned_model \
--trained_checkpoint_prefix={last_model_path}
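The {last_model_path} placeholder above has to point at the most recent checkpoint in the model directory. One way to find it is sketched below, assuming checkpoints follow the model.ckpt-<step> naming with a matching .meta file; latest_checkpoint_prefix is a hypothetical helper and may differ from what the notebook does.

```python
import re
from pathlib import Path


def latest_checkpoint_prefix(model_dir):
    """Return the prefix (path without extension) of the highest-step checkpoint."""
    steps = []
    for meta in Path(model_dir).glob("model.ckpt-*.meta"):
        match = re.fullmatch(r"model\.ckpt-(\d+)\.meta", meta.name)
        if match:
            steps.append(int(match.group(1)))
    if not steps:
        raise FileNotFoundError(f"no model.ckpt-*.meta files in {model_dir}")
    # Compare step numbers as integers so that 100 sorts after 20.
    return str(Path(model_dir) / f"model.ckpt-{max(steps)}")
```

TensorFlow's own tf.train.latest_checkpoint(model_dir) serves the same purpose by reading the checkpoint index file that the trainer writes.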
Training an object detection model can be resource intensive and time-consuming. This tutorial shows it can be as simple as annotating 20 images and running a Jupyter notebook on Google Colab. In the future, we will look into deploying the trained model on different hardware and benchmarking its performance.
Stay tuned and don't forget to check out the GitHub repository and the Google Colab Notebook for this tutorial.