How to run TensorFlow Object Detection model on Jetson Nano


Previously, you learned how to run a Keras image classification model on the Jetson Nano; this time, you will learn how to run a TensorFlow object detection model on it. It could be a pre-trained model from the TensorFlow detection model zoo, which detects everyday objects like people, cars, or dogs, or it could be a custom-trained object detection model that detects your own objects.

For this tutorial, we will convert the SSD MobileNet V1 model trained on the COCO dataset for common object detection.

Here is a breakdown of how to make it happen, slightly different from the previous image classification tutorial.

  1. Download the pre-trained model checkpoint, build a TensorFlow detection graph, then create an inference graph with TensorRT.
  2. Load the TensorRT inference graph on the Jetson Nano and make predictions.

Those two steps will be handled in two separate Jupyter Notebooks, with the first one running on a development machine and the second one running on the Jetson Nano.

Before going any further, make sure you have set up your Jetson Nano and installed TensorFlow.

Step 1: Create TensorRT model

Run this step on your development machine with a TensorFlow nightly build, which includes TF-TRT by default, or run it on this Colab notebook's free GPU.
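
If you prefer your own machine over Colab, installing the nightly GPU build is typically a one-liner; this sketch assumes a CUDA-capable environment and the tf-nightly-gpu package name current at the time of writing.

!pip install -q tf-nightly-gpu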

In the notebook, you will start by installing the TensorFlow Object Detection API and setting up the relevant paths. The official installation documentation might look daunting to beginners, but you can also do it by running just one notebook cell.

%cd /content
!git clone --quiet https://github.com/tensorflow/models.git

!apt-get install -qq protobuf-compiler python-pil python-lxml python-tk

!pip install -q Cython contextlib2 pillow lxml matplotlib

!pip install -q pycocotools

%cd /content/models/research
!protoc object_detection/protos/*.proto --python_out=.

import os
import sys
os.environ['PYTHONPATH'] += ':/content/models/research/:/content/models/research/slim/'
sys.path.append("/content/models/research/slim/")

!python object_detection/builders/model_builder_test.py

Next, you will download and build a detection graph from the pre-trained ssd_mobilenet_v1_coco checkpoint, or select another one from the list provided in the notebook.

# download_detection_model and build_detection_graph come from NVIDIA's
# tf_trt_models helper library (https://github.com/NVIDIA-AI-IOT/tf_trt_models).
from tf_trt_models.detection import download_detection_model, build_detection_graph

MODEL = 'ssd_mobilenet_v1_coco'
config_path, checkpoint_path = download_detection_model(MODEL, 'data')

frozen_graph, input_names, output_names = build_detection_graph(
    config=config_path,
    checkpoint=checkpoint_path,
    score_threshold=0.3,
    iou_threshold=0.5,
    batch_size=1
)

The default TensorFlow object detection model takes a variable batch size; it is now fixed to 1 because the Jetson Nano is a resource-constrained device. The build_detection_graph call applies several other changes to the TensorFlow graph:

  • The score threshold is set to 0.3, so the model will remove any prediction results with a confidence score lower than that.
  • The IoU (intersection over union) threshold is set to 0.5, so detected objects of the same class whose boxes overlap beyond it are removed. You can read more about IoU and non-max suppression here.
  • Modifications are applied to the frozen object detection graph for improved speed and reduced memory consumption.

Next, we create a TensorRT inference graph just like the image classification model.

import tensorflow.contrib.tensorrt as trt

trt_graph = trt.create_inference_graph(
    input_graph_def=frozen_graph,
    outputs=output_names,
    max_batch_size=1,
    max_workspace_size_bytes=1 << 25,
    precision_mode='FP16',
    minimum_segment_size=50
)
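
As a quick sanity check (not part of the original notebook), you can count how many subgraphs TF-TRT actually replaced with TensorRT engines; a count of zero means nothing was converted.

# Each TRTEngineOp node is a subgraph that now runs inside TensorRT.
trt_engine_ops = [n for n in trt_graph.node if n.op == 'TRTEngineOp']
print('TRTEngineOp count:', len(trt_engine_ops))
print('Total nodes in optimized graph:', len(trt_graph.node))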

Once you have the TensorRT inference graph, you can save it as a .pb file and download it from Colab to your local machine, then copy it to your Jetson Nano as needed.

with open('./data/trt_graph.pb', 'wb') as f:
    f.write(trt_graph.SerializeToString())

# Download the TensorRT graph .pb file from Colab to your local machine.
from google.colab import files

files.download('./data/trt_graph.pb')

Step 2: Load the TensorRT graph and make predictions

On your Jetson Nano, save the downloaded graph file to ./model/trt_graph.pb, then start a Jupyter Notebook with the command jupyter notebook --ip=0.0.0.0. The following code loads the TensorRT graph and makes it ready for inference.

import tensorflow as tf

def get_frozen_graph(graph_file):
    """Read Frozen Graph file from disk."""
    with tf.gfile.FastGFile(graph_file, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    return graph_def

# The TensorRT inference graph file downloaded from Colab or your local machine.
pb_fname = "./model/trt_graph.pb"
trt_graph = get_frozen_graph(pb_fname)

input_names = ['image_tensor']

# Create session and load graph
tf_config = tf.ConfigProto()
tf_config.gpu_options.allow_growth = True
tf_sess = tf.Session(config=tf_config)
tf.import_graph_def(trt_graph, name='')

tf_input = tf_sess.graph.get_tensor_by_name(input_names[0] + ':0')
tf_scores = tf_sess.graph.get_tensor_by_name('detection_scores:0')
tf_boxes = tf_sess.graph.get_tensor_by_name('detection_boxes:0')
tf_classes = tf_sess.graph.get_tensor_by_name('detection_classes:0')
tf_num_detections = tf_sess.graph.get_tensor_by_name('num_detections:0')

Now we can make a prediction with an image and see if the model gets it right. Notice that we resized the image to 300 x 300; you can try other sizes or keep the size unmodified, since the graph can handle variable-sized input. Keep in mind, though, that memory on the Jetson Nano is quite tiny compared to a desktop machine, so it can hardly take large images.

import cv2
IMAGE_PATH = "./data/dogs.jpg"
image = cv2.imread(IMAGE_PATH)
image = cv2.resize(image, (300, 300))

scores, boxes, classes, num_detections = tf_sess.run([tf_scores, tf_boxes, tf_classes, tf_num_detections], feed_dict={
    tf_input: image[None, ...]
})
boxes = boxes[0]  # index by 0 to remove batch dimension
scores = scores[0]
classes = classes[0]
num_detections = int(num_detections[0])
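
If the output layout is new to you, printing the shapes makes it concrete; with batch size 1, after removing the batch dimension you get per-detection arrays (the comments below assume the standard SSD output format).

print(boxes.shape)     # (max_detections, 4), boxes as [ymin, xmin, ymax, xmax] in normalized coordinates
print(scores.shape)    # (max_detections,), confidence score per detection
print(classes.shape)   # (max_detections,), COCO class index per detection
print(num_detections)  # number of valid detections in the arrays above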

If you have played around with the TensorFlow Object Detection API before, those outputs should look familiar.

Here the results might still contain overlapping predictions with different class labels. For example, the same object can be labeled with two classes in two overlapping bounding boxes.

We will use a custom non-max suppression function to remove the overlapping bounding boxes with lower prediction scores.
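
The notebook ships its own implementation; the sketch below is a minimal greedy NMS, assuming boxes come as [ymin, xmin, ymax, xmax] in pixel coordinates and that we keep the highest-scoring box in each overlapping cluster.

import numpy as np

def non_max_suppression(boxes, scores, iou_threshold):
    """Return the indexes of boxes to keep, suppressing lower-scored overlaps."""
    y1, x1, y2, x2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (y2 - y1 + 1) * (x2 - x1 + 1)
    order = scores.argsort()[::-1]  # process highest score first
    pick = []
    while order.size > 0:
        i = order[0]
        pick.append(i)
        # Intersection of the picked box with every remaining box.
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        inter = np.maximum(0, yy2 - yy1 + 1) * np.maximum(0, xx2 - xx1 + 1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Drop boxes that overlap the picked box beyond the threshold.
        order = order[1:][iou <= iou_threshold]
    return pick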

Let's visualize the result by drawing bounding box and label overlays.

Here is the code to create the overlays and display the image in the Jetson Nano's notebook; the draw_label and save_image helpers it relies on are sketched after the block.

import numpy as np
from IPython.display import Image as DisplayImage

# Boxes in pixels (image coordinates).
boxes_pixels = []
for i in range(num_detections):
    # scale box to image coordinates
    box = boxes[i] * np.array([image.shape[0],
                               image.shape[1], image.shape[0], image.shape[1]])
    box = np.round(box).astype(int)
    boxes_pixels.append(box)
boxes_pixels = np.array(boxes_pixels)

# Remove overlapping boxes with non-max suppression, return picked indexes.
pick = non_max_suppression(boxes_pixels, scores[:num_detections], 0.5)


for i in pick:
    box = boxes_pixels[i]
    box = np.round(box).astype(int)
    # Draw bounding box.
    image = cv2.rectangle(
        image, (box[1], box[0]), (box[3], box[2]), (0, 255, 0), 2)
    label = "{}:{:.2f}".format(int(classes[i]), scores[i])
    # Draw label (class index and probability).
    draw_label(image, (box[1], box[0]), label)

# Save and display the labeled image.
save_image(image[:, :, ::-1])
DisplayImage(filename="./data/img.png")
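
draw_label and save_image are small helpers defined in the notebook; a minimal sketch of both is below, assuming draw_label takes an (x, y) anchor point in pixels and save_image receives an RGB image (the image[:, :, ::-1] above flips OpenCV's BGR to RGB).

import cv2
import matplotlib
matplotlib.use('Agg')  # headless backend, handy over SSH
import matplotlib.pyplot as plt

def draw_label(image, point, label, font=cv2.FONT_HERSHEY_SIMPLEX,
               font_scale=0.5, thickness=1):
    """Draw a filled background box with the label text at (x, y)."""
    size = cv2.getTextSize(label, font, font_scale, thickness)[0]
    x, y = point
    cv2.rectangle(image, (x, y - size[1]), (x + size[0], y), (255, 0, 0), cv2.FILLED)
    cv2.putText(image, label, (x, y), font, font_scale, (255, 255, 255), thickness)

def save_image(image, path="./data/img.png"):
    """Write an RGB image to disk so DisplayImage can show it."""
    plt.imsave(path, image)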

[Image: detection results with bounding boxes and class labels]

In the COCO label map, class 18 means dog and class 23 means bear. The two dogs sitting there are incorrectly classified as bears. Maybe there are more sitting bears than standing dogs in the COCO dataset.

Running a speed benchmark similar to the one in the image classification tutorial, the Jetson Nano achieves 11.54 FPS with the SSD MobileNet V1 model and a 300 x 300 input image.
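
To reproduce a rough FPS figure yourself, a simple timing loop is enough; this sketch warms up the session first so one-time TensorRT engine initialization doesn't skew the average, and your exact number will vary with power mode and memory pressure.

import time

outputs = [tf_scores, tf_boxes, tf_classes, tf_num_detections]
feed = {tf_input: image[None, ...]}

# Warm-up runs: the first few inferences include one-time setup cost.
for _ in range(10):
    tf_sess.run(outputs, feed_dict=feed)

runs = 50
start = time.time()
for _ in range(runs):
    tf_sess.run(outputs, feed_dict=feed)
print('Average FPS: {:.2f}'.format(runs / (time.time() - start)))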

If you run into out-of-memory issues, try booting the board without a monitor attached and logging into the shell over SSH, so you can save the memory otherwise consumed by the GUI.

Conclusion and further reading

In this tutorial, you learned how to convert a TensorFlow object detection model into a TensorRT inference graph and run it on the Jetson Nano.

Check out the updated GitHub repo for the source code.

If you are not satisfied with the results, there are other pre-trained models to take a look at. I recommend starting with SSD MobileNet V2 (ssd_mobilenet_v2_coco); or, if you are adventurous, try ssd_inception_v2_coco, which might push the limits of the Jetson Nano's memory.

You can find those models in the TensorFlow detection model zoo; the "Speed (ms)" column gives a rough guideline on the complexity of each model.

Thinking about training a custom object detection model with a free data center GPU? Check out my previous tutorial - How to train an object detection model easy for free.
