How to run Keras model on Jetson Nano in Nvidia Docker container


A while back I wrote "How to run Keras model on Jetson Nano", where the model runs on the host OS. In this tutorial, I will show you how to start fresh and get the model running on Jetson Nano inside an Nvidia Docker container.

You might wonder why you should bother with Docker on Jetson Nano. I came up with several reasons.

1. It's much easier to reproduce the results with a Docker container than by installing the dependencies/libraries all by yourself. The Docker image you pull from Docker Hub has all dependencies preinstalled, which saves you tons of time building from source.

2. It's less likely to mess up the Jetson Nano host OS since your code and dependencies are isolated from it. Even when you get into trouble, the fix is just a matter of starting a new container.

3. You can build your applications on top of my base image with TensorFlow preinstalled in a much more controllable way by creating a new Dockerfile.

4. You can cross-compile the Docker image on a much more powerful computer such as an X86-based server, which saves valuable time.

5. Finally, you guessed it, running code in a Docker container is almost as speedy as running on the host OS, with GPU acceleration available.

I hope you are convinced. Here is a brief overview of how to make it happen.

  • Install new JetPack 4.2.1 on Jetson Nano.
  • Cross-compiling Docker build setup on an X86 machine.
  • Build a Jetson Nano Docker image with TensorFlow GPU.
  • Build an overlay Docker image (optional).
  • Run the frozen Keras TensorRT model in a Docker container.

Install new JetPack 4.2.1 on Jetson Nano

Download the JetPack 4.2.1 SD card image from Nvidia. Extract the sd-blob-b01.img file from the zip. Flash it to a class 10 SD card of at least 32GB with Rufus. The SD card I have is a SanDisk class 10 U1 64GB model.


You can try another flasher like Etcher, but the SD card I flashed with Etcher could not boot on Jetson Nano. I also tried installing JetPack with SDK Manager but ran into an issue with the "System configuration wizard". Here is the thread I opened in the Nvidia Developer forum; their technical support is quite responsive.

Insert the SD card, plug in an HDMI monitor cable, USB keyboard, and mouse, then power up the board. Follow the system configuration wizard to finish the system configuration.

Cross-compiling Docker build setup on an X86 machine

The Nvidia Docker runtime is pre-installed on the OS, which allows you to build a Docker image right on the hardware. However, cross-compiling on an X86-based machine can save a significant amount of build time thanks to its greater processing power and network speed, so the one-time setup of a cross-compiling environment is well worth it. A Docker image will be built on the server, pushed to a Docker registry such as Docker Hub, then pulled from Jetson Nano.

On your X86 machine, which could be your laptop or a Linux server, install Docker first following the official instructions.

Then install qemu from the command line. qemu will emulate the Jetson Nano CPU architecture (which is aarch64) on your X86 machine when building Docker images.

sudo apt-get install -y qemu binfmt-support qemu-user-static
wget http://archive.ubuntu.com/ubuntu/pool/main/b/binfmt-support/binfmt-support_2.1.8-2_amd64.deb
sudo apt install ./binfmt-support_2.1.8-2_amd64.deb
rm binfmt-support_2.1.8-2_amd64.deb
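
Before building, it can help to confirm that the aarch64 emulation handler is actually registered with the kernel. The following is a minimal sketch, not part of the original setup. It assumes the handler is exposed as `/proc/sys/fs/binfmt_misc/qemu-aarch64` (the name the Ubuntu qemu-user-static package normally registers) and that the first line of that file reads `enabled`. The function name `binfmt_status` is hypothetical.

```python
from pathlib import Path


def binfmt_status(name="qemu-aarch64", root="/proc/sys/fs/binfmt_misc"):
    """Return 'enabled', 'disabled', or 'missing' for a binfmt_misc handler.

    Assumption: the handler file's first line is 'enabled' when it is active.
    """
    entry = Path(root) / name
    if not entry.is_file():
        return "missing"
    lines = entry.read_text().splitlines()
    first_line = lines[0].strip() if lines else ""
    return "enabled" if first_line == "enabled" else "disabled"


if __name__ == "__main__":
    # On a correctly configured X86 build machine this is expected to report 'enabled'.
    print("qemu-aarch64 handler:", binfmt_status())
```

If the result is "missing", the qemu-user-static and binfmt-support packages are probably not installed correctly, and the aarch64 build steps are likely to fail with an "exec format error".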

Finally, install podman. We will use it to build container images instead of the default Docker command-line interface.

sudo apt update
sudo apt -y install software-properties-common
sudo add-apt-repository -y ppa:projectatomic/ppa
sudo apt update
sudo apt -y install podman

Build a Jetson Nano Docker image with TensorFlow GPU

We build our TensorFlow GPU Docker image based on the official nvcr.io/nvidia/l4t-base:r32.2 image.

Here is the content of the Dockerfile.

FROM nvcr.io/nvidia/l4t-base:r32.2
WORKDIR /
RUN apt update && apt install -y --fix-missing make g++
RUN apt update && apt install -y --fix-missing python3-pip libhdf5-serial-dev hdf5-tools
RUN apt update && apt install -y python3-h5py
RUN pip3 install --pre --no-cache-dir --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v42 tensorflow-gpu
RUN pip3 install -U numpy
CMD [ "bash" ]

Then you can pull the base image, build and push the container image to Docker Hub like this. 

podman pull nvcr.io/nvidia/l4t-base:r32.2
podman build -v /usr/bin/qemu-aarch64-static:/usr/bin/qemu-aarch64-static -t docker.io/zcw607/jetson:0.1.0 . -f ./Dockerfile
podman push docker.io/zcw607/jetson:0.1.0

Change zcw607 to your own Docker Hub account name as necessary. You might have to log in to the registry first, for example with podman login docker.io (or docker login docker.io if you push with Docker), before you can push.
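
Since the account name appears in several commands, a small helper can keep them consistent. This is only a sketch under assumptions: the function `build_commands` and the placeholder account `your-account` are hypothetical, and the script prints the commands rather than running them, so you can review them first. The podman arguments are copied from the commands above.

```python
import shlex

# Paths and image names taken from the commands in this tutorial.
QEMU = "/usr/bin/qemu-aarch64-static"
BASE_IMAGE = "nvcr.io/nvidia/l4t-base:r32.2"


def build_commands(account, tag="0.1.0", dockerfile="./Dockerfile"):
    """Return the pull, build and push commands as argument lists."""
    image = f"docker.io/{account}/jetson:{tag}"
    return [
        ["podman", "pull", BASE_IMAGE],
        # Mount the qemu binary so aarch64 programs can run during the build.
        ["podman", "build", "-v", f"{QEMU}:{QEMU}", "-t", image, ".", "-f", dockerfile],
        ["podman", "push", image],
    ]


if __name__ == "__main__":
    # 'your-account' is a placeholder for your Docker Hub account name.
    for cmd in build_commands("your-account"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Each returned list can also be passed to `subprocess.run(cmd, check=True)` if you prefer to execute the steps from Python.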

Build an overlay Docker image (optional)

By building an overlay Docker image, you can add your code dependencies/libraries on top of a previous Docker image.

For example, if you want to install the Python Pillow library and set the time zone, you can create a new Dockerfile like this.

FROM zcw607/jetson:0.1.0
WORKDIR /home
ENV TZ=Asia/Hong_Kong
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone && \
    apt update && apt install -y python3-pil
CMD [ "bash" ]

Then run these two commands to build and push the new image.

podman build -v /usr/bin/qemu-aarch64-static:/usr/bin/qemu-aarch64-static -t docker.io/zcw607/jetson:r1.0.1 . -f ./Dockerfile
podman push docker.io/zcw607/jetson:r1.0.1

Now that your two Docker images reside in Docker Hub, let's sync up on Jetson Nano.

Run TensorRT model in a Docker container

On the Jetson Nano command line, pull the Docker image from Docker Hub like this.

docker pull docker.io/zcw607/jetson:r1.0.1

Then start the container with the following command.

docker run --runtime nvidia --network host -it -e DISPLAY=$DISPLAY -v /tmp/.X11-unix/:/tmp/.X11-unix zcw607/jetson:r1.0.1

To check that TensorFlow GPU is installed, type "python3" in the container's command line, then run the following.

from tensorflow.python.client import device_lib
device_lib.list_local_devices()

If everything works, the printed device list should include a GPU device in addition to the CPU.
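
If you would rather test for the GPU in a script than read the list by eye, here is a minimal sketch. The helper names `gpu_device_names` and `local_gpu_device_names` are hypothetical. It assumes each entry returned by `device_lib.list_local_devices()` has `name` and `device_type` attributes, and that a usable GPU is reported with `device_type` equal to "GPU".

```python
def gpu_device_names(devices):
    """Return the names of all entries whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def local_gpu_device_names():
    """Query TensorFlow for local devices and keep only the GPUs.

    TensorFlow is imported here so the helper above stays usable without it.
    """
    from tensorflow.python.client import device_lib

    return gpu_device_names(device_lib.list_local_devices())


if __name__ == "__main__":
    try:
        names = local_gpu_device_names()
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        print("GPU devices:", names if names else "none found")
```

An empty list inside the container usually means it was started without `--runtime nvidia`.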

To run the TensorRT model inference benchmark, use my Python script. The model is converted from the Keras MobileNet V2 model for image classification. It achieves 30 FPS with 244 by 244 color image input. That is running in a Docker container, and it is even slightly faster than the 27.18 FPS I measured without a Docker container.
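
For readers who want to see how an FPS figure like this can be measured, here is a generic timing sketch. It is not the benchmark script mentioned above. The function `measure_fps` and its parameters are hypothetical, and `infer` stands for any callable that runs one inference, such as a function wrapping a TensorFlow session run on a single image.

```python
import time


def measure_fps(infer, n_warmup=5, n_runs=50):
    """Call infer() repeatedly and return the average frames per second.

    Warm-up calls are excluded from the timing because the first few
    inferences are often slower while the runtime initializes.
    """
    if n_runs <= 0:
        raise ValueError("n_runs must be positive")
    for _ in range(n_warmup):
        infer()
    start = time.perf_counter()
    for _ in range(n_runs):
        infer()
    elapsed = time.perf_counter() - start
    return n_runs / elapsed


if __name__ == "__main__":
    # Stand-in workload: sleeping about 10 ms imitates one inference call.
    fps = measure_fps(lambda: time.sleep(0.01))
    print(f"{fps:.2f} FPS")
```

Averaging over many runs after a warm-up keeps one-off start-up costs from distorting the comparison between the container and the host.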


Read my previous blog post to learn more about how to create your TensorRT model from Keras.

Conclusion and further reading

This tutorial shows the complete process to get a Keras model running on Jetson Nano inside an Nvidia Docker container. You also learned how to build a Docker image on an X86 machine, push it to Docker Hub and pull it from Jetson Nano. Check out my GitHub repo for the updated Dockerfile, build script and inference benchmark script.
