The Movidius Neural Compute Stick (NCS), along with other hardware devices like the UP AI Core, AIY Vision Bonnet, and the recently revealed Google Edge TPU, is gradually bringing deep learning to resource-constrained IoT devices. Are we one step closer to building the DIY hunter-killer drone you have been waiting for? The powerful GPU that used to be required for serious deep learning image processing can now be shrunk down to a far more plug-and-play size; think of it as a sort of mini neural network on the go. Stop me if this is beginning to sound a little too "Terminator" for comfort.
If you are a maker or programmer familiar with the Keras deep learning framework, odds are you can deploy a model you trained to the NCS. In this tutorial, I will show you how easy it is to train a simple MNIST Keras model and deploy it to the NCS, connected to either a PC or a Raspberry Pi.
There are several steps:
1. Train a Keras model and save its architecture and weights.
2. Convert the saved Keras model to a TensorFlow checkpoint.
3. Compile the TensorFlow checkpoint to an NCS graph file with mvNCCompile.
4. Deploy the graph file and make predictions with the NCSDK2 Python API.
Let's have a look at each of them.
Training a handwritten-digits MNIST model is the deep learning programmer's equivalent of printing "Hello world!" to the console.
Here is a Keras model that does the job just fine, with several convolutional layers followed by a final output stage. The complete train-mnist.py code is on my GitHub; here is a quick snippet to show the idea.
from keras import layers, models
from keras.datasets import mnist
from keras.utils import to_categorical

# Load MNIST, reshape to (samples, 28, 28, 1) and scale pixels to [0, 1].
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1).astype('float32') / 255
x_test = x_test.reshape(-1, 28, 28, 1).astype('float32') / 255
y_train = to_categorical(y_train, 10)

# Several conv/pooling blocks followed by a dense classifier head.
model = models.Sequential()
model.add(layers.Conv2D(16, 3, activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPool2D())
model.add(layers.Conv2D(32, 3, activation='relu'))
model.add(layers.MaxPool2D())
model.add(layers.Conv2D(64, 3, activation='relu'))
model.add(layers.MaxPool2D())
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(10, activation='softmax'))
model.summary()

model.compile(optimizer='adam', metrics=['accuracy'], loss='categorical_crossentropy')
history = model.fit(x_train, y_train, epochs=2, batch_size=128)

# Sanity-check a single test image.
test_image = x_test[0]
output = model.predict(test_image.reshape(1, 28, 28, 1))[0]
print('Keras:', output, '\nPredicted:', output.argmax())
Training takes about 3 minutes on a GPU, or longer on a CPU. By the way, if you don't have a GPU machine available right now, check out my previous tutorial on how to train your model on Google's GPU free of charge; all you need is a Gmail account.
Either way, after training, save the model and weights into two separate files like this.
with open("model.json", "w") as file:
file.write(model.to_json())
model.save_weights("weights.h5")
Alternatively, you can call model.save('model.h5', include_optimizer=False) to store everything in a single file. Setting include_optimizer to False skips the optimizer state, which is not needed for inference.
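If you go the single-file route, here is a minimal sketch of saving and loading the model back (the file name is just an example):

from keras.models import load_model

# Save architecture and weights together, without the optimizer state.
model.save('model.h5', include_optimizer=False)

# Load it back later for conversion or inference.
model = load_model('model.h5')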
Since Movidius NCSDK2 only compiles TensorFlow or Caffe models, we will peel away the Keras binding and save the underlying TensorFlow graph. The following code handles the work; let's walk through how it works in case you want to customize it in the future.
from keras.models import model_from_json
from keras import backend as K
import tensorflow as tf

model_file = "model.json"
weights_file = "weights.h5"

# Disable the learning phase BEFORE building the model, so layers
# like Dropout are baked into the graph in inference mode.
K.set_learning_phase(0)

# Load the model architecture and weights from the two files we saved.
with open(model_file, "r") as file:
    config = file.read()
model = model_from_json(config)
model.load_weights(weights_file)

# Grab the TensorFlow session Keras runs on and save it as a checkpoint.
saver = tf.train.Saver()
sess = K.get_session()
saver.save(sess, "./TF_Model/tf_model")

# Optionally write the graph out for inspection in TensorBoard.
fw = tf.summary.FileWriter('logs', sess.graph)
fw.close()
First, we turn off the learning phase, then the model is loaded the standard Keras way from the two files we saved previously. K.get_session() gives us the TensorFlow session that Keras runs on, and tf.train.Saver saves the session's graph and variables as a TensorFlow checkpoint in the TF_Model directory. Each generated file serves a different purpose: tf_model.meta holds the graph structure, tf_model.index and tf_model.data-00000-of-00000 hold the variable values, and checkpoint is a small bookkeeping file pointing to the latest checkpoint.
The mvNCCompile command-line tool that comes with the NCSDK2 toolkit converts Caffe or TensorFlow networks to graph files that can be used by the Movidius Neural Compute Platform API. During graph generation we specify the input and output nodes for mvNCCompile as TensorFlow operation names. You can list the available TensorFlow operations with sess.graph.get_operations(), as shown below.
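A quick way to find those names is to print the operations right after saving the session; a minimal sketch, run in the same script as the conversion code above:

# Print every operation name in the graph; look for the input
# placeholder (e.g. conv2d_1_input) and the final softmax node.
for op in sess.graph.get_operations():
    print(op.name)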
Finally, the compile command looks like this:
mvNCCompile TF_Model/tf_model.meta -in=conv2d_1_input -on=dense_2/Softmax
A default graph file named "graph" will be generated in the current directory.
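If you need a different output file name or want to tune performance, mvNCCompile also accepts -o for the output graph file name and -s for the number of SHAVE vector cores to use for inference (the values here are just examples):

mvNCCompile TF_Model/tf_model.meta -in=conv2d_1_input -on=dense_2/Softmax -o mnist.graph -s 12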
From here the NCSDK2 Python API takes over: it finds an NCS device, connects to it, allocates the graph to its memory, and makes a prediction.
The following code shows the essential part; input_img is the pre-processed 28x28 grayscale input image. The output is the same as Keras: ten numbers representing the classification probabilities for each of the ten digits, and we take the argmax as the predicted digit.
from mvnc import mvncapi as mvnc

# Get the first NCS device; for this program we always open the first one.
devices = mvnc.enumerate_devices()
dev = mvnc.Device(devices[0])
dev.open()

# Read a compiled network graph from file (set the path to your graph file).
with open("graph", mode='rb') as f:
    graphFileBuff = f.read()
graph = mvnc.Graph('graph1')

# Allocate the graph on the device and create input and output Fifos.
in_fifo, out_fifo = graph.allocate_with_fifos(dev, graphFileBuff)

# Write the input to the input Fifo and queue an inference in one call.
graph.queue_inference_with_fifo_elem(in_fifo, out_fifo, input_img.astype('float32'), 'user object')

# Read the result from the output Fifo.
output, userobj = out_fifo.read_elem()
print('Predicted:', output.argmax())
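When you are done, it is good practice to free the FIFO queues and the graph and close the device. A short cleanup sketch using the NCSDK2 API:

# Release device resources once inference is finished.
in_fifo.destroy()
out_fifo.destroy()
graph.destroy()
dev.close()
dev.destroy()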
The Keras model is running on the NCS now! You can call it a day here, or further enhance the demo by adding a webcam to read live images and running it on a Raspberry Pi single board computer instead of an Ubuntu PC. Check out the video demo here.
Installing NCSDK2 on the Pi may take quite a while, which is bad news for the impatient. The good news is that you can install only the essential parts of NCSDK2 on your Pi and run inference with the graph file compiled on your Ubuntu PC.
First, instead of cloning the NCSDK2 repository to your Pi, which could take quite a while, download a release zip file of NCSDK2; this saves considerable disk space since all git version-control files are skipped.
Second, skip the TensorFlow and Caffe installation during the NCSDK2 installation by modifying ncsdk.conf:
INSTALL_CAFFE=no
INSTALL_TENSORFLOW=no
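If you prefer flipping those flags from the command line before running the installer, a quick sketch, assuming the stock ncsdk.conf ships with both set to yes:

sed -i 's/^INSTALL_CAFFE=yes/INSTALL_CAFFE=no/' ncsdk.conf
sed -i 's/^INSTALL_TENSORFLOW=yes/INSTALL_TENSORFLOW=no/' ncsdk.conf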
Running a live webcam requires OpenCV 3; running the following four lines in a terminal does the job on your Pi.
sudo pip3 install opencv-python==3.3.0.10
sudo apt-get update
sudo apt-get install libqtgui4
sudo apt-get install python-opencv
Once the NCSDK2 and OpenCV 3 installations are done, copy the graph file you generated onto your Pi. Just remember that since we skipped quite a lot of stuff, the mvNC* commands (such as mvNCCompile) will not run on your Pi, since they depend on the Caffe and TensorFlow installations.
The MNIST model was trained to recognize white handwritten digits on a black background in 28x28 grayscale images, so some pre-processing is necessary to turn a captured webcam frame into that format.
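A minimal sketch of such a pre-processing step with OpenCV (my actual version lives in ImageProcessor.py; the function name and the exact invert/resize choices here are illustrative):

import cv2
import numpy as np

def preprocess_frame(frame):
    # Convert the BGR webcam frame to grayscale.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Invert so the digit is white on a black background, like MNIST.
    inverted = cv2.bitwise_not(gray)
    # Resize to the 28x28 resolution the model expects.
    small = cv2.resize(inverted, (28, 28))
    # Scale pixel values to [0, 1] to match training.
    return (small / 255.0).astype(np.float32)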
With that in mind, a similar implementation with Python OpenCV 3 can be found in the file ImageProcessor.py. To wrap up the webcam demo: for each frame captured, we pass it through the image pre-processing function, then feed it to the NCS graph, which returns the final prediction probabilities as before. From there, we draw the predicted result as an overlay on the image shown on the display.
Now you have deployed a Keras model to the NCS. Keep in mind that since the NCS was built as a "vision processing unit", it supports convolutional layers along with some others, while recurrent neural network layers like LSTM and GRU might not work on the NCS. In our demo we told mvNCCompile to take the final classification node as the output, but it is also possible to use an intermediate layer as the output node, effectively using the model as a feature extractor; that is similar to how the NCS FaceNet facial verification demo works.
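For example, to use the 256-unit dense layer of our model as a feature extractor, you could point -on at that layer's operation instead (the node name dense_1/Relu is a guess based on Keras's default naming; verify it with sess.graph.get_operations()):

mvNCCompile TF_Model/tf_model.meta -in=conv2d_1_input -on=dense_1/Relu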
Some useful resources:
- Build a DIY security camera with neural compute stick series