TL;DR: This post will get you started with the Keras deep learning framework without installation hassles. I will show you how easy it is to run your code in the cloud for free.
I know there are lots of tutorials out there to get you up and running with deep learning in Keras. They normally go with an image classifier for the MNIST handwritten digits or cat/dog classification. Here I want to take a different and maybe more interesting approach: showing you how to build a model that can recommend a place you might be interested in, given a source image you like.
Let's get started!
If you come from a general programming background, you have probably suffered the pain of getting started with something new.
Installing an IDE, library dependencies, hardware driver support... All of these might have cost you a lot of time before the first successful run of "Hello World!".
Deep learning is the same: it depends on lots of things to make the model work. For example, to train and run a deep learning model faster, you need a graphics card. Setting one up can easily take a beginner several hours, let alone choosing and purchasing the card itself, which can be quite costly.
Today you can skip most of that initial setup. It is now possible to run your code entirely in the cloud with all necessary dependencies pre-installed for you. More importantly, you can run your model faster on a graphics card for free.
At this point, I'd like to introduce Google Colab. I find it very useful for sharing my deep learning code with others, who can then reproduce the result in a matter of seconds.
All you need is a Gmail account and an internet connection. The heavy lifting computation will be handled by Google Colab servers.
You will need to get comfortable with the Jupyter notebook environment on Colab, which is quite easy: you tap the play button at the left side of a cell to run the code inside. You can run a cell multiple times if you want.
Let's supercharge the running speed of a deep learning model by activating the GPU on Colab.
Click on the "Runtime" menu button, then "Change runtime type", choose GPU in the "Hardware accelerator" dropdown list.
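If you want to confirm that the GPU is really active, a quick check like the one below should do. This is only a sketch: it assumes TensorFlow is available in the runtime (it is pre-installed on Colab as far as I know), and the helper name `gpu_device_name` is my own.

```python
def gpu_device_name():
    """Return the GPU device name TensorFlow sees, or '' if there is none."""
    try:
        import tensorflow as tf
    except ImportError:
        return ""  # TensorFlow is not installed in this environment
    return tf.test.gpu_device_name()

name = gpu_device_name()
print("GPU found:", name if name else "none")
```

An empty result usually means the runtime type is still set to "None" in the "Hardware accelerator" dropdown.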
We are ready for the journey! Now buckle up since we are going to enter the wild west of the deep learning world.
The model we are introducing can tell which places an image shows.
Described more formally, the input of the model is the image data and the output is a list of places with different probabilities. The higher the probability, the more likely the image contains the corresponding scene/place.
The model can classify 365 different places, including coffee shop, museum, outdoor, etc.
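To make "a list of places with different probabilities" concrete, here is a tiny NumPy sketch of the softmax step that turns raw scores into probabilities. The three labels and their scores are made up for illustration; the real model produces 365 values.

```python
import numpy as np

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Hypothetical raw scores for three made-up place labels.
labels = ["coffee shop", "museum", "beach"]
scores = np.array([1.0, 0.5, 3.0])

probs = softmax(scores)
# Print the places from most to least likely.
for name, p in sorted(zip(labels, probs), key=lambda pair: -pair[1]):
    print(f"{name}: {p:.3f}")
```

The place with the largest raw score ends up with the largest probability, and all probabilities add up to 1.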
Here is the Colab notebook for this tutorial; you can experiment with it while reading this article: Keras_travel_place_recommendation-part1.ipynb
The most important building block of our model is the convolutional network, which extracts image features, from general low-level features like edges and corners to more domain-specific high-level features like patterns and parts.
The model will have several blocks of convolutional networks stacked one over another. The deeper the convolutional layer, the more abstract and higher level features it extracts.
Here is an image showing the idea.
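If you prefer code to pictures, the sketch below shows how a single convolution filter picks up an edge. It uses plain NumPy on a toy 6x6 image, and the kernel is a hand-written Sobel-like filter chosen for illustration; a trained network learns its own kernels.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding
    (cross-correlation, which is what deep learning libraries compute)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 6x6 grayscale image: dark left half (0), bright right half (1).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-written vertical-edge kernel (Sobel-like).
kernel = np.array([[-1, 0, 1],
                   [-2, 0, 2],
                   [-1, 0, 1]], dtype=float)

feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

The feature map is zero over the flat regions and non-zero only where the window straddles the dark-to-bright boundary, which is exactly the "edge" feature the first block is said to extract.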
Enough with the intuition behind convolutional networks. Let's get our hands dirty by building a model.
It is really easy to build a custom deep learning model with the Keras framework. Keras is designed for human beings, not machines. It is also an official high-level API for TensorFlow, the most popular deep learning library. If you are just getting started and looking for a deep learning framework, Keras is the right choice.
Don't panic if this is your first time seeing Keras model code like the one below. It is actually quite simple to understand. The model has several blocks of convolutional layers. As explained earlier, each block extracts a different level of image features. For example, "Block 1" sits at the input level and extracts entry-level features like edges and corners. The deeper it goes, the more abstract the features each block extracts. You will also notice the final classification block: two fully connected Dense layers of 4096 units followed by a 365-way softmax layer, which together make the final prediction.
    from keras.models import Sequential
    from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
    from keras.optimizers import SGD

    model = Sequential()

    # Block 1
    # input_shape is channels-first (3, 224, 224), so Keras must be configured
    # with image_data_format set to 'channels_first'.
    model.add(Conv2D(64, (3, 3), input_shape=(3, 224, 224), activation='relu', padding='same', name='block1_conv1'))
    model.add(Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    # Block 2
    model.add(Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1'))
    model.add(Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    # Block 3
    model.add(Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1'))
    model.add(Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2'))
    model.add(Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    # Block 4
    model.add(Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1'))
    model.add(Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2'))
    model.add(Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    # Block 5
    model.add(Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1'))
    model.add(Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2'))
    model.add(Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    # Classification block
    model.add(Flatten())
    model.add(Dense(4096, activation='relu', name='fc1'))
    model.add(Dropout(0.5))
    model.add(Dense(4096, activation='relu', name='fc2'))
    model.add(Dropout(0.5))
    model.add(Dense(365, activation='softmax'))

    # Load the pre-trained model weights.
    model.load_weights('models/places/places_vgg_keras.h5')

    # Compile the model (newer Keras versions spell `lr` as `learning_rate`).
    sgd = SGD(lr=0.1, decay=1e-6, momentum=0.9, nesterov=True)
    model.compile(optimizer=sgd, loss='categorical_crossentropy')
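To see how the image shrinks as it travels through the five blocks, here is a small pure-Python sketch that traces the feature map size. It assumes what the model code uses: 'same'-padded 3x3 convolutions, which keep height and width, and 2x2 max pooling with stride 2, which halves them. The helper name `trace_vgg_shapes` is my own.

```python
def trace_vgg_shapes(size=224, filters=(64, 128, 256, 512, 512)):
    """Return (block_name, channels, height, width) after each pooling step."""
    shapes = []
    for block, channels in enumerate(filters, start=1):
        # 'same'-padded 3x3 convolutions keep height and width unchanged;
        # the 2x2 max pooling with stride 2 halves them.
        size //= 2
        shapes.append((f"block{block}", channels, size, size))
    return shapes

shapes = trace_vgg_shapes()
for name, c, h, w in shapes:
    print(name, (c, h, w))

_, c, h, w = shapes[-1]
flattened = c * h * w  # number of values Flatten() hands to the first Dense layer
print("flattened features:", flattened)
```

The spatial size goes 224, 112, 56, 28, 14, 7, so the last block outputs 512 feature maps of 7 x 7, and Flatten() passes 25088 values to the first Dense layer.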
Let's have the model predict labels for an image.
The model expects a fixed input shape of 224 x 224 pixels with three color channels (RGB). But what if we have an image with a different resolution? Keras has some helper functions that come in handy.
The code below loads and resizes an image, turns it into a data array, and normalizes the values before feeding it to the model.
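    import numpy as np
    from keras.preprocessing import image
    from keras.applications.imagenet_utils import preprocess_input

    # img_path is the path to the source image you want to classify
    img = image.load_img(img_path, target_size=(224, 224))  # resize to the model's input size
    x = image.img_to_array(img)      # image -> float array
    x = np.expand_dims(x, axis=0)    # add the batch dimension
    x = preprocess_input(x)          # normalize the pixel values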
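For the curious, here is a rough NumPy-only sketch of what those last two steps do to the array. To my knowledge, `preprocess_input` in its default 'caffe' mode flips the channels from RGB to BGR and subtracts the ImageNet per-channel means without scaling. The function `caffe_style_preprocess` below is a hypothetical stand-in written for illustration, not the Keras implementation, and it assumes a channels-first batch and a fake random image.

```python
import numpy as np

# ImageNet channel means in BGR order, as used by the 'caffe' preprocessing mode.
BGR_MEANS = np.array([103.939, 116.779, 123.68])

def caffe_style_preprocess(batch):
    """Hypothetical stand-in for preprocess_input on a channels-first batch."""
    batch = batch[:, ::-1, :, :].astype("float64")  # RGB -> BGR along the channel axis
    return batch - BGR_MEANS.reshape(1, 3, 1, 1)    # zero-center each channel

# Fake channels-first "image" filled with random pixel values in [0, 255].
rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=(3, 224, 224)).astype("float64")

x = np.expand_dims(x, axis=0)  # add the batch dimension
print(x.shape)
x = caffe_style_preprocess(x)
print(x.shape)
```

Both prints show a four-dimensional shape: the model always works on a batch of images, even when the batch holds a single one.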
Then we feed the processed array of shape (1, 3, 224, 224) to the model; the leading 1 is the batch dimension. The 'preds' variable holds 365 floating point numbers, one probability for each of the 365 places/scenes.
Take the top 5 predictions and map their indexes to the actual names of places/scenes.
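    preds = model.predict(x)[0]               # predict returns one row per image; take the first
    top_preds = np.argsort(preds)[::-1][0:5]  # indexes of the 5 highest probabilities
    # labels holds the 365 place names (assumed to be loaded earlier in the notebook)
    results = []
    for i in top_preds:
        results.append(labels[i])
    print(results)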
And here is the result.
['beach', 'lagoon', 'coast', 'ocean', 'islet']
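One detail worth spelling out: `model.predict` returns one row per input image, so the ranking has to be done on a single row. The toy example below uses a made-up prediction row with six hypothetical classes to show the argsort trick in isolation.

```python
import numpy as np

# Made-up batch of predictions: one image, six hypothetical classes.
labels = ["beach", "museum", "lagoon", "coffee shop", "coast", "ocean"]
preds = np.array([[0.40, 0.02, 0.25, 0.03, 0.20, 0.10]])

row = preds[0]                   # drop the batch dimension first
top = np.argsort(row)[::-1][:3]  # indexes of the 3 largest values, best first
print([(labels[i], float(row[i])) for i in top])
```

`np.argsort` sorts in ascending order, so reversing with `[::-1]` puts the most likely places first before slicing off the top entries.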
Feel free to try with other images.
We have seen how easy it is to get a deep learning model that predicts places/scenes up and running quickly with Google Colab. In the second part of the tutorial, I am going to show you how to extract raw features from images and use them to build a travel recommendation engine.
In the meantime, check out some resources that might be helpful.
If you want to upload your custom images to Colab, read the section "Predict with Custom Images" in one of my previous posts.