Ever wonder how paint colors are named? “Princess ivory”, “Bull cream.” And what about “Keras red”? It turns out that people are making a living naming those colors. In this post, I’m going to show you how to build a simple deep learning model to do something similar — give the model a color name as input, and have the model propose an RGB color that matches the name.
This post is beginner friendly. I will introduce you to the basic concepts of processing text data with deep learning.
Overview
Let’s take a look at the big picture of what we’re going to build.
There are two general options for language modeling: word level models and character level models. Each has its own advantages and disadvantages. Let’s go through them now.
The word level language model can handle relatively long and clean sentences. By “clean”, I mean the words in the text datasets are free from typos and have few words outside of the English vocabulary. The word level language model encodes each unique word into a corresponding integer, and there’s a predefined fixed-size vocabulary dictionary to look up the word-to-integer mapping. One major benefit of the word level language model is its ability to leverage pre-trained word embeddings such as Word2Vec or GloVe. These embeddings represent words as vectors with useful properties: words that appear in similar contexts end up close to each other in the embedding space, and the vectors can be used to reason about analogies like “man is to woman as king is to queen”.
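To make the word-to-integer mapping concrete, here is a minimal sketch of word level tokenization with the Keras Tokenizer. The sentences and variable names are made up purely for illustration, and the exact integers assigned can vary:

from tensorflow.python.keras.preprocessing.text import Tokenizer

# Hypothetical sentences, just to illustrate word level tokenization
sentences = ["deep sky blue", "dark sky blue", "light sea green"]

word_tokenizer = Tokenizer()            # char_level defaults to False, i.e. word level
word_tokenizer.fit_on_texts(sentences)  # build the word -> integer vocabulary

print(word_tokenizer.word_index)
# e.g. {'sky': 1, 'blue': 2, 'deep': 3, 'dark': 4, 'light': 5, 'sea': 6, 'green': 7}
print(word_tokenizer.texts_to_sequences(sentences))
# e.g. [[3, 1, 2], [4, 1, 2], [5, 6, 7]]

Each unique word gets its own integer, so the vocabulary dictionary grows with every new word the model has to know about.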
But there’s an even simpler language model, one that splits a text string into characters and associates a unique integer with every single character. There are a few reasons you might choose the character level language model over the more popular word level model: the vocabulary is tiny (one entry per character), there are no out-of-vocabulary words, and it handles rare or made-up words gracefully — which color names are full of.
You may also be aware of the limitations that come with adopting a character level language model: the input sequences become much longer, and the model generally needs more training data and time to learn how characters combine into meaningful words.
Fortunately, these limitations won’t pose a threat to our color generation task. We’re limiting our color names to 25 characters in length and we only have 14157 training samples.
We mentioned that we’re limiting our color names to 25 characters. To arrive at this number, we checked the distribution of the length of color names across all training samples and visualized it to make sure the length limit we picked makes sense.
import numpy as np
import scipy.stats as stats
import pylab as plt

# Sort the lengths of all color names
h = sorted(names.str.len().to_numpy())
# Fit a normal distribution to the name lengths
fit = stats.norm.pdf(h, np.mean(h), np.std(h))
plt.plot(h, fit, '-o')
plt.hist(h, density=True)  # draw a histogram of the name lengths
plt.xlabel('Chars')
plt.ylabel('Probability density')
plt.show()
That gives us this plot, and you can clearly see that the majority of the color name strings have lengths less than or equal to 25, even though the max length goes up to 30.
We could pick a max length of 30 in our case, but then the model we’re going to build would have to be trained on longer sequences for a longer time. Picking the shorter sequence length reduces training complexity without compromising the integrity of the training data.
With the max length decided, the next step in the character level data pre-processing is to transform each color name string into a list of 25 integer values, and this is made easy by the Keras text tokenization utility.
from tensorflow.python.keras.preprocessing.text import Tokenizer
from tensorflow.python.keras import preprocessing

maxlen = 25
t = Tokenizer(char_level=True)
t.fit_on_texts(names)
tokenized = t.texts_to_sequences(names)
padded_names = preprocessing.sequence.pad_sequences(tokenized, maxlen=maxlen)
Right now padded_names will have the shape (14157, 25), where 14157 is the total number of training samples and 25 is the max sequence length. If a string has fewer than 25 characters, it will be padded at the front with the value 0.
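To see the padding in action, we can run a quick sanity check on a short name (the exact integers depend on the fitted tokenizer’s character mapping, shown below):

# A short name such as "red" is left-padded with zeros up to length 25
example = t.texts_to_sequences(["red"])
padded_example = preprocessing.sequence.pad_sequences(example, maxlen=maxlen)
print(padded_example)
# e.g. [[0 0 0 ... 0 3 1 13]]  (22 zeros followed by the integers for 'r', 'e', 'd')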
You might be thinking: all inputs are now in the form of integers, and our model should be able to process them. But there is one more step we can take to make later model training more effective.
We can view the character to integer mapping by inspecting the t.word_index property of the instance of Keras’ Tokenizer.
{' ': 4, 'a': 2, 'b': 18, 'c': 11, 'd': 13, 'e': 1, 'f': 22, 'g': 14, 'h': 16, 'i': 5, 'j': 26, 'k': 21, 'l': 7, 'm': 17, 'n': 6, 'o': 8, 'p': 15, 'q': 25, 'r': 3, 's': 10, 't': 9, 'u': 12, 'v': 23, 'w': 20, 'x': 27, 'y': 19, 'z': 24}
The integer values have no natural ordering relationship with each other, so our model can’t harness any benefit from them directly. What’s worse, a model fed raw integers would implicitly assume such an ordering among the characters (e.g. “a” is 2 and “e” is 1, but that should not signify any relationship), which can lead to unwanted results. We will use one-hot encoding to represent the input sequence instead.
Each integer will be represented by a boolean array in which only one element has the value 1. The max integer value in the character dictionary determines the length of that array.
In our case, the max integer value is ‘x’: 27, so the length of a one-hot boolean array will be 28 (since the lowest value, 0, is reserved for padding).
For example, instead of using the integer value 2 to represent character ‘a’, we’re going to use one-hot array [0, 0, 1, 0 …….. 0].
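As a quick illustration (a sketch only, not part of the actual pipeline), here is how that one-hot array for ‘a’ could be built by hand:

import numpy as np

num_tokens = 28          # 27 characters plus the padding value 0
a_index = 2              # 'a' maps to 2 in the tokenizer's word_index

one_hot_a = np.zeros(num_tokens, dtype=int)
one_hot_a[a_index] = 1   # only position 2 is set to 1
print(one_hot_a)
# [0 0 1 0 0 ... 0]  (length 28)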
One-hot encoding is also accessible in Keras.
from keras.utils import np_utils
one_hot_names = np_utils.to_categorical(padded_names)
The resulting one_hot_names has the shape (14157, 25, 28), which stands for (# of training samples, max sequence length, # of unique tokens)
Remember we’re predicting 3 color channel values, each ranging between 0–255. There is no golden rule for data normalization; it is purely practical, because a model can take forever to converge if the training data values are spread out too widely. A common normalization technique is to scale values to [-1, 1]. In our case, we’ll simply scale the RGB values to 0–1 by dividing them by 255.
# The RGB values are between 0 - 255
# scale them to be between 0 - 1
def norm(value):
    return value / 255.0

normalized_values = np.column_stack([norm(data["red"]), norm(data["green"]), norm(data["blue"])])
To build our model we’re going to use two types of neural network layers: recurrent (LSTM) layers to read the character sequence, and fully connected (Dense) layers to map the result to the three RGB values.
In recurrent neural networks such as the LSTM, the output at each time step depends not only on the current input but also on what the network has seen earlier in the sequence, which makes them a natural fit for sequential data like our character sequences.
The easiest way to build a deep learning model in Keras is to use its sequential API: we simply connect each of the neural network layers by calling the model.add() function, like connecting LEGO bricks.
from tensorflow.python.keras.models import Sequential
from tensorflow.python.keras.layers import Dense, Dropout, LSTM, Reshape
model = Sequential()
model.add(LSTM(256, return_sequences=True, input_shape=(maxlen, 28)))
model.add(LSTM(128))
model.add(Dense(128, activation='relu'))
model.add(Dense(3, activation='sigmoid'))
model.compile(optimizer='adam', loss='mse', metrics=['acc'])
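A quick way to sanity check the architecture is to print the model summary; the exact parameter counts and layer names depend on your Keras version, but the output shapes should look roughly like this:

model.summary()
# Expected output shapes (roughly):
#   lstm:    (None, 25, 256)   first LSTM returns the full sequence
#   lstm_1:  (None, 128)       second LSTM returns only its final state
#   dense:   (None, 128)
#   dense_1: (None, 3)         the three RGB values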
Training the model couldn’t be easier: just call the model.fit() function. Notice that we’re reserving 10% of the samples for validation purposes. If the model achieves great accuracy on the training set but much lower accuracy on the validation set, it’s likely overfitting. You can find more information about dealing with overfitting in my other blog post: Two Simple Recipes for Over Fitted Model.
history = model.fit(one_hot_names, normalized_values,
epochs=40,
batch_size=32,
validation_split=0.1)
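One way to spot overfitting, as mentioned above, is to compare the training and validation loss across epochs. Here is a minimal sketch using the history object returned by model.fit():

# plt is available from the earlier `import pylab as plt`
# A widening gap between the two curves suggests overfitting.
plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('Epoch')
plt.ylabel('MSE loss')
plt.legend()
plt.show()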
Let’s define some functions to generate and show the color predicted.
For a color name input, we need to transform it into the same one-hot representation. To achieve this, we tokenize characters to integers with the same tokenizer with which we processed the training data, pad it to the max sequence length of 25, then apply the one-hot encoding to the integer sequence.
And for the output RGB values, we need to scale them back to 0–255 so we can display the color correctly.
# plot a color image
def plot_rgb(rgb):
    data = [[rgb]]
    plt.figure(figsize=(2, 2))
    plt.imshow(data, interpolation='nearest')
    plt.show()

def scale(n):
    return int(n * 255)

def predict(name):
    name = name.lower()
    tokenized = t.texts_to_sequences([name])
    padded = preprocessing.sequence.pad_sequences(tokenized, maxlen=maxlen)
    one_hot = np_utils.to_categorical(padded, num_classes=28)
    pred = model.predict(np.array(one_hot))[0]
    r, g, b = scale(pred[0]), scale(pred[1]), scale(pred[2])
    print(name + ',', 'R,G,B:', r, g, b)
    plot_rgb(pred)
Let's give the predict() function a try.
predict("tensorflow orange")
predict("forest")
predict("keras red")
“keras red” looks a bit darker than the one we’re familiar with, but that’s what the model proposed.
In this post, we talked about how to build a Keras model that can take any color name and come up with an RGB color value. More specifically, we looked at how to apply one-hot encoding to a character level language model, and how to build a neural network model with LSTM and Dense layers that maps a color name to its RGB value.
Here’s a diagram to summarize what we have built in the post, starting from the bottom and showing every step of the data flow.
If you’re new to deep learning or the Keras library, there are some great resources that are easy and fun to read or experiment with.
TensorFlow playground: an interactive visualization of neural networks run on your browser.
Coursera deep learning course: learn the foundations of deep learning and lots of practical advice.
Keras get started guide: the official guide for the user-friendly, modular Python deep learning library.
Also, check out the source code for this post in my GitHub repo.