Easy real-time gender and age prediction from webcam video with Keras


crowd faces

Have you ever been in a situation where you had to guess another person's age? Maybe this simple neural network model can do the work for you.

The demo you will be running in a second takes a live video stream from the webcam and tags each face it finds with an age and gender. Imagine how cool it could be to place one such webcam at, say, your front door to get an overview of your visitors' age and gender statistics.

age gender demo

I ran this model on my Windows PC with Python 3.5. It's possible to run it on other operating systems as well.

How it works

Let's start with a general overview of how it works.

pipeline

First, the cv2 module grabs a frame from the live webcam stream.

# 0 means the default video capture device in OS
video_capture = cv2.VideoCapture(0)
# infinite loop, break by key ESC
while True:
    if not video_capture.isOpened():
        sleep(5)
    # Capture frame-by-frame
    ret, frame = video_capture.read()

Second, we convert the image to grayscale and use the cv2 module's CascadeClassifier class to detect faces in the image.

gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
faces = face_cascade.detectMultiScale(
    gray,
    scaleFactor=1.2,
    minNeighbors=10,
    minSize=(self.face_size, self.face_size)
)

The variable faces returned by the detectMultiScale method is a list of detected face rectangles, each in the form [x, y, w, h].

Once we know the faces' coordinates, we need to crop those faces before feeding them to the neural network model.

We add a 40% margin to the face area so that the full head is included.

# placeholder for cropped faces
face_imgs = np.empty((len(faces), self.face_size, self.face_size, 3))
for i, face in enumerate(faces):
    face_img, cropped = self.crop_face(frame, face, margin=40, size=self.face_size)
    (x, y, w, h) = cropped
    cv2.rectangle(frame, (x, y), (x + w, y + h), (255, 200, 0), 2)
    face_imgs[i,:,:,:] = face_img
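
The crop_face helper is not shown in this post, so here is a minimal sketch of how the margin step could work. The function name expand_box is hypothetical, and two choices in it are assumptions rather than the repo's actual behavior: the margin is taken as a percentage of the shorter side of the box, and the enlarged box is simply clipped at the frame edges.

```python
def expand_box(box, img_w, img_h, margin_pct=40):
    """Grow an (x, y, w, h) box by margin_pct percent of its shorter side
    on every edge, then clip the result to the image bounds.

    Hypothetical helper for illustration; not the article's crop_face.
    """
    x, y, w, h = box
    margin = int(min(w, h) * margin_pct / 100)
    left = max(x - margin, 0)
    top = max(y - margin, 0)
    right = min(x + w + margin, img_w)
    bottom = min(y + h + margin, img_h)
    return left, top, right - left, bottom - top


# A 100x100 face detected at (200, 150) in a 640x480 frame
expanded = expand_box((200, 150, 100, 100), img_w=640, img_h=480)
print(expanded)  # (160, 110, 180, 180)

# A face near the top-left corner is clipped to the frame
clipped = expand_box((10, 20, 100, 100), img_w=640, img_h=480)
print(clipped)  # (0, 0, 150, 160)
```

The returned box can then be used to slice the frame and resize the crop to the input size the model expects.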

Then we are ready to feed those cropped faces to the model. It's as simple as calling the predict method.

For the age prediction, the model outputs 101 values, one probability for each age from 0 to 100. The values come from a softmax layer, so they add up to 1. We multiply each probability by its associated age and sum the products to get the final predicted age, which is the expected value of the distribution.

if len(face_imgs) > 0:
    # predict ages and genders of the detected faces
    results = self.model.predict(face_imgs)
    predicted_genders = results[0]
    ages = np.arange(0, 101).reshape(101, 1)
    predicted_ages = results[1].dot(ages).flatten()
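
To make the weighted sum concrete, here is a small worked example in plain Python for a single face. The toy distribution is made up for illustration, and expected_age is a hypothetical name; it is meant to compute the same quantity as the dot product above, just one face at a time.

```python
def expected_age(probabilities):
    """Return the probability-weighted mean age for one face.

    probabilities[i] is the model's probability that the person is i years old.
    """
    return sum(age * p for age, p in enumerate(probabilities))


# Toy distribution: half the probability mass on age 30, half on age 40
probs = [0.0] * 101
probs[30] = 0.5
probs[40] = 0.5

predicted = expected_age(probs)
print(predicted)  # 35.0
```

Note that the predicted age does not have to be the single most likely age: here neither 30 nor 40 is reported, but the value halfway between them.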

Last but not least, we draw the results and render the image.

Gender prediction is a binary classification task. In the code below, the first value of the gender output is a score between 0 and 1: when it is above 0.5 the face is labeled female ("F"), otherwise male ("M").

# draw results
for i, face in enumerate(faces):
    label = "{}, {}".format(int(predicted_ages[i]),
                            "F" if predicted_genders[i][0] > 0.5 else "M")
    self.draw_label(frame, (face[0], face[1]), label)
cv2.imshow('Keras Faces', frame)
if cv2.waitKey(5) == 27:  # ESC key press
    break
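
The labeling rule can be pulled out into a small function to see what ends up on screen. This is only a sketch: format_label and its threshold parameter are hypothetical names, and the scores passed in are made-up values rather than real model output.

```python
def format_label(age, female_score, threshold=0.5):
    """Build the text drawn next to a face, for example "31, F".

    female_score stands in for predicted_genders[i][0] in the loop above.
    """
    gender = "F" if female_score > threshold else "M"
    # int() truncates toward zero, so 31.7 is shown as 31, not 32
    return "{}, {}".format(int(age), gender)


label_a = format_label(31.7, 0.92)
label_b = format_label(45.2, 0.08)
print(label_a)  # 31, F
print(label_b)  # 45, M
```

If rounding to the nearest year is preferred, int(round(age)) could be used in place of int(age).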

My complete source code as well as the link to download the pre-trained model weights is available in my GitHub repo.

Going deeper

This section is for those who are not satisfied with the demo alone and want to understand how the model is built and trained.

The training data came from IMDB-WIKI – 500k+ face images with age and gender labels. Before feeding each image into the model, we applied the same preprocessing step shown above: detect the face and add a margin.

The feature extraction part of the neural network uses the WideResNet architecture, short for Wide Residual Networks. It leverages the power of Convolutional Neural Networks (ConvNets for short) to learn the features of the face, from less abstract features like edges and corners to more abstract features like eyes and mouths.

What is unique about the WideResNet architecture is that the authors decreased the depth and increased the width of the original residual networks, so the network trains several times faster. Link to the paper here.
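
The trade-off between depth and width can be illustrated with the naming convention from the Wide Residual Networks paper, where WRN-d-k denotes a network of depth d with widening factor k. The sketch below assumes basic two-convolution residual blocks, for which the depth has the form 6n + 4. The function name and the two configurations are illustrative and are not necessarily the ones used by the model in this post.

```python
def wide_resnet_layout(depth, widen_factor):
    """Return (blocks_per_group, channel_widths) for a WRN-depth-widen_factor
    network built from basic two-convolution residual blocks."""
    if (depth - 4) % 6 != 0:
        raise ValueError("depth must have the form 6n + 4")
    blocks_per_group = (depth - 4) // 6
    # The first convolution has 16 channels; the three residual groups
    # that follow are widened by the factor k.
    widths = [16, 16 * widen_factor, 32 * widen_factor, 64 * widen_factor]
    return blocks_per_group, widths


thin_and_deep = wide_resnet_layout(depth=40, widen_factor=1)
wide_and_shallow = wide_resnet_layout(depth=16, widen_factor=8)
print(thin_and_deep)     # (6, [16, 16, 32, 64])
print(wide_and_shallow)  # (2, [16, 128, 256, 512])
```

The second configuration has a third as many blocks per group, but each convolution has eight times as many channels.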

Further reading

The possibilities for the model are endless; it really depends on what data you feed into it. Say you have lots of photos labeled by attractiveness: you could teach the model to rate the hotness of a person from the live webcam stream.

Here is a list of related projects and datasets for the curious.

Age/Gender detection in Tensorflow

IMDB-WIKI – 500k+ face images with age and gender labels

Data: Unfiltered faces for gender and age classification
Github: keras-vggface

Selfai: A Method for Understanding Beauty in Selfies
