Autoencoders — Deep Learning bits #1

Written by juliendespois | Published 2017/02/07
Tech Story Tags: machine-learning | artificial-intelligence | deep-learning | technology


Featured: data compression, image reconstruction and segmentation (with examples!)

In the “Deep Learning bits” series, we will not see how to use deep learning to solve complex problems end-to-end, as we do in A.I. Odyssey. Instead, we will look at different techniques, along with some examples and applications.

If you like Artificial Intelligence, make sure to subscribe to the newsletter to receive updates on articles and much more!

Introduction

What’s an autoencoder?

Neural networks exist in all shapes and sizes, and are often characterized by their input and output data types. For instance, image classifiers are built with Convolutional Neural Networks. They take images as inputs and output a probability distribution over the classes.

Autoencoders (AE) are a family of neural networks for which the input is the same as the output*. They work by compressing the input into a latent-space representation, and then reconstructing the output from this representation.

*We’ll see how using altered versions of the input can be even more interesting

Simple Autoencoder architecture — The input is compressed and then reconstructed
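To make this concrete, here is a minimal sketch of a simple autoencoder in Keras. The layer sizes and the placeholder data are illustrative assumptions, not values taken from this post.

```python
# Minimal dense autoencoder sketch (sizes and data are illustrative).
import numpy as np
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

inputs = Input(shape=(784,))                           # e.g. a flattened 28x28 image
latent = Dense(32, activation="relu")(inputs)          # compressed latent representation
outputs = Dense(784, activation="sigmoid")(latent)     # reconstruction of the input

autoencoder = Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")

# The input is also the target: the network learns to reproduce what it sees.
x_train = np.random.rand(1024, 784).astype("float32")  # placeholder data
autoencoder.fit(x_train, x_train, epochs=10, batch_size=128)
```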

Convolutional Autoencoders

A really popular use for autoencoders is to apply them to images. The trick is to replace fully connected layers with convolutional layers. These, along with pooling layers, convert the input from wide and thin (let’s say 100 x 100 px with 3 channels — RGB) to narrow and thick. This helps the network extract visual features from the images, and therefore obtain a much more accurate latent space representation. The reconstruction process uses upsampling and convolutions.

The resulting network is called a Convolutional Autoencoder (CAE).

Convolutional Autoencoder architecture — It maps a wide and thin input space to narrow and thick latent space
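As a rough sketch, such a network could look like this in Keras. The filter counts and kernel sizes are arbitrary choices, not those of the figure.

```python
# Sketch of a convolutional autoencoder (CAE); filter counts are illustrative.
from tensorflow.keras.layers import Conv2D, Input, MaxPooling2D, UpSampling2D
from tensorflow.keras.models import Model

input_img = Input(shape=(100, 100, 3))  # wide and thin: 100x100 px, 3 channels (RGB)

# Encoder: convolutions + pooling make the representation narrower and thicker.
x = Conv2D(16, (3, 3), activation="relu", padding="same")(input_img)
x = MaxPooling2D((2, 2), padding="same")(x)           # 50x50x16
x = Conv2D(32, (3, 3), activation="relu", padding="same")(x)
encoded = MaxPooling2D((2, 2), padding="same")(x)     # 25x25x32: narrow and thick

# Decoder: upsampling + convolutions rebuild the original resolution.
x = Conv2D(32, (3, 3), activation="relu", padding="same")(encoded)
x = UpSampling2D((2, 2))(x)                           # 50x50
x = Conv2D(16, (3, 3), activation="relu", padding="same")(x)
x = UpSampling2D((2, 2))(x)                           # 100x100
decoded = Conv2D(3, (3, 3), activation="sigmoid", padding="same")(x)

cae = Model(input_img, decoded)
cae.compile(optimizer="adam", loss="mse")
```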

Reconstruction quality

The reconstruction of the input image is often blurry and of lower quality. This is a consequence of the compression, during which some information is lost.

The CAE is trained to reconstruct its input

The reconstructed image is blurry

Use of CAEs

Example 1: Ultra-basic image reconstruction

Convolutional autoencoders can be useful for reconstruction. They can, for example, learn to remove noise from pictures, or reconstruct missing parts.

To do so, we don’t use the same image as input and output, but rather a noisy version as input and the clean version as output. With this process, the network learns to fill in the gaps in the image.

Let’s see what a CAE can do to replace part of an image of an eye. Let’s say there’s a crosshair on it and we want to remove it. Since we add the crosshair ourselves, we can create the dataset manually, which is extremely convenient, as sketched below.
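A hypothetical sketch of how such a dataset could be built and used for training, assuming a convolutional autoencoder `cae` like the one defined above; `clean_images` stands in for the real eye images.

```python
# Hypothetical sketch: build (altered input, clean target) pairs and train.
# `clean_images` stands in for the real eye images; random data is used here
# only so the snippet runs end-to-end.
import numpy as np

clean_images = np.random.rand(256, 100, 100, 3).astype("float32")  # placeholder eyes

def add_crosshair(image):
    """Return a copy of the image with a white crosshair through its center."""
    altered = image.copy()
    h, w = altered.shape[:2]
    altered[h // 2, :, :] = 1.0   # horizontal line
    altered[:, w // 2, :] = 1.0   # vertical line
    return altered

noisy_images = np.array([add_crosshair(img) for img in clean_images])

# Altered version as input, clean version as target.
cae.fit(noisy_images, clean_images, epochs=50, batch_size=32)

# Once trained, the model removes the crosshair from eyes it has never seen.
restored = cae.predict(noisy_images[:5])
```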

The CAE is trained to remove the crosshair

Even though it is blurry, the reconstructed input has no crosshair left

Now that our autoencoder is trained, we can use it to remove the crosshairs on pictures of eyes we have never seen!

Example 2: Ultra-basic image colorization

In this example, the CAE will learn to map from an image of circles and squares to the same image, but with the circles colored in red, and the squares in blue.
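A hypothetical sketch of this setup, assuming the input is the single-channel black-and-white shapes image and the target is its RGB colored version; the architecture, resolutions and data are placeholders.

```python
# Hypothetical colorization setup: single-channel shapes as input,
# 3-channel colored image as target. Sizes and data are placeholders.
import numpy as np
from tensorflow.keras.layers import Conv2D, Input, MaxPooling2D, UpSampling2D
from tensorflow.keras.models import Model

inp = Input(shape=(100, 100, 1))                                   # black-and-white shapes
x = Conv2D(16, (3, 3), activation="relu", padding="same")(inp)
x = MaxPooling2D((2, 2), padding="same")(x)                        # 50x50
x = Conv2D(16, (3, 3), activation="relu", padding="same")(x)
x = UpSampling2D((2, 2))(x)                                        # back to 100x100
out = Conv2D(3, (3, 3), activation="sigmoid", padding="same")(x)   # RGB output

colorizer = Model(inp, out)
colorizer.compile(optimizer="adam", loss="mse")

# `shape_images` / `colored_images` stand in for the real dataset.
shape_images = np.random.rand(256, 100, 100, 1).astype("float32")
colored_images = np.random.rand(256, 100, 100, 3).astype("float32")
colorizer.fit(shape_images, colored_images, epochs=10, batch_size=32)
```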

The CAE is trained to colorize the image

Even though the reconstruction is blurry, the colors are mostly right

The CAE does pretty well at colorizing the right parts of the image. It has understood that circles are red and squares are blue. The purple color comes from a blend of blue and red where the network hesitates between a circle and a square.

Now that our autoencoder is trained, we can use it to colorize pictures we have never seen before!

Advanced applications

The examples above are just proofs of concept to show what a convolutional autoencoder can do.

More exciting applications include full image colorization, latent-space clustering, and generating higher-resolution images. The latter is obtained by using the low-resolution image as input and the high-resolution image as output, as sketched below.

Colorful Image Colorization by Richard Zhang, Phillip Isola, Alexei A. Efros

Neural Enhance by Alexjc
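For super-resolution, the training pairs can be built by downscaling the originals. A hypothetical sketch, where the resolutions and data are placeholders:

```python
# Hypothetical super-resolution pairs: downscaled image as input,
# original image as target. Data and resolutions are placeholders.
import numpy as np
import tensorflow as tf

high_res = np.random.rand(64, 200, 200, 3).astype("float32")  # stand-in originals
low_res = tf.image.resize(high_res, (100, 100)).numpy()       # degraded inputs

# A CAE whose decoder upsamples one step further than the encoder downsamples
# can then be trained with model.fit(low_res, high_res).
```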

Conclusions

In this post, we have seen how we can use autoencoder neural networks to compress, reconstruct and clean data. Obtaining images as output is something really thrilling, and really fun to play with.

Note: there’s a modified version of AEs called Variational Autoencoders, which are used for image generation, but I’ll keep that for later.

If you like Artificial Intelligence, make sure to subscribe to the newsletter to receive updates on articles and much more!

You can play with the code over there:

despoisj/ConvolutionalAutoencoder: Quick and dirty example of the application of convolutional autoencoders in Keras/Tensorflow (github.com)

Thanks for reading this post, stay tuned for more!
