Auto-encoder & Classifier TensorFlow Models for Digit Classification

Part 2: TensorFlow neural network implementation and training for classifying MNIST handwritten images.

Andrew Didinchuk
6 min read · Jan 19, 2021

This post will be covering the two models that were set up in TensorFlow to process MNIST digit data, how training was conducted, and finally how the results were converted into a tangible model to be leveraged downstream. This post is part of the TensorFlow + Docker MNIST Classifier series.

I will not be covering the basics of TensorFlow in these posts. I am typically not a huge fan of programming literature, given the massive amount of resources available online; however, for learning TensorFlow I highly recommend this e-book for grasping the fundamentals.

The Data Set (MNIST): This is one of the most popular machine learning data sets on the internet at the moment. It consists of tens of thousands of labeled 28 x 28 handwritten digits like the ones below.

some sample images from the MNIST data set

One of the key success criteria for this project was the use of multiple models in the final solution. The first model will be an auto-encoder that cleans up (denoises) the image data, and the second model will classify it.

Features and targets

The features (the digits) are passed into the model as 784-dimensional vectors, with each element representing the intensity (white to black) of one pixel in the 28 x 28 image. The feature data was scaled to improve training performance, converting the value range from [0.0, 255.0] to [0.0, 1.0] by dividing each value by 255.

The data set labels (targets) are single values ranging from 0 to 9, representing the 10 possible digit classes. To improve model performance and keep things simple, these were transformed into a 10-dimensional one-hot representation, with each dimension representing the probability of the associated digit, e.g. 5 -> [0, 0, 0, 0, 0, 1, 0, 0, 0, 0].
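As a quick, self-contained illustration of this preprocessing (the variable names here are just for the example, not taken from the repo):

```python
import numpy as np
import tensorflow as tf

# Stand-in data: 28 x 28 uint8 images and their integer labels
x_raw = np.random.randint(0, 256, size=(4, 28, 28), dtype=np.uint8)
y_raw = np.array([5, 0, 4, 1])

# Flatten each 28 x 28 image into a 784-dimensional vector and scale to [0.0, 1.0]
x_features = x_raw.reshape(-1, 784).astype("float32") / 255.0

# One-hot encode the labels: 5 -> [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
y_targets = tf.keras.utils.to_categorical(y_raw, num_classes=10)
```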

The model

One of the goals of this project was to implement a system with two models, so I chose an auto-encoder as my first model and a basic classifier as my second, as illustrated below.

2 model structure

Keras is used to simplify development and training, config files are used to store hyperparameters and file paths, and I developed a basic helper for loading MNIST image data as I am not using Keras for data loading.

SUPPORTING PYTHON OBJECTS

  1. Config — Configuration file for the training
  2. MNISTProcessor — MNIST data loader (a rough sketch follows this list)
  3. DataWrapper — Object to handle training and testing data
  4. Visualizer — Stored functions to help visualize results
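
The actual MNISTProcessor in the repo may be structured differently, but a minimal sketch of such a loader, assuming the standard gzipped IDX files from the MNIST site, could look like this:

```python
import gzip
import numpy as np

def load_mnist_images(path):
    """Read a gzipped IDX image file (e.g. train-images-idx3-ubyte.gz) into an (N, 784) array."""
    with gzip.open(path, "rb") as f:
        data = np.frombuffer(f.read(), dtype=np.uint8, offset=16)  # skip the 16-byte header
    return data.reshape(-1, 784).astype("float32") / 255.0

def load_mnist_labels(path):
    """Read a gzipped IDX label file (e.g. train-labels-idx1-ubyte.gz) into an (N,) array."""
    with gzip.open(path, "rb") as f:
        return np.frombuffer(f.read(), dtype=np.uint8, offset=8)  # skip the 8-byte header
```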

The Auto-encoder: There is a lot of great material on auto-encoder networks online, including the wiki entry here. In a nutshell, an auto-encoder is an unsupervised, symmetrical neural network that compresses the feature vector into significantly fewer dimensions. The network is trained with the features used as both its input and its output, teaching it to compress and then reconstruct the features. One of the key uses of the auto-encoder is noise reduction, and that is what it will be used for here.

Typical auto-encoder structure (tanh activations, MSE loss, Adadelta optimizer)

Using Keras, we can implement the neural network with the following code.

To break down the code a little:
lines 10–13 — use TensorFlow flags to pull command line argument values
lines 21–23 — process the MNIST data set into features and labels
lines 26–37 — set up the neural network structure and optimizer
line 40 — set up a callback for saving checkpoints during training
lines 43–47 — load any existing checkpoints
lines 50–54 — train the model
lines 57–62 — save a production version that will be ready for serving
lines 64–68 — display the final model structure and some sample auto-encodings
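
The full training script (flag parsing, checkpoint callback, and SavedModel export) is in the repo and is what the line numbers above refer to. As a minimal sketch of the network setup and training it describes, with layer sizes that are illustrative assumptions of mine:

```python
import tensorflow as tf

# Self-contained sketch: MNIST is pulled via Keras here purely for convenience;
# the repo uses its own MNISTProcessor for data loading.
(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0

# Symmetric encoder/decoder with tanh activations, MSE loss, and the Adadelta optimizer
autoencoder = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="tanh", input_shape=(784,)),
    tf.keras.layers.Dense(64, activation="tanh"),      # compressed representation
    tf.keras.layers.Dense(256, activation="tanh"),
    tf.keras.layers.Dense(784, activation="sigmoid"),  # reconstructed pixel intensities in [0, 1]
])

# Learning rate forced to 1.0 rather than the tf.keras default (see "Lessons learned" below)
autoencoder.compile(optimizer=tf.keras.optimizers.Adadelta(learning_rate=1.0), loss="mse")

# The features are both input and target, so the network learns to reconstruct its own input
autoencoder.fit(x_train, x_train, epochs=50, batch_size=256,
                validation_data=(x_test, x_test))

# Export a servable version of the trained model
autoencoder.save("models/autoencoder/production/1")
```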

The following function was created to help visualize the auto-encoder result.

This function can be called from our training program after training is completed, like so: visualize_autoencoding(x_data_train, clean_images, digits_to_show=4).
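
The actual helper lives in the Visualizer module of the repo; a minimal matplotlib sketch with the same signature might look like this:

```python
import matplotlib.pyplot as plt

def visualize_autoencoding(original_images, reconstructed_images, digits_to_show=4):
    """Plot a few original digits above their auto-encoder reconstructions."""
    fig, axes = plt.subplots(2, digits_to_show, figsize=(2 * digits_to_show, 4), squeeze=False)
    for i in range(digits_to_show):
        axes[0][i].imshow(original_images[i].reshape(28, 28), cmap="gray")
        axes[0][i].set_title("original")
        axes[0][i].axis("off")
        axes[1][i].imshow(reconstructed_images[i].reshape(28, 28), cmap="gray")
        axes[1][i].set_title("reconstructed")
        axes[1][i].axis("off")
    plt.tight_layout()
    plt.show()
```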

After training, my loss was around 0.025, and you can see below what a few sample images looked like after being passed through the trained auto-encoder. The result could be improved, but this should be satisfactory for our needs.

The Classifier: The second model will take the 784-dimensional vector output by the auto-encoder and classify it into one of the 10 possible digit values [0–9]. A simple tanh-activated deep neural network will be used.

Classifier network (tanh activations, MSE loss, Adadelta optimizer)

Keras was used to implement the classifier as well. We first load and process the image data through the auto-encoder before using it as the feature input for the training of the classifier.

To break down the code a little:
lines 11–15 — use TensorFlow flags to pull command line argument values
lines 22–24 — process the MNIST data set into features and labels
lines 27–28 — load the auto-encoder model and process the feature data set
lines 31–38 — set up the neural network structure and optimizer
line 41 — set up a callback for saving checkpoints during training
lines 45–49 — load any existing checkpoints
lines 51–55 — train the model
lines 57–62 — save a production version that will be ready for serving
line 65 — display the final model structure
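
Again, the line numbers above refer to the gist in the repo. As a rough sketch of the classifier setup it describes (the hidden layer sizes and the softmax output are my own assumptions):

```python
import tensorflow as tf

# Self-contained sketch: MNIST is pulled via Keras here; the repo uses its own MNISTProcessor
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0
x_test = x_test.reshape(-1, 784).astype("float32") / 255.0
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)

# Load the trained auto-encoder and pass the features through it first
autoencoder = tf.keras.models.load_model("models/autoencoder/production/1")
x_train_clean = autoencoder.predict(x_train)
x_test_clean = autoencoder.predict(x_test)

# Simple tanh-activated deep network mapping 784 inputs to the 10 one-hot digit classes
classifier = tf.keras.Sequential([
    tf.keras.layers.Dense(256, activation="tanh", input_shape=(784,)),
    tf.keras.layers.Dense(128, activation="tanh"),
    tf.keras.layers.Dense(10, activation="softmax"),  # 10 digit classes
])
classifier.compile(optimizer=tf.keras.optimizers.Adadelta(learning_rate=1.0),
                   loss="mse", metrics=["accuracy"])

classifier.fit(x_train_clean, y_train, epochs=50, batch_size=256,
               validation_data=(x_test_clean, y_test))

classifier.save("models/classifier/production/1")
```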

Lessons learned

Poor initial model convergence — I wrote my initial code using the stand-alone Keras library; however, due to challenges with saving the models in a servable format, I had to switch to the tf.keras library instead. After the switch, my models flat out refused to converge during training. After many hours of debugging, I discovered that the keras.optimizers.Adadelta optimizer uses a default starting learning rate of 1.0, whereas the tf.keras.optimizers.Adadelta optimizer initializes with a learning rate of 0.001. Forcing the learning rate back to 1.0 addressed this issue for me, and you can see this reflected in my code.
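
For reference, the fix simply amounts to passing the learning rate explicitly instead of relying on the default:

```python
import tensorflow as tf

# tf.keras defaults Adadelta to learning_rate=0.001; set it back to 1.0 explicitly
optimizer = tf.keras.optimizers.Adadelta(learning_rate=1.0)
```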

For the lazy

My results can be reproduced with the following commands:

You should now see the production models under models/autoencoder/production/1 and models/classifier/production/1, each containing a saved_model.pb file and a variables/ directory.
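
As a quick sanity check (not part of the original commands), the exported SavedModel can be loaded back and inspected like so:

```python
import numpy as np
import tensorflow as tf

# Load the exported classifier and run a single dummy 784-dimensional vector through it
classifier = tf.keras.models.load_model("models/classifier/production/1")
classifier.summary()

dummy_digit = np.random.rand(1, 784).astype("float32")
print(classifier.predict(dummy_digit))  # 10 values, one per digit class
```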

The entire TensorFlow GitHub repository, along with complete instructions on running the models, can be found here. Now that we have both the auto-encoder and classifier models generated, we can take a look at deploying them via TensorFlow Serving, which I will do in my next post.

Here is a summary of the components involved in this project:

  1. Introduction
  2. The Models: git repo → tf-mnist-project
  3. Serving Models: git repo → tf-serving-mnist-project
  4. The User Interface: git repo → angular-mnist-project
