BigSnarf blog

Infosec FTW

Using Inception v3 TensorFlow for MNIST

Modern object recognition models have millions of parameters and can take weeks to fully train. Transfer learning is a technique that shortcuts a lot of this work by taking a model fully trained on a large dataset like ImageNet and retraining from the existing weights for new classes. In this example we’ll retrain only the final layer from scratch, leaving all the other layers untouched. For more information on the approach, see the DeCAF paper.

Though it’s not as good as a full training run, this is surprisingly effective for many applications, and it can run in as little as 75 minutes on a laptop, without requiring a GPU. The data I used is the Kaggle MNIST dataset.

Let’s reshape the train.csv data from Kaggle into JPEG images with this script.


Script to convert train.csv to images in Python
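The original script was posted as a screenshot, so here is a minimal sketch of the conversion step. It assumes the standard Kaggle MNIST train.csv layout (a header row, then one row per image: label followed by 784 pixel values for a 28×28 grayscale image) and uses Pillow to write JPEGs; the output file-naming scheme (label_index.jpg) is my own choice, not from the post.

```python
# Convert the Kaggle MNIST train.csv into individual JPEG images.
# Assumes each row is: label, pixel0, pixel1, ..., pixel783 (28x28 grayscale).
import csv
import os

import numpy as np
from PIL import Image

def csv_to_jpegs(csv_path, out_dir):
    os.makedirs(out_dir, exist_ok=True)
    with open(csv_path) as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row (label, pixel0, ..., pixel783)
        for i, row in enumerate(reader):
            label = row[0]
            pixels = np.array([int(p) for p in row[1:]], dtype=np.uint8)
            img = Image.fromarray(pixels.reshape(28, 28), mode="L")
            # e.g. "3_00017.jpg" -- the digit label is encoded in the name
            img.save(os.path.join(out_dir, "%s_%05d.jpg" % (label, i)))
```

Usage would be something like `csv_to_jpegs("train.csv", "mnist_jpegs")`, which writes one JPEG per row of the CSV.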


Let’s move the data to the proper folders
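The folder-moving step was also a screenshot. TensorFlow’s retrain.py expects one sub-folder per class, so a sketch like the following would do it, assuming each JPEG’s file name starts with its digit label (e.g. 3_00017.jpg); the function name and layout are illustrative, not from the post.

```python
# Sort JPEGs into one sub-folder per digit label, the directory layout
# that TensorFlow's image-retraining script expects (one folder per class).
import glob
import os
import shutil

def sort_into_label_dirs(src_dir, dest_dir):
    """Move files like 'src_dir/3_00017.jpg' into 'dest_dir/3/'."""
    for path in glob.glob(os.path.join(src_dir, "*.jpg")):
        name = os.path.basename(path)
        label = name.split("_")[0]  # the digit label prefixed to the file name
        label_dir = os.path.join(dest_dir, label)
        os.makedirs(label_dir, exist_ok=True)
        shutil.move(path, os.path.join(label_dir, name))
```

After running this, dest_dir contains ten folders (0 through 9), each holding the images for that digit.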


These are screenshots of re-training the Inception v3 model.

Re-training the model

Using the re-trained model to do MNIST prediction
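The prediction script was posted as a screenshot as well, so here is a minimal sketch in the TensorFlow 1.x style of the era. It assumes retrain.py wrote its default outputs (output_graph.pb and output_labels.txt), and uses the graph’s `final_result` output node and Inception v3’s `DecodeJpeg/contents` input; adjust those names if your retraining run used different ones.

```python
# Classify one JPEG with the re-trained Inception v3 graph (TF 1.x API).
import numpy as np
import tensorflow as tf

def classify_jpeg(image_path, graph_path="output_graph.pb",
                  labels_path="output_labels.txt"):
    # Labels written by retrain.py, one class name per line.
    labels = [line.strip() for line in open(labels_path)]
    # Load the frozen re-trained graph.
    with tf.gfile.GFile(graph_path, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    # Raw JPEG bytes; Inception v3's DecodeJpeg input takes them directly.
    image_data = tf.gfile.GFile(image_path, "rb").read()
    with tf.Graph().as_default():
        tf.import_graph_def(graph_def, name="")
        with tf.Session() as sess:
            # "final_result" is the softmax node retrain.py adds.
            softmax = sess.graph.get_tensor_by_name("final_result:0")
            preds = sess.run(softmax, {"DecodeJpeg/contents:0": image_data})
    # Return (label, score) pairs, best first.
    order = np.argsort(preds[0])[::-1]
    return [(labels[i], float(preds[0][i])) for i in order]
```

Calling `classify_jpeg("some_digit.jpg")` returns the ten digit classes ranked by softmax score.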

