Modern object-recognition models have millions of parameters and can take weeks to train fully. Transfer learning shortcuts much of this work by taking a model already trained on a large set of categories, such as ImageNet, and retraining it from the existing weights for new classes. In this example we'll retrain only the final layer from scratch, while leaving all the others untouched. For more background on the approach, see the DeCAF paper.
Though it's not as good as a full training run, this approach is surprisingly effective for many applications, and can be run in as little as 75 minutes on a laptop, without requiring a GPU. The data used here comes from the Kaggle MNIST dataset.
First, let's reshape the train.csv data from Kaggle into JPEG images with a small script.
Script to convert train.csv to images in Python
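A minimal sketch of such a conversion, assuming Kaggle's train.csv layout (a header row, then a label column followed by 784 pixel columns) and using Pillow to write the JPEGs. The function names, the output directory, and the `<label>_<index>.jpg` naming convention are illustrative choices, not from the original post:

```python
import csv
import os

from PIL import Image  # Pillow, assumed available for JPEG writing


def row_to_pixels(row):
    """Reshape the 784 pixel values of one CSV row (strings) into a 28x28 grid of ints."""
    pixels = [int(v) for v in row]
    return [pixels[i * 28:(i + 1) * 28] for i in range(28)]


def convert(csv_path, out_dir):
    """Write each row of train.csv as a grayscale JPEG named '<label>_<index>.jpg'."""
    os.makedirs(out_dir, exist_ok=True)
    with open(csv_path) as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row (label,pixel0,...,pixel783)
        for i, row in enumerate(reader):
            label, grid = row[0], row_to_pixels(row[1:])
            img = Image.new("L", (28, 28))  # "L" = 8-bit grayscale
            img.putdata([p for line in grid for p in line])
            img.save(os.path.join(out_dir, "%s_%d.jpg" % (label, i)))
```

Encoding the label into the file name makes the next step, sorting images into per-class folders, a matter of string splitting.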
Next, let's move the images into the folder layout the retraining script expects: one subfolder per label.
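One way to sketch this sorting step, assuming the image file names encode the digit label as a prefix (e.g. `3_17.jpg`); the function name and the prefix convention are illustrative assumptions:

```python
import os
import shutil


def sort_into_label_dirs(src_dir, dest_dir):
    """Move images named '<label>_<index>.jpg' into dest_dir/<label>/ subfolders."""
    for name in os.listdir(src_dir):
        if not name.endswith(".jpg"):
            continue  # leave non-image files where they are
        label = name.split("_")[0]
        label_dir = os.path.join(dest_dir, label)
        os.makedirs(label_dir, exist_ok=True)
        shutil.move(os.path.join(src_dir, name), os.path.join(label_dir, name))
```

The resulting layout (`dest_dir/0/`, `dest_dir/1/`, … `dest_dir/9/`) is the one-directory-per-class structure that TensorFlow's image-retraining script reads its training categories from.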
Below are screenshots of the re-trained Inception v3 model.
Re-Training the Model
Using the re-trained model for MNIST prediction
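A sketch of loading the re-trained graph and classifying a single digit image, in TensorFlow 1.x style. It assumes the default artifacts and tensor names produced by TensorFlow's retrain.py (`output_graph.pb`, `output_labels.txt`, input tensor `DecodeJpeg/contents:0`, output tensor `final_result:0`); the `predict` and `top_prediction` names are illustrative. The TensorFlow import is kept inside `predict` so the small scoring helper works on its own:

```python
def top_prediction(scores, labels):
    """Return the (label, score) pair with the highest softmax score."""
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best], scores[best]


def predict(image_path, graph_path="output_graph.pb", labels_path="output_labels.txt"):
    """Classify one JPEG with a graph re-trained by TensorFlow's retrain.py."""
    import tensorflow as tf  # local import: only needed for actual inference

    with open(labels_path) as f:
        labels = [line.strip() for line in f]
    # Load the frozen GraphDef written by the retraining run.
    with tf.gfile.GFile(graph_path, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    with tf.Session() as sess:
        tf.import_graph_def(graph_def, name="")
        image_data = tf.gfile.GFile(image_path, "rb").read()
        softmax = sess.graph.get_tensor_by_name("final_result:0")
        scores = sess.run(softmax, {"DecodeJpeg/contents:0": image_data})[0]
        return top_prediction(scores, labels)
```

For example, `predict("images/3/3_0.jpg")` would return the predicted digit label and its confidence for that image.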