May 10, 2018

Transfer Learning and Data Augmentation in Keras

This Keras example demonstrates how to set up transfer learning while also using data augmentation.

Introduction

Data augmentation is widely used to prevent overfitting: by applying random transformations to the training images, it effectively increases the amount of training data available to the model.

Transfer learning is also widely used to speed up the training phase by using a pre-trained model as a starting point, avoiding the time and expense of training a model from scratch.

Although there are plenty of examples of both transfer learning and data augmentation in Keras, there are very few showing how to use the two together.

The example below walks through one way to combine them.

Extracting Bottleneck Features

In this step we extract the bottleneck features from the InceptionV3 model: the feature maps produced by the network's final convolutional block, which we will reuse as inputs to a new classifier.

As usual, we start by importing the required Python libraries:

import math
import numpy as np
from keras.models import Sequential
from keras.layers import GlobalAveragePooling2D, Dense
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ModelCheckpoint
from keras.utils.np_utils import to_categorical
from PIL import ImageFile
# allow PIL to load truncated image files, which can otherwise
# crash ImageDataGenerator mid-epoch
ImageFile.LOAD_TRUNCATED_IMAGES = True

We then load the InceptionV3 model with the final fully-connected layers removed:

from keras.applications.inception_v3 import InceptionV3
model = InceptionV3(weights='imagenet', include_top=False)
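
With include_top=False the classification head is removed, so the model outputs a 2048-channel feature map rather than class probabilities. A quick optional sanity check:

# the spatial dimensions remain undefined until an input size is fixed
print(model.output_shape)  # (None, None, None, 2048)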

We configure two data generators: one that augments images, used for training and validation, and one without augmentation, used for testing:

augmentation_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        rotation_range=40,
        width_shift_range=0.2,
        height_shift_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)

non_augmentation_datagen = ImageDataGenerator(rescale=1./255)

batch_size = 32
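
Before extracting features it is worth previewing a few augmented variants of a single image to confirm the augmentation settings look reasonable. A minimal sketch, assuming a training image exists at the hypothetical path below and that a preview directory has already been created:

from keras.preprocessing.image import load_img, img_to_array

# load one training image (hypothetical path) and add a batch dimension
img = img_to_array(load_img('images/train/class_a/example.jpg'))
img = img.reshape((1,) + img.shape)

# each iteration yields one randomly transformed copy of the image,
# saved to the preview directory for inspection
for i, batch in enumerate(augmentation_datagen.flow(
        img, batch_size=1, save_to_dir='preview', save_format='jpeg')):
    if i >= 3:
        break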

We use the configured data generators to run images through the truncated InceptionV3 model:

# run images from the training dataset through the model
train_generator = augmentation_datagen.flow_from_directory(
        'images/train',
        target_size=(224, 224),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)

number_of_train_samples = len(train_generator.filenames)
number_of_train_steps = int(math.ceil(number_of_train_samples / float(batch_size)))

train_bottleneck_features = model.predict_generator(train_generator, number_of_train_steps)

# run images from the validation dataset through the model
valid_generator = augmentation_datagen.flow_from_directory(
        'images/valid',
        target_size=(224, 224),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)

number_of_valid_samples = len(valid_generator.filenames)
number_of_valid_steps = int(math.ceil(number_of_valid_samples / float(batch_size)))

valid_bottleneck_features = model.predict_generator(valid_generator, number_of_valid_steps)

# run images from the test dataset through the model
test_generator = non_augmentation_datagen.flow_from_directory(
        'images/test',
        target_size=(224, 224),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)

number_of_test_samples = len(test_generator.filenames)
number_of_test_steps = int(math.ceil(number_of_test_samples / float(batch_size)))

test_bottleneck_features = model.predict_generator(test_generator, number_of_test_steps)
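
Each image has now been reduced to a compact feature map. As an optional check, with 224x224 inputs the truncated InceptionV3 should yield 5x5x2048 features per image:

print(train_bottleneck_features.shape)  # (number_of_train_samples, 5, 5, 2048)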

Finally we save the InceptionV3 bottleneck features to an .npz file:

np.savez('inceptionv3_bottleneck_features.npz',
         train=train_bottleneck_features,
         valid=valid_bottleneck_features,
         test=test_bottleneck_features)

Building and Training the Model

In this step we build and train a simple new model on top of the InceptionV3 bottleneck features.

As in the previous step, we first configure data generators for training, validation, and testing; here they are used only to recover the sample counts and class labels for each dataset:

# load the bottleneck features
bottleneck_features = np.load('inceptionv3_bottleneck_features.npz')

# configure training data generator
train_generator = augmentation_datagen.flow_from_directory(
        'images/train',
        target_size=(224, 224),
        batch_size=batch_size,
        class_mode='categorical',
        shuffle=False)

number_of_train_samples = len(train_generator.filenames)
number_of_train_classes = len(train_generator.class_indices)
number_of_train_steps = int(math.ceil(number_of_train_samples / float(batch_size)))

# get the training bottleneck features and class labels
train_data = bottleneck_features['train']
train_labels = train_generator.classes
train_labels = to_categorical(train_labels, num_classes=number_of_train_classes)

# configure validation data generator
valid_generator = augmentation_datagen.flow_from_directory(
        'images/valid',
        target_size=(224, 224),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)

number_of_valid_samples = len(valid_generator.filenames)
number_of_valid_classes = len(valid_generator.class_indices)
number_of_valid_steps = int(math.ceil(number_of_valid_samples / float(batch_size)))

# get the validation bottleneck features and class labels
valid_data = bottleneck_features['valid']
valid_labels = valid_generator.classes
valid_labels = to_categorical(valid_labels, num_classes=number_of_valid_classes)

# configure testing data generator
test_generator = non_augmentation_datagen.flow_from_directory(
        'images/test',
        target_size=(224, 224),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)

number_of_test_samples = len(test_generator.filenames)
number_of_test_classes = len(test_generator.class_indices)
number_of_test_steps = int(math.ceil(number_of_test_samples / float(batch_size)))

# get the testing bottleneck features and class labels
test_data = bottleneck_features['test']
test_labels = test_generator.classes
test_labels = to_categorical(test_labels, num_classes=number_of_test_classes)
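
For reference, to_categorical simply turns the integer class indices reported by the generators into one-hot vectors:

to_categorical([0, 2, 1], num_classes=3)
# array([[1., 0., 0.],
#        [0., 0., 1.],
#        [0., 1., 0.]])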

At this point we have everything we need to create and train a new simple model which uses the bottleneck features from the more complex InceptionV3 model:

# define a new model that takes the bottleneck feature maps as input
new_model = Sequential()
new_model.add(GlobalAveragePooling2D(input_shape=train_data.shape[1:]))

# add fully connected layer
new_model.add(Dense(number_of_train_classes, activation='softmax'))

# compile the model
new_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# checkpoint the model weights with the best validation loss
# (the saved_models directory must already exist)
checkpointer = ModelCheckpoint(filepath='saved_models/weights.best.InceptionV3.hdf5', verbose=1, save_best_only=True)

# train the new model on the bottleneck features
history = new_model.fit(train_data, train_labels,
          validation_data=(valid_data, valid_labels),
          epochs=10, batch_size=batch_size, callbacks=[checkpointer], verbose=1)
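
The returned History object records the loss and accuracy for each epoch, which makes it easy to spot overfitting. A minimal sketch of plotting the curves, assuming matplotlib is installed:

import matplotlib.pyplot as plt

# plot training vs. validation loss per epoch
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('categorical cross-entropy')
plt.legend()
plt.show()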

Making Predictions

We can now use the newly trained model to make predictions:

# load the model weights with the best validation loss
new_model.load_weights('saved_models/weights.best.InceptionV3.hdf5')

# get the index of the predicted class for each image in the test set
predictions = [np.argmax(new_model.predict(np.expand_dims(feature, axis=0))) for feature in test_data]

# report test accuracy
test_accuracy = 100 * np.mean(np.array(predictions) == np.argmax(test_labels, axis=1))
print('Test accuracy: %.4f%%' % test_accuracy)
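
Since the bottleneck features for the whole test set already fit in memory, the per-image loop above can also be written as a single vectorized call:

# equivalent, vectorized form of the prediction loop
predictions = np.argmax(new_model.predict(test_data, batch_size=batch_size), axis=1)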
