Lesson 08 - CNN for the MNIST Dataset

The following topics are discussed in this notebook:

  • Using a neural network for classifying image data.
  • Scaling features.
In [1]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import pandas as pd

import keras
from keras.models import Sequential
from keras.layers import Dense, Flatten
from tensorflow import set_random_seed

from sklearn.model_selection import train_test_split
from keras.datasets import mnist
Using TensorFlow backend.

Load the MNIST Data

The MNIST dataset consists of 70,000, 28x28 black-and-white images of handwritten digits.

In [2]:
(X_train, y_train), (X_holdout, y_holdout) = mnist.load_data()

X_val, X_test, y_val, y_test = train_test_split(X_holdout, y_holdout, test_size = 0.5, random_state=1)

print(X_train.shape)
print(y_train.shape)
print(X_val.shape)
print(y_val.shape)
print(X_test.shape)
print(y_test.shape)
(60000, 28, 28)
(60000,)
(5000, 28, 28)
(5000,)
(5000, 28, 28)
(5000,)

How is a digit represented?

In [3]:
mydigit = X_train[0]
plt.imshow(mydigit, cmap=cm.binary)
plt.axis('off')
plt.show()

Scale the Data

In [4]:
Xs_train = X_train / 255
Xs_val = X_val / 255
Xs_test = X_test / 255

Training the network.

In [5]:
%%time

np.random.seed(1)
set_random_seed(1)

model = Sequential()
model.add(Flatten(input_shape=(28,28)))
model.add(Dense(512, activation='relu'))
model.add(Dense(256, activation='relu'))
model.add(Dense(10, activation='softmax'))

opt = keras.optimizers.Adam(lr = 0.001)
model.compile(loss='sparse_categorical_crossentropy', optimizer=opt, metrics=['accuracy'])

h = model.fit(Xs_train, y_train, batch_size=1024, epochs=20, validation_data=(Xs_val, y_val), verbose=2)
Train on 60000 samples, validate on 5000 samples
Epoch 1/20
 - 2s - loss: 0.5188 - acc: 0.8607 - val_loss: 0.2156 - val_acc: 0.9340
Epoch 2/20
 - 1s - loss: 0.1752 - acc: 0.9492 - val_loss: 0.1384 - val_acc: 0.9598
Epoch 3/20
 - 1s - loss: 0.1198 - acc: 0.9649 - val_loss: 0.1104 - val_acc: 0.9656
Epoch 4/20
 - 1s - loss: 0.0864 - acc: 0.9748 - val_loss: 0.0886 - val_acc: 0.9740
Epoch 5/20
 - 1s - loss: 0.0655 - acc: 0.9811 - val_loss: 0.0806 - val_acc: 0.9748
Epoch 6/20
 - 1s - loss: 0.0502 - acc: 0.9854 - val_loss: 0.0741 - val_acc: 0.9770
Epoch 7/20
 - 1s - loss: 0.0402 - acc: 0.9888 - val_loss: 0.0705 - val_acc: 0.9792
Epoch 8/20
 - 1s - loss: 0.0319 - acc: 0.9912 - val_loss: 0.0635 - val_acc: 0.9792
Epoch 9/20
 - 1s - loss: 0.0256 - acc: 0.9933 - val_loss: 0.0674 - val_acc: 0.9800
Epoch 10/20
 - 1s - loss: 0.0212 - acc: 0.9949 - val_loss: 0.0632 - val_acc: 0.9814
Epoch 11/20
 - 1s - loss: 0.0155 - acc: 0.9966 - val_loss: 0.0631 - val_acc: 0.9810
Epoch 12/20
 - 1s - loss: 0.0124 - acc: 0.9974 - val_loss: 0.0668 - val_acc: 0.9806
Epoch 13/20
 - 1s - loss: 0.0099 - acc: 0.9982 - val_loss: 0.0655 - val_acc: 0.9810
Epoch 14/20
 - 1s - loss: 0.0076 - acc: 0.9989 - val_loss: 0.0642 - val_acc: 0.9810
Epoch 15/20
 - 1s - loss: 0.0062 - acc: 0.9992 - val_loss: 0.0626 - val_acc: 0.9824
Epoch 16/20
 - 1s - loss: 0.0048 - acc: 0.9995 - val_loss: 0.0672 - val_acc: 0.9806
Epoch 17/20
 - 1s - loss: 0.0053 - acc: 0.9991 - val_loss: 0.0678 - val_acc: 0.9826
Epoch 18/20
 - 1s - loss: 0.0037 - acc: 0.9996 - val_loss: 0.0700 - val_acc: 0.9816
Epoch 19/20
 - 1s - loss: 0.0029 - acc: 0.9998 - val_loss: 0.0667 - val_acc: 0.9816
Epoch 20/20
 - 1s - loss: 0.0021 - acc: 0.9999 - val_loss: 0.0666 - val_acc: 0.9826
Wall time: 14.5 s

Visualize Training Process

In [6]:
plt.rcParams["figure.figsize"] = [8,4]

plt.subplot(1,2,1)
plt.plot(h.history['loss'], label='Training')
plt.plot(h.history['val_loss'], label='Validation')
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend()

plt.subplot(1,2,2)
plt.plot(h.history['acc'], label='Training')
plt.plot(h.history['val_acc'], label='Validation')
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend()

plt.show()
In [7]:
score = model.evaluate(X_test, y_test, verbose=0)
print('Testing loss:    ', score[0])
print('Testing accuracy:', score[1])
Testing loss:     0.343288713836448
Testing accuracy: 0.9786

Convolutional Neural Network

In [8]:
import tensorflow as tf
from keras.layers import Conv2D, MaxPooling2D
import keras
In [9]:
np.random.seed(1)
tf.set_random_seed(1)

model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same', input_shape=(28,28,1), activation='relu'))
model.add(MaxPooling2D((2,2)))

model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2)))

model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2)))

model.add(Flatten())

model.add(Dense(64, activation='relu'))
model.add(Dense(10, activation='softmax'))


opt = keras.optimizers.Adam(lr=0.001)
model.compile(loss='sparse_categorical_crossentropy', optimizer=opt, metrics=['accuracy'])

model.summary()

h = model.fit(Xs_train.reshape(-1,28,28,1), y_train,
              batch_size=256, epochs=20, verbose=2, 
              validation_data = (Xs_val.reshape(-1,28,28,1), y_val))
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_1 (Conv2D)            (None, 28, 28, 32)        320       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 32)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 12, 12, 64)        18496     
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 6, 6, 64)          0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 4, 4, 64)          36928     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 2, 2, 64)          0         
_________________________________________________________________
flatten_2 (Flatten)          (None, 256)               0         
_________________________________________________________________
dense_4 (Dense)              (None, 64)                16448     
_________________________________________________________________
dense_5 (Dense)              (None, 10)                650       
=================================================================
Total params: 72,842
Trainable params: 72,842
Non-trainable params: 0
_________________________________________________________________
Train on 60000 samples, validate on 5000 samples
Epoch 1/20
 - 8s - loss: 0.3771 - acc: 0.8974 - val_loss: 0.0811 - val_acc: 0.9750
Epoch 2/20
 - 2s - loss: 0.0718 - acc: 0.9783 - val_loss: 0.0494 - val_acc: 0.9862
Epoch 3/20
 - 2s - loss: 0.0492 - acc: 0.9854 - val_loss: 0.0496 - val_acc: 0.9840
Epoch 4/20
 - 2s - loss: 0.0380 - acc: 0.9882 - val_loss: 0.0326 - val_acc: 0.9888
Epoch 5/20
 - 2s - loss: 0.0313 - acc: 0.9902 - val_loss: 0.0306 - val_acc: 0.9912
Epoch 6/20
 - 2s - loss: 0.0256 - acc: 0.9922 - val_loss: 0.0275 - val_acc: 0.9912
Epoch 7/20
 - 2s - loss: 0.0215 - acc: 0.9934 - val_loss: 0.0301 - val_acc: 0.9906
Epoch 8/20
 - 2s - loss: 0.0187 - acc: 0.9938 - val_loss: 0.0251 - val_acc: 0.9922
Epoch 9/20
 - 2s - loss: 0.0146 - acc: 0.9953 - val_loss: 0.0280 - val_acc: 0.9910
Epoch 10/20
 - 2s - loss: 0.0146 - acc: 0.9952 - val_loss: 0.0279 - val_acc: 0.9924
Epoch 11/20
 - 2s - loss: 0.0122 - acc: 0.9963 - val_loss: 0.0302 - val_acc: 0.9908
Epoch 12/20
 - 2s - loss: 0.0108 - acc: 0.9965 - val_loss: 0.0332 - val_acc: 0.9904
Epoch 13/20
 - 2s - loss: 0.0101 - acc: 0.9967 - val_loss: 0.0216 - val_acc: 0.9936
Epoch 14/20
 - 2s - loss: 0.0084 - acc: 0.9975 - val_loss: 0.0311 - val_acc: 0.9910
Epoch 15/20
 - 2s - loss: 0.0085 - acc: 0.9972 - val_loss: 0.0236 - val_acc: 0.9926
Epoch 16/20
 - 2s - loss: 0.0067 - acc: 0.9978 - val_loss: 0.0307 - val_acc: 0.9912
Epoch 17/20
 - 2s - loss: 0.0063 - acc: 0.9979 - val_loss: 0.0388 - val_acc: 0.9880
Epoch 18/20
 - 2s - loss: 0.0072 - acc: 0.9975 - val_loss: 0.0292 - val_acc: 0.9922
Epoch 19/20
 - 2s - loss: 0.0057 - acc: 0.9981 - val_loss: 0.0325 - val_acc: 0.9930
Epoch 20/20
 - 2s - loss: 0.0076 - acc: 0.9975 - val_loss: 0.0310 - val_acc: 0.9918

Uncentered Images

In [10]:
np.random.seed(1)
Z_train = np.zeros((60000, 56, 56))
Z_val = np.zeros((5000, 56, 56))
Z_test = np.zeros((5000, 56, 56))

for i in range(60000):
  ho, vo = np.random.choice(range(28)), np.random.choice(range(28))
  Z_train[i, vo:(vo+28), ho:(ho+28)] = X_train[i]
  
for i in range(5000):
  ho, vo = np.random.choice(range(28)), np.random.choice(range(28))
  Z_val[i, vo:(vo+28), ho:(ho+28)] = X_val[i]
  
for i in range(5000):
  ho, vo = np.random.choice(range(28)), np.random.choice(range(28))
  Z_test[i, vo:(vo+28), ho:(ho+28)] = X_test[i]
In [11]:
n = np.random.choice(range(6000))
plt.imshow(Z_train[n,:,:], cmap=cm.binary)
plt.axis('off')
plt.show()
In [12]:
Zs_train = Z_train / 255
Zs_val = Z_val / 255
Zs_test = Z_test / 255

ANN

In [13]:
%%time

np.random.seed(1)
set_random_seed(1)

model = Sequential()
model.add(Flatten(input_shape=(56,56)))
model.add(Dense(512, activation='relu'))
model.add(Dense(256, activation='relu'))
model.add(Dense(10, activation='softmax'))

model.summary()

opt = keras.optimizers.Adam(lr = 0.001)
model.compile(loss='sparse_categorical_crossentropy', optimizer=opt, metrics=['accuracy'])

h = model.fit(Zs_train, y_train, batch_size=1024, epochs=20, validation_data=(Zs_val, y_val), verbose=2)
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
flatten_3 (Flatten)          (None, 3136)              0         
_________________________________________________________________
dense_6 (Dense)              (None, 512)               1606144   
_________________________________________________________________
dense_7 (Dense)              (None, 256)               131328    
_________________________________________________________________
dense_8 (Dense)              (None, 10)                2570      
=================================================================
Total params: 1,740,042
Trainable params: 1,740,042
Non-trainable params: 0
_________________________________________________________________
Train on 60000 samples, validate on 5000 samples
Epoch 1/20
 - 2s - loss: 1.6938 - acc: 0.4330 - val_loss: 1.0398 - val_acc: 0.6722
Epoch 2/20
 - 2s - loss: 0.7487 - acc: 0.7765 - val_loss: 0.6072 - val_acc: 0.8162
Epoch 3/20
 - 2s - loss: 0.4475 - acc: 0.8710 - val_loss: 0.5051 - val_acc: 0.8462
Epoch 4/20
 - 2s - loss: 0.3152 - acc: 0.9089 - val_loss: 0.4339 - val_acc: 0.8624
Epoch 5/20
 - 2s - loss: 0.2319 - acc: 0.9345 - val_loss: 0.4505 - val_acc: 0.8642
Epoch 6/20
 - 2s - loss: 0.1714 - acc: 0.9531 - val_loss: 0.4292 - val_acc: 0.8692
Epoch 7/20
 - 2s - loss: 0.1280 - acc: 0.9674 - val_loss: 0.4235 - val_acc: 0.8770
Epoch 8/20
 - 2s - loss: 0.0915 - acc: 0.9789 - val_loss: 0.4090 - val_acc: 0.8814
Epoch 9/20
 - 2s - loss: 0.0651 - acc: 0.9872 - val_loss: 0.4208 - val_acc: 0.8838
Epoch 10/20
 - 2s - loss: 0.0458 - acc: 0.9928 - val_loss: 0.4426 - val_acc: 0.8802
Epoch 11/20
 - 2s - loss: 0.0334 - acc: 0.9958 - val_loss: 0.4338 - val_acc: 0.8860
Epoch 12/20
 - 2s - loss: 0.0228 - acc: 0.9982 - val_loss: 0.4370 - val_acc: 0.8902
Epoch 13/20
 - 2s - loss: 0.0150 - acc: 0.9993 - val_loss: 0.4481 - val_acc: 0.8896
Epoch 14/20
 - 2s - loss: 0.0108 - acc: 0.9999 - val_loss: 0.4480 - val_acc: 0.8910
Epoch 15/20
 - 2s - loss: 0.0078 - acc: 0.9999 - val_loss: 0.4577 - val_acc: 0.8914
Epoch 16/20
 - 2s - loss: 0.0063 - acc: 0.9999 - val_loss: 0.4635 - val_acc: 0.8910
Epoch 17/20
 - 2s - loss: 0.0054 - acc: 0.9999 - val_loss: 0.4705 - val_acc: 0.8928
Epoch 18/20
 - 2s - loss: 0.0047 - acc: 0.9999 - val_loss: 0.4779 - val_acc: 0.8928
Epoch 19/20
 - 2s - loss: 0.0042 - acc: 0.9999 - val_loss: 0.4825 - val_acc: 0.8926
Epoch 20/20
 - 2s - loss: 0.0038 - acc: 0.9999 - val_loss: 0.4851 - val_acc: 0.8926
Wall time: 36.5 s
In [14]:
plt.rcParams["figure.figsize"] = [8,4]
plt.subplot(1,2,1)
plt.plot(h.history['loss'], label='Training')
plt.plot(h.history['val_loss'], label='Validation')
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend()

plt.subplot(1,2,2)
plt.plot(h.history['acc'], label='Training')
plt.plot(h.history['val_acc'], label='Validation')
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend()
plt.show()

CNN

In [15]:
np.random.seed(1)
tf.set_random_seed(1)

model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(56,56,1), activation='relu'))
model.add(MaxPooling2D((2,2)))

model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2)))

model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2,2)))

model.add(Flatten())

model.add(Dense(64, activation='relu'))
model.add(Dense(100, activation='softmax'))


opt = keras.optimizers.Adam(lr=0.001)
model.compile(loss='sparse_categorical_crossentropy', optimizer=opt, metrics=['accuracy'])

model.summary()

h = model.fit(Zs_train.reshape(-1,56,56,1), y_train,
              batch_size=256, epochs=20, verbose=2, 
              validation_data = (Zs_val.reshape(-1,56,56,1), y_val))
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_4 (Conv2D)            (None, 54, 54, 32)        320       
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 27, 27, 32)        0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 25, 25, 64)        18496     
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 12, 12, 64)        0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 10, 10, 64)        36928     
_________________________________________________________________
max_pooling2d_6 (MaxPooling2 (None, 5, 5, 64)          0         
_________________________________________________________________
flatten_4 (Flatten)          (None, 1600)              0         
_________________________________________________________________
dense_9 (Dense)              (None, 64)                102464    
_________________________________________________________________
dense_10 (Dense)             (None, 100)               6500      
=================================================================
Total params: 164,708
Trainable params: 164,708
Non-trainable params: 0
_________________________________________________________________
Train on 60000 samples, validate on 5000 samples
Epoch 1/20
 - 6s - loss: 1.6116 - acc: 0.4652 - val_loss: 0.5144 - val_acc: 0.8342
Epoch 2/20
 - 5s - loss: 0.3595 - acc: 0.8924 - val_loss: 0.2214 - val_acc: 0.9340
Epoch 3/20
 - 5s - loss: 0.2126 - acc: 0.9362 - val_loss: 0.1845 - val_acc: 0.9396
Epoch 4/20
 - 5s - loss: 0.1556 - acc: 0.9530 - val_loss: 0.1350 - val_acc: 0.9566
Epoch 5/20
 - 5s - loss: 0.1219 - acc: 0.9631 - val_loss: 0.1057 - val_acc: 0.9660
Epoch 6/20
 - 5s - loss: 0.0992 - acc: 0.9697 - val_loss: 0.0917 - val_acc: 0.9704
Epoch 7/20
 - 5s - loss: 0.0818 - acc: 0.9748 - val_loss: 0.1335 - val_acc: 0.9566
Epoch 8/20
 - 5s - loss: 0.0689 - acc: 0.9781 - val_loss: 0.0764 - val_acc: 0.9746
Epoch 9/20
 - 5s - loss: 0.0596 - acc: 0.9812 - val_loss: 0.0759 - val_acc: 0.9738
Epoch 10/20
 - 5s - loss: 0.0483 - acc: 0.9843 - val_loss: 0.0768 - val_acc: 0.9756
Epoch 11/20
 - 5s - loss: 0.0389 - acc: 0.9875 - val_loss: 0.0677 - val_acc: 0.9760
Epoch 12/20
 - 5s - loss: 0.0344 - acc: 0.9887 - val_loss: 0.0657 - val_acc: 0.9784
Epoch 13/20
 - 5s - loss: 0.0293 - acc: 0.9907 - val_loss: 0.0691 - val_acc: 0.9788
Epoch 14/20
 - 5s - loss: 0.0235 - acc: 0.9926 - val_loss: 0.0666 - val_acc: 0.9790
Epoch 15/20
 - 5s - loss: 0.0196 - acc: 0.9937 - val_loss: 0.0658 - val_acc: 0.9810
Epoch 16/20
 - 5s - loss: 0.0203 - acc: 0.9931 - val_loss: 0.0862 - val_acc: 0.9756
Epoch 17/20
 - 5s - loss: 0.0160 - acc: 0.9946 - val_loss: 0.0819 - val_acc: 0.9760
Epoch 18/20
 - 5s - loss: 0.0109 - acc: 0.9969 - val_loss: 0.0842 - val_acc: 0.9780
Epoch 19/20
 - 5s - loss: 0.0110 - acc: 0.9966 - val_loss: 0.0835 - val_acc: 0.9778
Epoch 20/20
 - 5s - loss: 0.0149 - acc: 0.9952 - val_loss: 0.0712 - val_acc: 0.9814
In [16]:
plt.rcParams["figure.figsize"] = [8,4]
plt.subplot(1,2,1)
plt.plot(h.history['loss'], label='Training')
plt.plot(h.history['val_loss'], label='Validation')
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend()

plt.subplot(1,2,2)
plt.plot(h.history['acc'], label='Training')
plt.plot(h.history['val_acc'], label='Validation')
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend()
plt.show()