Multi Class Classification From Scratch (Part 1)
Date: 15/07/2022
Author: @kavindu404
In this mini blog series, I am implementing a multiclass classifier for MNIST digits from scratch. In this part, I will classify the digits using pixel similarity, and I will try to improve the performance in each later part. First, let's import FastAI.
from fastai.vision.all import *
The MNIST dataset can be downloaded and extracted using the untar_data() function. With FastAI, we can easily list the contents of the extracted directory.
path = untar_data(URLs.MNIST)
Path.BASE_PATH = path
path.ls()
Let's first get the training data into separate objects, one per digit. The ls() method returns an object of FastAI's L class. It has all the functionality of a Python list, and some more.
zeros = (path/'training'/'0').ls().sorted()
ones = (path/'training'/'1').ls().sorted()
twos = (path/'training'/'2').ls().sorted()
threes = (path/'training'/'3').ls().sorted()
fours = (path/'training'/'4').ls().sorted()
fives = (path/'training'/'5').ls().sorted()
sixes = (path/'training'/'6').ls().sorted()
sevens = (path/'training'/'7').ls().sorted()
eights = (path/'training'/'8').ls().sorted()
nines = (path/'training'/'9').ls().sorted()
zeros,ones,twos,threes,fours,fives,sixes,sevens,eights,nines
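Now that we have the data separated into per-digit objects, let's stack each of them into a single tensor. Each image is opened and converted to a tensor, the tensors are stacked along a new first dimension, and the pixel values are scaled from the 0-255 range to 0-1.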
stacked_zeros = torch.stack([tensor(Image.open(o)) for o in zeros]).float()/255
stacked_ones = torch.stack([tensor(Image.open(o)) for o in ones]).float()/255
stacked_twos = torch.stack([tensor(Image.open(o)) for o in twos]).float()/255
stacked_threes = torch.stack([tensor(Image.open(o)) for o in threes]).float()/255
stacked_fours = torch.stack([tensor(Image.open(o)) for o in fours]).float()/255
stacked_fives = torch.stack([tensor(Image.open(o)) for o in fives]).float()/255
stacked_sixes = torch.stack([tensor(Image.open(o)) for o in sixes]).float()/255
stacked_sevens = torch.stack([tensor(Image.open(o)) for o in sevens]).float()/255
stacked_eights = torch.stack([tensor(Image.open(o)) for o in eights]).float()/255
stacked_nines = torch.stack([tensor(Image.open(o)) for o in nines]).float()/255
stacked_zeros.shape, stacked_ones.shape, stacked_twos.shape, stacked_threes.shape, stacked_fours.shape, stacked_fives.shape, stacked_sixes.shape, stacked_sevens.shape, stacked_eights.shape, stacked_nines.shape,
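To make the stacking step concrete, here is a minimal sketch on synthetic data. The tiny 2x2 `imgs` are made-up stand-ins for the MNIST images (an assumption for illustration, not data from the dataset). The operations are the same `torch.stack(...).float()/255` pattern used above.

```python
import torch

# Three hypothetical 2x2 "images" with uint8 pixel values (stand-ins for MNIST digits).
imgs = [torch.tensor([[0, 255], [128, 64]], dtype=torch.uint8) for _ in range(3)]

# torch.stack adds a new leading dimension: (num_images, height, width).
# Converting to float and dividing by 255 maps pixel values into [0, 1].
stacked = torch.stack(imgs).float() / 255

print(stacked.shape)                               # torch.Size([3, 2, 2])
print(stacked.min().item(), stacked.max().item())  # 0.0 1.0
```

For the real data, the first dimension would be the number of images in that digit's folder and the last two would be the image height and width.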
In our first attempt, we will use pixel similarity. So, first, let's calculate the mean image for each digit by averaging over the first (image) dimension of each stacked tensor.
mean0 = stacked_zeros.mean(0)
mean1 = stacked_ones.mean(0)
mean2 = stacked_twos.mean(0)
mean3 = stacked_threes.mean(0)
mean4 = stacked_fours.mean(0)
mean5 = stacked_fives.mean(0)
mean6 = stacked_sixes.mean(0)
mean7 = stacked_sevens.mean(0)
mean8 = stacked_eights.mean(0)
mean9 = stacked_nines.mean(0)
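The ten near-identical lines above could also be written as a loop. The following is only a sketch under assumptions: the `stacks` dict and its random contents are hypothetical stand-ins for the real stacked tensors, used here so the snippet runs on its own.

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-in for the stacked training tensors:
# one (n_images, 28, 28) tensor per digit, keyed by the digit itself.
stacks = {d: torch.rand(5, 28, 28) for d in range(10)}

# mean(0) averages over the image axis, leaving one 28x28 "ideal" image per digit.
# Stacking the ten means gives a single (10, 28, 28) tensor indexed by digit.
means = torch.stack([stacks[d].mean(0) for d in range(10)])

print(means.shape)  # torch.Size([10, 28, 28])
```

Keeping the means in one tensor indexed by digit makes it possible to compare an input against all ten at once later on.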
The mean for each digit represents the 'ideal' version of that digit. Let's take a look at the 'ideal' 2 as a table of pixel values, followed by a sample 1 from the training set for comparison.
df1 = pd.DataFrame(mean2[0:29,0:23])
df1.style.set_properties(**{'font-size':'4.5pt'}).background_gradient('Greys')
im = stacked_ones[1]
show_image(im)
Now, let's collect the validation data (the 'testing' folder) and stack it in the same way.
valid_zeros = (path/'testing'/'0').ls().sorted()
valid_ones = (path/'testing'/'1').ls().sorted()
valid_twos = (path/'testing'/'2').ls().sorted()
valid_threes = (path/'testing'/'3').ls().sorted()
valid_fours = (path/'testing'/'4').ls().sorted()
valid_fives = (path/'testing'/'5').ls().sorted()
valid_sixes = (path/'testing'/'6').ls().sorted()
valid_sevens = (path/'testing'/'7').ls().sorted()
valid_eights = (path/'testing'/'8').ls().sorted()
valid_nines = (path/'testing'/'9').ls().sorted()
valid_stacked_zeros = torch.stack([tensor(Image.open(o)) for o in valid_zeros]).float()/255
valid_stacked_ones = torch.stack([tensor(Image.open(o)) for o in valid_ones]).float()/255
valid_stacked_twos = torch.stack([tensor(Image.open(o)) for o in valid_twos]).float()/255
valid_stacked_threes = torch.stack([tensor(Image.open(o)) for o in valid_threes]).float()/255
valid_stacked_fours = torch.stack([tensor(Image.open(o)) for o in valid_fours]).float()/255
valid_stacked_fives = torch.stack([tensor(Image.open(o)) for o in valid_fives]).float()/255
valid_stacked_sixes = torch.stack([tensor(Image.open(o)) for o in valid_sixes]).float()/255
valid_stacked_sevens = torch.stack([tensor(Image.open(o)) for o in valid_sevens]).float()/255
valid_stacked_eights = torch.stack([tensor(Image.open(o)) for o in valid_eights]).float()/255
valid_stacked_nines = torch.stack([tensor(Image.open(o)) for o in valid_nines]).float()/255
To measure pixel similarity, we compute the distance between an input image and each 'ideal' digit, and then choose the closest one. The distance() function returns the mean absolute difference between two images. The min_distance() function returns the index of the closest 'ideal' digit for a given input. The is_correct() function checks whether that prediction matches the true label.
def distance(x,y): return (x-y).abs().mean((-1,-2))
mean_vec = [mean0, mean1, mean2, mean3, mean4, mean5, mean6, mean7, mean8, mean9]
def min_distance(x):
    distances = [distance(x, o) for o in mean_vec]
    return distances.index(min(distances))
def is_correct(num, x): return num == min_distance(x)
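The same nearest-mean rule can also be sketched with broadcasting, so that one input (or a whole batch) is compared against all mean images in a single tensor operation. This is an illustrative alternative, not the code used in this post: `nearest_mean`, `nearest_mean_batch`, and the toy 2x2 tensors are hypothetical names and data.

```python
import torch

def nearest_mean(x, means):
    """Index of the mean image closest to x under mean absolute difference."""
    # x: (H, W); means: (C, H, W). Broadcasting subtracts x from every mean at once.
    dists = (means - x).abs().mean((-1, -2))  # shape (C,)
    return dists.argmin().item()

def nearest_mean_batch(xs, means):
    """Same rule for a batch: xs is (N, H, W), result is an (N,) tensor of indices."""
    # xs[:, None] is (N, 1, H, W); it broadcasts against means (C, H, W) to (N, C, H, W).
    dists = (xs[:, None] - means).abs().mean((-1, -2))  # shape (N, C)
    return dists.argmin(dim=1)

# Toy 2x2 "ideal" images for three made-up classes.
means = torch.tensor([[[0., 0.], [0., 0.]],
                      [[1., 1.], [1., 1.]],
                      [[0., 1.], [0., 1.]]])
x = torch.tensor([[0.1, 0.9], [0.0, 0.8]])

print(nearest_mean(x, means))  # 2
print(nearest_mean_batch(torch.stack([x, means[1]]), means).tolist())  # [2, 1]
```

On the real data, the batch form would avoid a Python loop over every validation image, at the cost of materialising an (N, 10, 28, 28) intermediate tensor.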
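Let's check with an input. Here we ask whether a validation image of a 1 is predicted as a 4, so this should return False unless that image happens to be misclassified as a 4.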
is_correct(4, valid_stacked_ones[140])
Now that we have checked that it works, let's calculate the accuracy of the model. Here, we compute the fraction of correct predictions for each class and then take the mean of those ten values.
acc_zeros = tensor([is_correct(0,o) for o in valid_stacked_zeros]).float().mean()
acc_ones = tensor([is_correct(1,o) for o in valid_stacked_ones]).float().mean()
acc_twos = tensor([is_correct(2,o) for o in valid_stacked_twos]).float().mean()
acc_threes = tensor([is_correct(3,o) for o in valid_stacked_threes]).float().mean()
acc_fours = tensor([is_correct(4,o) for o in valid_stacked_fours]).float().mean()
acc_fives = tensor([is_correct(5,o) for o in valid_stacked_fives]).float().mean()
acc_sixes = tensor([is_correct(6,o) for o in valid_stacked_sixes]).float().mean()
acc_sevens = tensor([is_correct(7,o) for o in valid_stacked_sevens]).float().mean()
acc_eights = tensor([is_correct(8,o) for o in valid_stacked_eights]).float().mean()
acc_nines = tensor([is_correct(9,o) for o in valid_stacked_nines]).float().mean()
acc = tensor([acc_zeros, acc_ones, acc_twos, acc_threes, acc_fours, acc_fives, acc_sixes, acc_sevens, acc_eights, acc_nines]).mean()
acc
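Note that `acc` as computed above is the mean of the ten per-class accuracies (often called a macro average). It equals the overall fraction of correct predictions only when every class has the same number of images. The small sketch below shows the difference using made-up counts; the numbers are hypothetical and are not MNIST results.

```python
# Hypothetical (correct, total) counts per class, chosen only to illustrate the arithmetic.
counts = {0: (90, 100), 1: (40, 50), 2: (10, 50)}

# Macro average: compute each class's accuracy, then average those values.
per_class = [correct / total for correct, total in counts.values()]
macro = sum(per_class) / len(per_class)

# Micro average: pool all predictions, then divide total correct by total images.
micro = sum(c for c, _ in counts.values()) / sum(n for _, n in counts.values())

print(round(macro, 4), round(micro, 4))  # 0.6333 0.7
```

The macro average weights every digit equally, while the pooled figure weights every image equally.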
With this approach, we get an accuracy of 66.1%. Given that we only considered pixel similarity, this is a reasonable baseline: random guessing among ten classes would give about 10%. In the next part, let's try to improve from here.