roxxhub

CNN: Python implementation

A Convolutional Neural Network (CNN) is a class of artificial neural network used mainly for image and video problems, most commonly image or video classification. Before jumping into how to implement a CNN in Python, you might want to read about the basics of CNNs first.

In this tutorial we are going to use a framework called PyTorch. PyTorch is a deep learning framework for Python, and it is widely used in industry. So without further ado, let’s begin.

Collecting and preprocessing data

In our case, we’ll be using the popular CIFAR-10 dataset. It consists of 60000 32×32 color images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images. The 10 classes (image types) are airplane, automobile, bird, cat, deer, dog, frog, horse, ship and truck. First, to download the data to your computer, we’ll use the torchvision library.

import torchvision as tv
tv.datasets.CIFAR10(root='data/', download=True)

This way torchvision creates a ‘data’ folder in the folder you run your project from (the current working directory). You can store the dataset anywhere else on your device by passing that path as the root parameter.

I already have it on my device, so let’s see how to actually get started by loading the data.

import warnings
warnings.filterwarnings('ignore')
import torch as t
import torchvision as tv
import matplotlib.pyplot as mp

# data --------------------------------------------------------------------------------------------------------------
train_val_data = tv.datasets.CIFAR10(root='data/', transform=tv.transforms.ToTensor())
test_images = tv.datasets.CIFAR10(root='data/', train=False)
test = tv.datasets.CIFAR10(root='data/', train=False, transform=tv.transforms.ToTensor())
targets = ('airplane', 'automobile', 'bird',   'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
# data --------------------------------------------------------------------------------------------------------------

# data preprocessing

from torch.utils.data import DataLoader, random_split

train, val = random_split(train_val_data, [40000, 10000])
batch_size = 130                                                              # h1
train = DataLoader(train, batch_size=batch_size, shuffle=True)
val = DataLoader(val, batch_size=batch_size)
test_loader = DataLoader(test, batch_size=batch_size)

Alright, let’s see what happened here. First we imported all the libraries. Then we loaded the downloaded data in two parts: train_val_data (the training data) and test (the test data). test_images is the same as test but is not converted to tensors; we will use it to look at the image ourselves when testing our model. Then we split train_val_data into the train (40000 images) and val (10000 images) datasets. Finally, we made data loaders for train, val and test with a batch size of 130.
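To check what a loader actually yields, here is a minimal sketch of the same split-and-batch pattern on a small synthetic dataset. The names fake_data, fake_train, fake_val and fake_train_loader are made up for this illustration, and random tensors stand in for CIFAR-10 so nothing has to be downloaded.

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Synthetic stand-in for CIFAR-10 (an assumption for this sketch):
# 500 random "images" of shape 3 x 32 x 32 with integer labels in [0, 10).
fake_data = TensorDataset(torch.rand(500, 3, 32, 32),
                          torch.randint(0, 10, (500,)))

# Same pattern as in the tutorial, scaled down: 400 training, 100 validation samples.
fake_train, fake_val = random_split(fake_data, [400, 100])

batch_size = 130
fake_train_loader = DataLoader(fake_train, batch_size=batch_size, shuffle=True)

# Each iteration yields one batch of images and the matching labels.
images, labels = next(iter(fake_train_loader))
print(images.shape)  # torch.Size([130, 3, 32, 32])
print(labels.shape)  # torch.Size([130])

# The last batch is smaller when the dataset size is not a multiple of batch_size.
print(len(fake_train_loader))   # ceil(400 / 130) = 4 batches
print(math.ceil(40000 / 130))   # batches per epoch on the real training split: 308
```

With the real data, the first batch from train has the same shape, since every CIFAR-10 image is a 3 x 32 x 32 tensor after ToTensor().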

Designing the model

import torch.nn as nn
from torch.nn.functional import softmax, cross_entropy

class cnn(nn.Module):
    def __init__(self):
        super().__init__()
        self.seq = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),  # output: 64 x 16 x 16

            nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(128, 128, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),  # output: 128 x 8 x 8

            nn.Conv2d(128, 256, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(256, 256, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2, 2),  # output: 256 x 4 x 4

            nn.Flatten(),
            nn.Linear(256 * 4 * 4, 1024),
            nn.ReLU(),
            nn.Linear(1024, 512),
            nn.ReLU(),
            nn.Linear(512, 10)

        )

    def forward(self, image):
        return self.seq(image)

    def predict(self, image):
        x = self.forward(image)
        x = softmax(x, dim=1)
        _, preds = t.max(x, dim=1)
        return preds

    def accuracy(self, x, y):
        return t.sum(x == y).item() / len(x)


model = cnn()

loss_fn = cross_entropy
lr = 1e-3                                                                    # h2
opt = t.optim.SGD(model.parameters(), lr=lr)

nn.Module is a built-in class of PyTorch. What we are doing here is called inheritance: our cnn class subclasses nn.Module. It is a really helpful feature of classes in Python. The torch.nn module helps us create a neural network easily. Let’s dissect each layer of cnn here.
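Before looking at the layers, the inheritance pattern itself can be sketched with a much smaller network. TinyNet is a hypothetical class used only for illustration; it is not part of the tutorial's model.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):  # hypothetical example class
    def __init__(self):
        super().__init__()          # lets nn.Module register the layers below
        self.fc = nn.Linear(4, 2)   # one fully connected layer: 4 inputs, 2 outputs

    def forward(self, x):
        return self.fc(x)


net = TinyNet()

# Calling the module like a function runs forward() for us.
out = net(torch.rand(3, 4))
print(out.shape)  # torch.Size([3, 2])

# Because we inherit from nn.Module, the layer's weights are tracked automatically.
n_params = sum(p.numel() for p in net.parameters())
print(n_params)  # 4 * 2 weights + 2 biases = 10
```

This tracking is what makes model.parameters() work when we hand the weights to the optimizer.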

In nn.Conv2d we give the number of channels of our input image (3 for an RGB image), the number of channels we want as output, the kernel (filter) size and the padding. The first layer therefore convolves our 3-channel RGB image, padded by one pixel on each side, with 3×3 kernels.

Figure: CNN convolution
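A quick way to see what these parameters do is to pass a dummy image through the first layer and look at the shapes. This is a sketch on random input; conv_output_size is a helper written for this example, not a PyTorch function.

```python
import torch
import torch.nn as nn

# First layer of the model: 3 input channels, 32 output channels, 3x3 kernel, padding 1.
conv = nn.Conv2d(3, 32, kernel_size=3, padding=1)

batch = torch.rand(1, 3, 32, 32)   # one random 32x32 RGB "image"
out = conv(batch)
print(out.shape)                   # torch.Size([1, 32, 32, 32])

# One 3x3 kernel per (output channel, input channel) pair.
print(conv.weight.shape)           # torch.Size([32, 3, 3, 3])


def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Height or width after a convolution (helper for this example only)."""
    return (size + 2 * padding - kernel_size) // stride + 1


print(conv_output_size(32, kernel_size=3, padding=1))  # 32: padding keeps the size
print(conv_output_size(32, kernel_size=3))             # 30: without padding it shrinks
```

With a 3×3 kernel, stride 1 and padding 1, the height and width stay the same, so only the pooling layers shrink the image in this model.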

Then nn.ReLU applies the ReLU non-linear function to the output, and nn.MaxPool2d does the pooling: with a 2×2 window and a stride of 2 it halves the height and width.
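Both operations are easy to follow on a tiny hand-made input. The 4×4 values below are arbitrary numbers chosen for this sketch.

```python
import torch
import torch.nn as nn

# A single-channel 4x4 "image" with batch size 1 (shape 1 x 1 x 4 x 4).
x = torch.tensor([[[[-1.0,  2.0,  0.5, -3.0],
                    [ 4.0, -2.0,  1.0,  0.0],
                    [ 0.0,  1.0, -1.0,  2.0],
                    [ 3.0, -4.0,  5.0, -6.0]]]])

# ReLU replaces every negative value with 0 and keeps the rest.
activated = nn.ReLU()(x)

# Max pooling keeps the largest value in each non-overlapping 2x2 window.
pooled = nn.MaxPool2d(2, 2)(activated)

print(pooled.shape)  # torch.Size([1, 1, 2, 2])
print(pooled)        # values: [[4., 1.], [3., 5.]]
```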

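We repeat this combination several times, then flatten the result and pass it through three fully connected (nn.Linear) layers, the last of which has one output per class. As our cost function we’ll use cross entropy.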
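PyTorch's cross_entropy expects the raw outputs (logits) of the last layer and applies log-softmax internally, which is why the model has no softmax after its final nn.Linear. A small sketch on made-up logits shows that it equals the negative log-probability of the correct class.

```python
import torch
from torch.nn.functional import cross_entropy, log_softmax

# Made-up logits for one sample and three classes; class 0 is the true label.
logits = torch.tensor([[2.0, 0.5, -1.0]])
label = torch.tensor([0])

loss = cross_entropy(logits, label)

# The same value computed by hand: -log(softmax(logits)[true class]).
manual = -log_softmax(logits, dim=1)[0, label[0]]

print(torch.isclose(loss, manual))  # tensor(True)
```

The loss gets smaller as the logit of the correct class grows relative to the others.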

Training

x_axis = []
y_axis = []


def fit(num_epochs):
    for epoch in range(num_epochs):
        # training pass: one gradient step per batch
        for images, labels in train:
            x = model(images)          # forward pass (calls model.forward)
            loss = loss_fn(x, labels)
            loss.backward()            # compute the gradients
            opt.step()                 # update the weights
            opt.zero_grad()            # reset the gradients for the next batch

        # validation pass: average the per-batch accuracy, no gradients needed
        x_axis.append(epoch + 1)
        acc_per_batch = []
        with t.no_grad():
            for a, b in val:
                acc_ = model.accuracy(model.predict(a), b)
                acc_per_batch.append(acc_)
        val_acc = sum(acc_per_batch) / len(acc_per_batch)
        y_axis.append(val_acc)
        print(epoch + 1, '-----', val_acc)


fit(15)
t.save(model.state_dict(), 'cifar10.pth')

Here we define our training function and run it for 15 epochs, printing the validation accuracy after each one. After doing so, we save our model’s weights to cifar10.pth.
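Saving the state_dict stores only the weights, so to reuse them later you create the model again and load the weights into it. Here is a minimal sketch of that round trip. It uses a hypothetical tiny model (make_model) and a temporary file so it runs on its own; with the tutorial's code you would create cnn() and load 'cifar10.pth' instead.

```python
import os
import tempfile

import torch
import torch.nn as nn


def make_model():
    # Hypothetical tiny stand-in for the tutorial's cnn class.
    return nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))


trained = make_model()

# Save only the weights, as in the tutorial.
path = os.path.join(tempfile.mkdtemp(), 'tiny.pth')
torch.save(trained.state_dict(), path)

# Later: build the same architecture and load the saved weights into it.
restored = make_model()
restored.load_state_dict(torch.load(path))
restored.eval()

# Both models now give identical outputs for the same input.
sample = torch.rand(4, 3, 32, 32)
with torch.no_grad():
    same = torch.equal(trained(sample), restored(sample))
print(same)  # True
```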

Testing our CNN

Let’s see how to generate a prediction using our model, say on image no. 100 of the test dataset.

x = 100
print('prediction --', targets[model.predict(test[x][0].reshape(1, 3, 32, 32)).item()])
print('real --', targets[test[x][1]])

Well, this is what I got:

prediction -- deer
real -- deer

Let’s see the image too.

mp.imshow(test_images[x][0])
mp.show()

Figure: CIFAR-10 deer image
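A single correct prediction says little about the model as a whole, so it is worth measuring accuracy over a full loader. Below is a sketch of such a helper. The evaluate function, the stand-in model and the random data are assumptions made for this example so that it runs without the dataset; with the tutorial's objects the call would be evaluate(model, test_loader).

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate(model, loader):
    """Return the fraction of correctly classified samples in loader."""
    correct, total = 0, 0
    model.eval()
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images).argmax(dim=1)   # class with the highest score
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    return correct / total


# Hypothetical stand-in model and random data, only to make the sketch runnable.
stand_in = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
fake_test = TensorDataset(torch.rand(260, 3, 32, 32),
                          torch.randint(0, 10, (260,)))
fake_loader = DataLoader(fake_test, batch_size=130)

acc = evaluate(stand_in, fake_loader)
print(acc)  # a value between 0 and 1; an untrained model is near chance level
```

This helper counts correct samples over the whole loader instead of averaging per-batch accuracies, so a smaller last batch does not get extra weight.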
