Avengers Facial Recognition

This project uses FaceNet, a state-of-the-art facial recognition model proposed by Google. FaceNet uses deep convolutional networks together with a triplet loss to achieve state-of-the-art accuracy.

In this project we used NN4 Small2 v1, an Inception model that takes 96x96 images as input. We used a pretrained model from OpenFacePytorch, which was trained on the OpenFace dataset. Transfer learning was then applied to train the classifier on the Avengers dataset.

We also used MTCNN (Multi-task Cascaded Convolutional Networks) from facenet-pytorch to crop and align the faces.

Installation

Docker

A Docker image for this project is available here: jovian19/pytorch

docker pull jovian19/pytorch

Run the Docker container using this command:

docker run --rm -it -v <your local dataset path>:/dataset jovian19/pytorch:latest bash

Requirements

  • pytorch==1.8.1
  • facenet-pytorch
  • numpy
  • scikit-learn
  • tqdm
  • pillow
  • matplotlib
  • torchvision
  • torchaudio
  • cudatoolkit=10.2

Visualize the Avenger Dataset

The dataset contains around 50 cropped face images of each Avenger. It can be downloaded from https://www.kaggle.com/rawatjitesh/avengers-face-recognition and contains the following classes:

  • chris_evans
  • chris_hemsworth
  • mark_ruffalo
  • robert_downey_jr
  • scarlett_johansson

Here is a subset of the dataset.

[Figure: sample images from the dataset]

Triplet Loss and Triplet Generator

Here we train the model so that it learns a face embedding $f(x)$ of an image $x$ such that the squared L2 distance between faces of the same identity is small and the distance between faces of different identities is large.

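This can be achieved with a triplet loss $L$ defined by

$$L = \sum_{i=1}^{m} \left[ \lVert f(x_i^a) - f(x_i^p) \rVert_2^2 - \lVert f(x_i^a) - f(x_i^n) \rVert_2^2 + \alpha \right]_+$$

where $[z]_+ = \max(z, 0)$, $m$ is the number of triplets, and $\alpha$ is the margin between positive and negative pairs. The implementation below averages over the batch instead of summing, which only rescales the loss.

This loss minimizes the distance between an anchor image $x^a$ and a positive image $x^p$, and maximizes the distance between the anchor image $x^a$ and a negative image $x^n$.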

The generate_triplets function generates these positive and negative images for the entire batch. The current implementation randomly chooses the positive and negative images from the current batch. This can easily be enhanced to select difficult triplets to make the model train better.

A difficult triplet can be generated by selecting the positive image with the largest distance from the anchor and, similarly, the negative image with the smallest distance from the anchor.

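import random

import torch
import torch.nn as nn


# Generate triplets: for every anchor in the batch, pick a random positive
# (same label, different index) and a random negative (different label).
def generate_triplets(images, labels):
    positive_images = []
    negative_images = []
    batch_size = len(labels)

    for i in range(batch_size):
        anchor_label = labels[i]

        positive_list = []
        negative_list = []

        for j in range(batch_size):
            if j != i:
                if labels[j] == anchor_label:
                    positive_list.append(j)
                else:
                    negative_list.append(j)

        # random.choice raises IndexError on an empty list, so each batch must
        # hold at least two images of the anchor's class and one of another class.
        positive_images.append(images[random.choice(positive_list)])
        negative_images.append(images[random.choice(negative_list)])

    positive_images = torch.stack(positive_images)
    negative_images = torch.stack(negative_images)

    return positive_images, negative_images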

class TripletLoss(nn.Module):
    def __init__(self, alpha=0.2):
        super().__init__()
        self.alpha = alpha  # margin between positive and negative distances

    def calc_euclidean(self, x1, x2):
        # Squared L2 distance per row; no square root is taken.
        return (x1 - x2).pow(2).sum(1)

    def forward(self, anchor, positive, negative):  # each (batch_size, emb_size)
        distance_positive = self.calc_euclidean(anchor, positive)
        distance_negative = self.calc_euclidean(anchor, negative)
        # Hinge: zero loss once the negative is at least alpha farther than the positive.
        losses = torch.relu(distance_positive - distance_negative + self.alpha)
        return losses.mean()
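
The difficult-triplet selection described earlier could be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: `generate_hard_triplets` and its `embeddings` argument are hypothetical names, the embeddings for the batch are assumed to be computed already, and every label is assumed to appear at least twice in the batch alongside at least one other label.

```python
import torch


def generate_hard_triplets(images, embeddings, labels):
    """Pick the farthest positive and the nearest negative for each anchor.

    Hypothetical helper: `embeddings` is a (batch_size, emb_size) tensor computed
    from `images` by the current model, `labels` is a (batch_size,) tensor.
    """
    with torch.no_grad():
        # Pairwise squared L2 distances, shape (batch_size, batch_size).
        diff = embeddings.unsqueeze(1) - embeddings.unsqueeze(0)
        dist = diff.pow(2).sum(dim=2)

        same = labels.unsqueeze(1) == labels.unsqueeze(0)
        not_self = ~torch.eye(len(labels), dtype=torch.bool, device=labels.device)

        # Hardest positive: same label, not the anchor itself, largest distance.
        pos_dist = dist.masked_fill(~(same & not_self), float("-inf"))
        pos_idx = pos_dist.argmax(dim=1)

        # Hardest negative: different label, smallest distance.
        neg_dist = dist.masked_fill(same, float("inf"))
        neg_idx = neg_dist.argmin(dim=1)

    return images[pos_idx], images[neg_idx]
```

Selecting only the hardest triplets can make early training unstable, so mixing them with randomly chosen triplets may be a safer starting point.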

Visualizing the Output

As we can see, the model is able to generate face embeddings for the dataset. If we used only the distance between these embeddings to predict the faces, we would get an accuracy close to 96.5%.

[Figure]
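
Predicting a face from embedding distances alone can be sketched as a nearest-neighbour lookup against a gallery of labelled embeddings. The function `predict_by_distance`, the gallery tensors, and the `threshold` value are assumptions made for illustration; the source does not say how the 96.5% figure was computed.

```python
import torch


def predict_by_distance(query, gallery, gallery_labels, threshold=None):
    """Return the label of the closest gallery embedding for each query.

    query: (n, emb_size), gallery: (m, emb_size), gallery_labels: (m,).
    If `threshold` is given, queries whose nearest squared L2 distance
    exceeds it are labelled -1 (unknown).
    """
    # Squared L2 distance between every query and every gallery embedding.
    dist = (query.unsqueeze(1) - gallery.unsqueeze(0)).pow(2).sum(dim=2)
    min_dist, nearest = dist.min(dim=1)
    predicted = gallery_labels[nearest]
    if threshold is not None:
        unknown = torch.full_like(predicted, -1)
        predicted = torch.where(min_dist <= threshold, predicted, unknown)
    return predicted, min_dist


# Toy 2D "embeddings" standing in for real face embeddings.
gallery = torch.tensor([[0.0, 0.0], [1.0, 0.0], [0.0, 5.0]])
gallery_labels = torch.tensor([0, 0, 1])
query = torch.tensor([[0.9, 0.1], [0.0, 4.0], [10.0, 10.0]])

predicted, min_dist = predict_by_distance(query, gallery, gallery_labels, threshold=2.0)
print(predicted.tolist())
```

The third query is far from every gallery entry, so it is reported as unknown rather than forced into the nearest class.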

The embedding space can be visualized in 2D using t-SNE. The figure below shows that the model generates face embeddings that are easily distinguishable for different faces.

[Figure: 2D t-SNE projection of the face embeddings]
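
A projection like this can be produced with scikit-learn's `TSNE`. The sketch below uses synthetic 128-dimensional vectors as a stand-in for the real face embeddings; the helper name `project_embeddings`, the embedding size, and the perplexity value are assumptions, not settings taken from the project.

```python
import numpy as np
from sklearn.manifold import TSNE


def project_embeddings(embeddings, perplexity=30.0, seed=0):
    """Reduce (n_samples, emb_size) embeddings to 2D points with t-SNE."""
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca", random_state=seed)
    return tsne.fit_transform(embeddings)


# Stand-in data: 5 synthetic clusters of 20 vectors each.
rng = np.random.default_rng(0)
centers = rng.normal(size=(5, 128))
embeddings = np.repeat(centers, 20, axis=0) + 0.05 * rng.normal(size=(100, 128))
labels = np.repeat(np.arange(5), 20)

points = project_embeddings(embeddings, perplexity=15.0)
print(points.shape)
```

The resulting points can be drawn with matplotlib, for example `plt.scatter(points[:, 0], points[:, 1], c=labels)`, so that each identity gets its own colour.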

Transfer Learning a new classifier

The model above only outputs a face embedding for an image. To create a classifier for the Avengers dataset, we add a new nn.Linear layer at the end; this layer takes the face embedding and predicts the class label.

Since we only need to train the final layer, we freeze the parameters for all layers except the final layer.

We also define an optimizer that takes only the final layer's parameters, and use CrossEntropyLoss as the loss function.
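
The setup described above might look roughly like this. It is a sketch under assumptions: `FaceClassifier` is a hypothetical wrapper, a small stand-in network replaces the pretrained NN4 Small2 v1 model so the example runs on its own, and the embedding size of 128, the Adam optimizer, and the learning rate are illustrative choices rather than values confirmed by the project.

```python
import torch
import torch.nn as nn


class FaceClassifier(nn.Module):
    """Frozen embedding network followed by a trainable linear layer."""

    def __init__(self, embedder, emb_size=128, num_classes=5):
        super().__init__()
        self.embedder = embedder
        self.fc = nn.Linear(emb_size, num_classes)
        # Freeze every parameter of the embedding network.
        for param in self.embedder.parameters():
            param.requires_grad = False

    def forward(self, x):
        return self.fc(self.embedder(x))


# Stand-in embedder (assumption); the project uses the pretrained model here.
embedder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 96 * 96, 128))
model = FaceClassifier(embedder)

# Only the final layer's parameters are handed to the optimizer.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One training step on a random batch of 96x96 RGB images.
images = torch.randn(8, 3, 96, 96)
labels = torch.randint(0, 5, (8,))
model.embedder.eval()  # keep frozen layers such as batch norm in inference mode
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```

Because the frozen parameters have `requires_grad` set to `False`, no gradients are computed for them, which keeps each training step cheap.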

Using the Classifier for Predictions

The classifier's output for six test images:

  • chris_evans with 98.71% probability
  • scarlett_johansson with 94.83% probability
  • chris_hemsworth with 99.09% probability
  • UNKNOWN FACE, but similar to mark_ruffalo with 49.56% probability
  • robert_downey_jr with 99.05% probability
  • mark_ruffalo with 95.88% probability
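
Output in this format could be produced by applying a softmax to the classifier logits and flagging low-confidence results as unknown. The sketch below is an assumption about how this might be done: `describe_prediction` is a hypothetical helper, and the 0.5 threshold is an illustrative value that the source does not state.

```python
import torch
import torch.nn.functional as F

CLASS_NAMES = [
    "chris_evans",
    "chris_hemsworth",
    "mark_ruffalo",
    "robert_downey_jr",
    "scarlett_johansson",
]


def describe_prediction(logits, class_names=CLASS_NAMES, threshold=0.5):
    """Turn the classifier logits for one image into a readable prediction."""
    probs = F.softmax(logits, dim=-1)
    prob, index = probs.max(dim=-1)
    name = class_names[index.item()]
    text = f"{name} with {prob.item() * 100:.2f}% probability"
    # Assumed rule: below the threshold the face is reported as unknown.
    if prob.item() < threshold:
        return f"UNKNOWN FACE, but similar to {text}"
    return text


confident = describe_prediction(torch.tensor([0.0, 0.0, 5.0, 0.0, 0.0]))
uncertain = describe_prediction(torch.tensor([0.0, 0.0, 1.0, 0.0, 0.0]))
print(confident)
print(uncertain)
```

A softmax over a closed set of classes always sums to one, so a threshold like this is only a rough guard against faces outside the training set.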

Contributors

  • jovian-dsouza
