Giter Club home page Giter Club logo

sigver's Introduction

Learning representations for Offline Handwritten Signature Verification

This repository contains code to train CNNs for feature extraction for Offline Handwritten Signatures, and code to train writer-dependent classifiers [1]. It also contains code to train two countermeasures for Adversarial Examples, as described in [2] (the code to run the experiments from this second paper can be found in this repository).

[1] Hafemann, Luiz G., Robert Sabourin, and Luiz S. Oliveira. "Learning Features for Offline Handwritten Signature Verification using Deep Convolutional Neural Networks" http://dx.doi.org/10.1016/j.patcog.2017.05.012 (preprint)

[2] Hafemann, Luiz G., Robert Sabourin, and Luiz S. Oliveira. "Characterizing and evaluating adversarial examples for Offline Handwritten Signature Verification" https://doi.org/10.1109/TIFS.2019.2894031 (preprint)

This code is a re-implementation in Pytorch (original code for [1] was written in theano+lasagne: link).

Installation

This package requires python 3. Installation can be done with pip:

pip install git+https://github.com/luizgh/sigver.git  --process-dependency-links

You can also clone this repository and install it with pip install -e <path/to/repository> --process-dependency-links

Usage

Data preprocessing

The functions in this package expect training data to be provided in a single .npz file, with the following components:

  • x: Signature images (numpy array of size N x 1 x H x W)
  • y: The user that produced the signature (numpy array of size N )
  • yforg: Whether the signature is a forgery (1) or genuine (0) (numpy array of size N )

We provide functions to process some commonly used datasets in the script sigver.datasets.process_dataset. As an example, the following code pre-process the MCYT dataset with the procedure from [1] (remove background, center in canvas and resize to 170x242)

python -m sigver.preprocessing.process_dataset --dataset mcyt \
 --path MCYT-ORIGINAL/MCYToffline75original --save-path mcyt_170_242.npz

During training a random crop of size 150x220 is taken for each iteration. During test we use the center 150x220 crop.

Training a CNN for Writer-Independent feature learning

This repository implements the two loss functions defined in [1]: SigNet (learning from genuine signatures only) and SigNet-F (incorporating knowledge of forgeries). In the training script, the flag --users is use to define the users that are used for feature learning. In [1], GPDS users 300-881 were used (--users 300 881).

Training SigNet:

python -m sigver.featurelearning.train --model signet --dataset-path  <data.npz> --users [first last]\ 
--model signet --epochs 60 --logdir signet  

Training SigNet-F with lambda=0.95:

python -m sigver.featurelearning.train --model signet --dataset-path  <data.npz> --users [first last]\ 
--model signet --epochs 60 --forg --lamb 0.95 --logdir signet_f_lamb0.95  

For checking all command-line options, use python -m sigver.featurelearning.train --help. In particular, the option --visdom-port allows real-time monitoring using visdom (start the visdom server with python -m visdom.server -port <port>).

Training WD classifiers and evaluating the result

For training and testing the WD classifiers, use the sigver.wd.test script. Example:

python -m sigver.wd.test -m signet --model-path <path/to/trained_model> \
    --data-path <path/to/data> --save-path <path/to/save> \
    --exp-users 0 300 --dev-users 300 881 --gen-for-train 12

Where trained_model is a .pth file (trained with the script above, or pre-trained - see the section below). This script will split the dataset into train/test, train WD classifiers and evaluate then on the test set. This is performed for K random splits (default 10). The script saves a pickle file containing a list, where each element is the result of one random split. Each item contains a dictionary with:

  • 'all_metrics': a dictionary containing:
    • 'FRR': false rejection rate
    • 'FAR_random': false acceptance rate for random forgeries
    • 'FAR_skilled': false acceptance rate for skilled forgeries
    • 'mean_AUC': mean Area Under the Curve (average of AUC for each user)
    • 'EER': Equal Error Rate using a global threshold
    • 'EER_userthresholds': Equal Error Rate using user-specific thresholds
    • 'auc_list': the list of AUCs (one per user)
    • 'global_threshold': the optimum global threshold (used in EER)
  • 'predictions': a dictionary containing the predictions for all images on the test set:
    • 'genuinePreds': Predictions to genuine signatures
    • 'randomPreds': Predictions to random forgeries
    • 'skilledPreds': Predictions to skilled forgeries

The example above train WD classifiers for the exploitation set (users 0-300) using a development set (users 300-881), with 12 genuine signatures per user (this is the setup from [1] - refer to the paper for more details). For knowing all command-line options, run python -m sigver.wd.test -m signet.

Pre-trained models

Pre-trained models can be found here:

  • SigNet (link)
  • SigNet-F lambda 0.95 (link)

These models contains the weights for the feature extraction layers.

Important: These models were trained with pixels ranging from [0, 1]. Besides the pre-processing steps described above, you need to divide each pixel by 255 to be in the range. This can be done as follows: x = x.float().div(255). Note that Pytorch does this conversion automatically if you use torchvision.transforms.totensor, which is used during training.

Usage:

import torch
from sigver.featurelearning.models import SigNet

# Load the model
state_dict, classification_layer, forg_layer = torch.load('models/signet.pth')
base_model = SigNet().eval()
base_model.load_state_dict(state_dict)

# Extract features
with torch.no_grad(): # We don't need gradients. Inform torch so it doesn't compute them
    features = base_model(input)

See example.py for a complete example. For a jupyter notebook, see this interactive example.

Citation

If you use our code, please consider citing the following papers:

[1] Hafemann, Luiz G., Robert Sabourin, and Luiz S. Oliveira. "Learning Features for Offline Handwritten Signature Verification using Deep Convolutional Neural Networks" http://dx.doi.org/10.1016/j.patcog.2017.05.012 (preprint)

[2] Hafemann, Luiz G., Robert Sabourin, and Luiz S. Oliveira. "Characterizing and evaluating adversarial examples for Offline Handwritten Signature Verification" (preprint)

License

The source code is released under the BSD 3-clause license. Note that the trained models used the GPDS dataset for training, which is restricted for non-commercial use.

Please do not contact me requesting access to any particular dataset. These requests should be addressed directly to the groups that collected them.

sigver's People

Contributors

atinesh-s avatar gonultasbu avatar luizgh avatar

Stargazers

 avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.