

Dataset

For the NER experiment, I used the CoNLL 2003 English dataset. The CoNLL 2003 data includes 1,393 English and 909 German news articles. Entities are annotated as LOC (location), ORG (organisation), PER (person), and MISC (miscellaneous). Below is an example sentence, where each line consists of [word] [POS tag] [chunk tag] [NER tag]:

U.N. NNP I-NP I-ORG
official NN I-NP O
Ekeus NNP I-NP I-PER
heads VBZ I-VP O
for IN I-PP O
Baghdad NNP I-NP I-LOC
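The four-column format above can be read with a few lines of Python. This is a minimal sketch, not the notebook's actual preprocessing code; the names `Token` and `parse_conll_sentence` are hypothetical and introduced only for illustration.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    word: str
    pos: str
    chunk: str
    ner: str


def parse_conll_sentence(lines: List[str]) -> List[Token]:
    """Parse one sentence given as lines of 'word POS chunk NER'."""
    tokens = []
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # skip blank or malformed lines
        tokens.append(Token(*parts))
    return tokens


sample = [
    "U.N. NNP I-NP I-ORG",
    "official NN I-NP O",
    "Ekeus NNP I-NP I-PER",
    "heads VBZ I-VP O",
    "for IN I-PP O",
    "Baghdad NNP I-NP I-LOC",
]
sentence = parse_conll_sentence(sample)
# Keep only the tokens that carry an entity tag (anything other than "O").
entities = [(t.word, t.ner) for t in sentence if t.ner != "O"]
print(entities)
```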

Preprocessed Data Shapes:

  • X_train - (900, 204566)
  • X_val - (900, 46665)
  • X_test - (900, 51577)
  • Y_train - (10, 204566)
  • Y_val - (10, 46665)
  • Y_test - (10, 51577)

Each word is mapped to a pre-trained feature vector of size 300, so a window of 3 words has a feature size of 300 × 3 = 900.
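As a rough illustration of how a 900-dimensional window feature could be assembled, the sketch below concatenates the vectors of the previous, current, and next word. The zero padding at sentence boundaries and the function name `window_features` are assumptions; the README does not say how edges are handled.

```python
import numpy as np

EMBED_DIM = 300  # size of each pre-trained word vector
WINDOW = 3       # number of words per window


def window_features(embeddings: np.ndarray, window: int = WINDOW) -> np.ndarray:
    """Turn (num_words, EMBED_DIM) vectors into a (window * EMBED_DIM, num_words) matrix."""
    half = window // 2
    # Assumption: pad both ends of the sentence with zero vectors.
    pad = np.zeros((half, embeddings.shape[1]), dtype=embeddings.dtype)
    padded = np.vstack([pad, embeddings, pad])
    n = embeddings.shape[0]
    # Column i holds [left neighbour, word i, right neighbour] flattened.
    cols = [padded[i:i + window].reshape(-1) for i in range(n)]
    return np.stack(cols, axis=1)


rng = np.random.default_rng(0)
sentence_vectors = rng.normal(size=(6, EMBED_DIM))  # stand-in for real embeddings
X = window_features(sentence_vectors)
print(X.shape)
```

Each column of `X` is one window, which matches the (900, number of windows) layout of the preprocessed data.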

NETWORK DETAILS

We have experimented with different architectures. The common portion of all the networks is the following:

The input to the network is the pretrained features for each window. The input shape is (900, 204566), where 900 is the feature size of each window and 204566 is the total number of windows. The hidden layers vary between architectures and are described below. The final layer is of size 10, with one unit per NER tag. The output of this final layer is converted to per-tag probabilities, on which the loss is computed. Different losses, such as log likelihood (cross entropy) and max margin, are used.
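To make the last step concrete, here is a minimal sketch of turning final-layer scores into probabilities and computing a cross-entropy loss on them. It assumes a softmax over the 10 tags and one-hot labels stored column-wise; the README does not confirm which normalisation the notebooks use, so treat `softmax` and `cross_entropy` as illustrative helpers.

```python
import numpy as np


def softmax(z: np.ndarray) -> np.ndarray:
    """Column-wise softmax for scores of shape (num_tags, num_windows)."""
    z = z - z.max(axis=0, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)


def cross_entropy(probs: np.ndarray, y_onehot: np.ndarray, eps: float = 1e-12) -> float:
    """Mean negative log likelihood of the true tag over all windows."""
    return float(-np.mean(np.sum(y_onehot * np.log(probs + eps), axis=0)))


rng = np.random.default_rng(0)
scores = rng.normal(size=(10, 5))                    # 10 tags, 5 windows
labels = np.eye(10)[:, rng.integers(0, 10, size=5)]  # one-hot columns
probs = softmax(scores)
loss = cross_entropy(probs, labels)
print(probs.sum(axis=0), loss)
```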

Changing the architecture is easy. It is specified as a list of layers, as follows:

nn_architecture = [
    {"layer_size": 900, "activation": "none"},
    {"layer_size": 300, "activation": "relu"},
    {"layer_size": 100, "activation": "relu"},
    {"layer_size": 10, "activation": "sigmoid"}
]
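One way such a list could drive the network is sketched below: consecutive entries define the weight shapes, and each entry's activation is applied to that layer's output. This is an assumption about how the list is consumed, not the code from NER_NN.ipynb; `init_params`, `forward`, and the He-style initialisation are hypothetical choices.

```python
import numpy as np

nn_architecture = [
    {"layer_size": 900, "activation": "none"},
    {"layer_size": 300, "activation": "relu"},
    {"layer_size": 100, "activation": "relu"},
    {"layer_size": 10, "activation": "sigmoid"},
]

ACTIVATIONS = {
    "none": lambda z: z,
    "relu": lambda z: np.maximum(z, 0.0),
    "sigmoid": lambda z: 1.0 / (1.0 + np.exp(-z)),
    "tanh": np.tanh,
}


def init_params(architecture, seed=0):
    """Create one (W, b, activation) triple per pair of consecutive layers."""
    rng = np.random.default_rng(seed)
    params = []
    for prev, cur in zip(architecture[:-1], architecture[1:]):
        n_in, n_out = prev["layer_size"], cur["layer_size"]
        W = rng.normal(scale=np.sqrt(2.0 / n_in), size=(n_out, n_in))
        b = np.zeros((n_out, 1))
        params.append((W, b, cur["activation"]))
    return params


def forward(X, params):
    """X has shape (features, num_windows); returns (num_tags, num_windows)."""
    A = X
    for W, b, act in params:
        A = ACTIVATIONS[act](W @ A + b)
    return A


params = init_params(nn_architecture)
out = forward(np.zeros((900, 4)), params)
print([W.shape for W, _, _ in params], out.shape)
```

With this layout, adding or resizing a hidden layer only requires editing the list.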

Different activations (sigmoid, tanh, ReLU, and leaky ReLU) were tried for the hidden layers.

The dataset has a class imbalance issue:

There are various methods for tackling this issue:

  • Duplicating the infrequent classes: does not provide any new information to the model.
  • Downsampling the most frequent classes: discards a large amount of data.
  • Focal loss: an effective way of dealing with class imbalance. It puts more weight on hard or infrequent samples, so the model is pushed to focus on infrequent samples too.
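For reference, focal loss in its commonly used form scales the cross-entropy term of each sample by (1 - p_t)^gamma, where p_t is the probability assigned to the true class. The sketch below is illustrative only; the value gamma = 2.0 and the helper name `focal_loss` are assumptions, and this loss is not the approach used in the notebooks.

```python
import numpy as np


def focal_loss(probs: np.ndarray, y_onehot: np.ndarray,
               gamma: float = 2.0, eps: float = 1e-12) -> float:
    """Mean focal loss for probabilities of shape (num_classes, num_samples)."""
    p_t = np.sum(probs * y_onehot, axis=0)  # probability of the true class
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t + eps)))


y = np.eye(3)[:, [0, 1]]  # true classes are 0 and 1
easy = np.array([[0.90, 0.05], [0.05, 0.90], [0.05, 0.05]])  # confident, correct
hard = np.array([[0.40, 0.30], [0.30, 0.40], [0.30, 0.30]])  # uncertain

print(focal_loss(easy, y), focal_loss(hard, y))
```

Well-classified samples contribute very little, so the uncertain samples dominate the loss. Setting `gamma=0.0` reduces it to plain cross entropy.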

I used the Synthetic Minority Oversampling Technique (SMOTE) approach [1]. SMOTE first selects a minority class instance a at random and finds its k nearest minority class neighbors. A synthetic instance is then created by choosing one of the k nearest neighbors, b, at random and connecting a and b to form a line segment in the feature space. The synthetic instance is generated as a convex combination of the two chosen instances a and b.
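The interpolation step can be sketched in plain NumPy as follows. This is a simplified illustration of the idea described above, not the implementation used in NER_NN_balanced.ipynb; the function name `smote_sample` and its parameters are hypothetical, and samples are stored as rows here for readability.

```python
import numpy as np


def smote_sample(X_min: np.ndarray, k: int = 5, n_new: int = 1, seed: int = 0) -> np.ndarray:
    """Generate n_new synthetic rows from minority samples X_min of shape (n, features)."""
    rng = np.random.default_rng(seed)
    n = X_min.shape[0]
    k = min(k, n - 1)
    if k < 1:
        raise ValueError("need at least two minority samples to interpolate")
    synthetic = np.empty((n_new, X_min.shape[1]))
    for i in range(n_new):
        a_idx = rng.integers(n)
        a = X_min[a_idx]
        # Distances from a to every minority sample; exclude a itself.
        dists = np.linalg.norm(X_min - a, axis=1)
        dists[a_idx] = np.inf
        neighbours = np.argsort(dists)[:k]
        b = X_min[rng.choice(neighbours)]
        lam = rng.random()  # position along the segment from a to b
        synthetic[i] = a + lam * (b - a)
    return synthetic


X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
new_points = smote_sample(X_min, k=2, n_new=20)
print(new_points[:3])
```

Because every synthetic point lies on a segment between two real minority samples, a class with a single sample cannot be oversampled this way.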

This process is highly memory intensive. Furthermore, it requires a minimum number of samples of each class to be present for interpolation to succeed. Hence I divided the data into batches of 10000 windows and applied SMOTE to each batch.
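A minimal sketch of the batching step is shown below. It only splits the data and flags classes that are too small for interpolation; the oversampling call itself is left out. The helper names `iter_batches` and `classes_too_small`, the use of integer labels (for example from `np.argmax(Y, axis=0)`), and the threshold `k_neighbors=5` are all assumptions.

```python
from collections import Counter

import numpy as np


def iter_batches(X: np.ndarray, y: np.ndarray, batch_size: int = 10000):
    """Yield (X_batch, y_batch) for X of shape (features, N) and labels y of shape (N,)."""
    for start in range(0, X.shape[1], batch_size):
        stop = start + batch_size
        yield X[:, start:stop], y[start:stop]


def classes_too_small(y_batch: np.ndarray, k_neighbors: int = 5):
    """Return classes with too few samples to have k_neighbors other neighbours."""
    counts = Counter(y_batch.tolist())
    return sorted(c for c, n in counts.items() if n <= k_neighbors)


# Small stand-in data: 4 features instead of 900 to keep the example light.
X_demo = np.zeros((4, 25000))
y_demo = np.zeros(25000, dtype=int)
sizes = [xb.shape[1] for xb, _ in iter_batches(X_demo, y_demo)]
print(sizes)
print(classes_too_small(np.array([0] * 10 + [1] * 3)))
```

Batches that contain under-represented classes could then be merged with a neighbouring batch or handled separately before oversampling.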

Before applying SMOTE to a 10000 batch:

Counter({1: 843, 4: 41, 0: 23, 8: 22, 3: 19, 5: 18, 7: 16, 9: 12, 6: 5, 2: 1})

After applying SMOTE to a 10000 batch:

Counter({3: 7585, 1: 7585, 8: 7585, 0: 7585, 7: 7585, 4: 7585, 5: 7585, 9: 7585, 6: 7585, 2: 7585})

Notice that classes with very few samples, such as 6, 2, and 9, are balanced after applying SMOTE: every class ends up with the same number of samples.

Code File Details:

  • CORNLL.ipynb: Preprocesses the data and extracts features
  • NER_NN.ipynb: Neural Network Implementation
  • NER_NN_balanced.ipynb: Neural Network Implementation with SMOTE
  • Other .py files: Supporting code

References:

[1] Chawla, Nitesh V., et al. "SMOTE: Synthetic Minority Over-sampling Technique." Journal of Artificial Intelligence Research 16 (2002): 321-357.
