ner-conll's Introduction

CoNLL

This project comes from my Master's thesis. It looks at four out-of-the-box tools for Python that are available completely free of charge. All four are NLP libraries that can perform Named Entity Recognition (NER).

The models were trained on the CoNLL-2003 dataset from the Conference on Computational Natural Language Learning. The four NLP techniques used in this project are increasingly popular:

  • Flair is a powerful open-source NLP (Natural Language Processing) library developed by Zalando Research.
  • spaCy is a free, open-source library for advanced NLP in Python. It is designed for production use in applications that have to process large volumes of text.
  • Conditional Random Fields (CRFs) are a class of discriminative models suited to prediction tasks in which contextual information informs the prediction. We train a CRF model for named entity recognition using sklearn-crfsuite.
  • BERT (Bidirectional Encoder Representations from Transformers) is a language representation model created by Google. It is designed to pre-train deep bidirectional representations from unlabeled text by conditioning on both left and right context in all layers.
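To make the idea of "contextual information" in the CRF item more concrete, here is a minimal sketch of per-token feature extraction. The function names `word2features` and `sent2features`, the choice of features, and the example sentence are illustrative assumptions, not the feature set used in the thesis.

```python
def word2features(sentence, i):
    """Build a feature dict for the token at position i.

    `sentence` is a list of (word, pos_tag) pairs. The feature set below
    is a hypothetical example, not the one used in this project.
    """
    word, pos = sentence[i]
    features = {
        "bias": 1.0,
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
        "word.isupper": word.isupper(),
        "word.isdigit": word.isdigit(),
        "suffix3": word[-3:],
        "pos": pos,
    }
    # Context from the previous token, or a beginning-of-sentence marker.
    if i > 0:
        prev_word, prev_pos = sentence[i - 1]
        features["-1:word.lower"] = prev_word.lower()
        features["-1:word.istitle"] = prev_word.istitle()
        features["-1:pos"] = prev_pos
    else:
        features["BOS"] = True
    # Context from the next token, or an end-of-sentence marker.
    if i < len(sentence) - 1:
        next_word, next_pos = sentence[i + 1]
        features["+1:word.lower"] = next_word.lower()
        features["+1:word.istitle"] = next_word.istitle()
        features["+1:pos"] = next_pos
    else:
        features["EOS"] = True
    return features


def sent2features(sentence):
    """Turn a whole sentence into a list of per-token feature dicts."""
    return [word2features(sentence, i) for i in range(len(sentence))]


sentence = [("EU", "NNP"), ("rejects", "VBZ"), ("German", "JJ"), ("call", "NN")]
X = sent2features(sentence)
```

Lists of feature dicts like `X`, one list per sentence, are the input format that `sklearn_crfsuite.CRF.fit(X_train, y_train)` expects, with `y_train` holding the matching lists of tag strings.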

The CoNLL-2003 shared task concerns language-independent named entity recognition.
The training and test data focus on four types of named entities: persons, locations, organizations, and miscellaneous entities that do not belong to the previous three groups.
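As an illustration of how the four entity types appear in the data, the sketch below reads a few lines in the CoNLL-2003 column layout (token, part-of-speech tag, chunk tag, NER tag) and groups tagged tokens into entities. It assumes the BIO (IOB2) variant of the tags with the labels `PER`, `LOC`, `ORG`, and `MISC`; the helper names `read_conll` and `extract_entities` are hypothetical and not part of this repository.

```python
SAMPLE = """\
-DOCSTART- -X- -X- O

EU NNP B-NP B-ORG
rejects VBZ B-VP O
German JJ B-NP B-MISC
call NN I-NP O
to TO B-VP O
boycott VB I-VP O
British JJ B-NP B-MISC
lamb NN I-NP O
. . O O

Peter NNP B-NP B-PER
Blackburn NNP I-NP I-PER
"""


def read_conll(text):
    """Split CoNLL-style text into sentences of (token, ner_tag) pairs."""
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        # Blank lines and document markers both end the current sentence.
        if not line or line.startswith("-DOCSTART-"):
            if current:
                sentences.append(current)
                current = []
            continue
        columns = line.split()
        # First column is the token, last column is the NER tag.
        current.append((columns[0], columns[-1]))
    if current:
        sentences.append(current)
    return sentences


def extract_entities(sentence):
    """Group tagged tokens into (entity_text, entity_type) pairs."""
    entities, words, label = [], [], None
    for token, tag in sentence:
        prefix, _, etype = tag.partition("-")
        # A B- tag, or an I- tag with a different type, opens a new entity.
        starts_new = prefix == "B" or (prefix == "I" and etype != label)
        if tag == "O" or starts_new:
            if words:
                entities.append((" ".join(words), label))
            words, label = [], None
        if tag != "O":
            words.append(token)
            label = etype
    if words:
        entities.append((" ".join(words), label))
    return entities


sentences = read_conll(SAMPLE)
entities = [extract_entities(s) for s in sentences]
```

Here `entities[0]` holds the organization and the two miscellaneous entities of the first sentence, and `entities[1]` holds the two-token person name of the second.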

ner-conll's People

Contributors

ch-rode
