named-entity-recognition's Introduction

Named Entity Recognition

Named Entity Recognition (NER) is a common problem in Natural Language Processing (NLP). An entity is a part of the text that is of interest. NER extracts the relevant information from a large corpus and classifies those entities into predefined categories such as location, organization, person name and so on. Information about the labels:

  • geo = Geographical Entity

  • org = Organization

  • per = Person

  • gpe = Geopolitical Entity

  • tim = Time indicator

  • art = Artifact

  • eve = Event

  • nat = Natural Phenomenon

  1. Total Words Count = 1354149
  2. Target Data Column: Tag
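
The label abbreviations above can be kept in a small lookup table. The sketch below is illustrative only: the `describe_tag` helper is hypothetical, and its handling of `B-`/`I-` prefixes and of the `O` tag assumes the Tag column uses a BIO-style scheme, which the text above does not state.

```python
# Label abbreviations and descriptions, copied from the list above.
ENTITY_LABELS = {
    "geo": "Geographical Entity",
    "org": "Organization",
    "per": "Person",
    "gpe": "Geopolitical Entity",
    "tim": "Time indicator",
    "art": "Artifact",
    "eve": "Event",
    "nat": "Natural Phenomenon",
}


def describe_tag(tag):
    """Return a readable description for a tag such as 'geo' or 'B-geo'.

    Assumption: tags may carry a BIO prefix ('B-' or 'I-') and 'O' marks
    a token outside any entity. Unknown labels are reported as such.
    """
    if tag == "O":
        return "Outside any entity"
    prefix, sep, label = tag.partition("-")
    if sep and prefix in ("B", "I"):
        position = "beginning of" if prefix == "B" else "inside"
        return f"{position} {ENTITY_LABELS.get(label, 'unknown label')}"
    return ENTITY_LABELS.get(tag, "unknown label")


print(describe_tag("B-geo"))
print(describe_tag("per"))
```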
    

named-entity-recognition's People

Contributors

akshayc1

named-entity-recognition's Issues

Testing the model

Hi,
I have a question about testing the model. Could you please help with testing the sentence below?
Ex: Narendra Modi is the 15th prime minister of India from 26 May 2014 to 26 May 2019.

In which format should the above sentence be passed to model.predict? It would be really helpful if you could share a snippet for testing the above sentence. @Akshayc1
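
One possible shape for the input, sketched under assumptions: if the model is a CRF trained on per-token feature dictionaries (as with sklearn-crfsuite), prediction expects a list of sentences, each a list of feature dicts. The `tokenize` and `word2features` helpers below are hypothetical stand-ins; the real features must match whatever the model was trained on in the notebook, which is not shown here.

```python
import re


def tokenize(text):
    """Split text into word and punctuation tokens (a simple regex, an assumption)."""
    return re.findall(r"\w+|[^\w\s]", text)


def word2features(tokens, i):
    """Build a feature dict for token i using only the words themselves."""
    word = tokens[i]
    features = {
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
        "word.isdigit": word.isdigit(),
        "BOS": i == 0,                 # first token of the sentence
        "EOS": i == len(tokens) - 1,   # last token of the sentence
    }
    if i > 0:
        features["prev.lower"] = tokens[i - 1].lower()
    if i < len(tokens) - 1:
        features["next.lower"] = tokens[i + 1].lower()
    return features


sentence = ("Narendra Modi is the 15th prime minister of India "
            "from 26 May 2014 to 26 May 2019.")
tokens = tokenize(sentence)

# A batch containing one sentence: list of sentences -> list of feature dicts.
X_new = [[word2features(tokens, i) for i in range(len(tokens))]]

# With a fitted sklearn-crfsuite model named `crf` (hypothetical), the call
# would then be: labels = crf.predict(X_new)
print(tokens[:3], len(X_new[0]))
```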

sent2features on test_data

Hi,
I read your code in NER using CRF.ipynb and I have a question about using sent2features on test_data.
Is it appropriate to use sent2features on test_data? In sent2features it seems that you are giving the model data about test_data that we should not know (the POS tags of the previous and following words). Rather than being given the gold POS tags, I think the model should use its own predicted tags.
Although most POS-tagging CRF tutorials do it the same way, I would like to ask your opinion and whether it is possible to implement this.
Thank you in advance
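
One way to implement this suggestion, as a sketch: run a POS tagger over the raw test tokens and feed its output, not the gold tags, into feature extraction. Everything here is hypothetical: `toy_pos_tagger` is a placeholder for a real tagger (for example one trained on the training split), and `sent2features_from_pos` only mimics the idea of the notebook's `sent2features`, whose actual features are not shown here.

```python
def toy_pos_tagger(tokens):
    """Placeholder tagger: a real system would use a trained POS model."""
    tags = []
    for tok in tokens:
        if tok.isdigit():
            tags.append("CD")
        elif tok[:1].isupper():
            tags.append("NNP")
        else:
            tags.append("NN")
    return tags


def sent2features_from_pos(tokens, pos_tags):
    """Per-token features using the current, previous and next POS tags."""
    features = []
    for i, (word, pos) in enumerate(zip(tokens, pos_tags)):
        feats = {"word.lower": word.lower(), "pos": pos}
        if i > 0:
            feats["-1:pos"] = pos_tags[i - 1]
        if i < len(tokens) - 1:
            feats["+1:pos"] = pos_tags[i + 1]
        features.append(feats)
    return features


def featurize_test_sentence(tokens, pos_tagger):
    """Use predicted POS tags, so no gold annotation leaks into test features."""
    predicted_pos = pos_tagger(tokens)
    return sent2features_from_pos(tokens, predicted_pos)


test_tokens = ["India", "won", "in", "2011"]
test_features = featurize_test_sentence(test_tokens, toy_pos_tagger)
print(test_features[0])
```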

X, Y built from the wrong list, causing an error; solution suggested

In the file NER using Bidirectional LSTM - CRF .ipynb, see code blocks no. 31 and 32.

The for loop iterates over the list 'sentences', which splits the entries down to characters.
[Screenshot: Capture]

If we instead loop over the list 'sent' and then over 'sentences', we can access the word, POS and tag. Debugging steps:
[Screenshots: debug1, debug2]

Final solution:
[Screenshot: final_solution]
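
The effect described in this issue can be reproduced with a small sketch. The data below is hypothetical, and the notebook's actual cells 31 and 32 were attached as screenshots that are not reproduced here. The sketch assumes `sent` holds every sentence as a list of (word, pos, tag) tuples and that `sentences` holds only a single sentence.

```python
# Hypothetical data: `sent` holds every sentence; each sentence is a list of
# (word, pos, tag) tuples. `sentences` is assumed to be just one sentence.
sent = [
    [("India", "NNP", "B-geo"), ("won", "VBD", "O")],
    [("Modi", "NNP", "B-per"), ("spoke", "VBD", "O")],
]
sentences = sent[-1]

# Buggy pattern: the outer loop yields tuples, the inner loop yields strings,
# so w[0] is the first character of each string rather than a word.
buggy_X = [[w[0] for w in s] for s in sentences]

# Fixed pattern: loop over `sent` first, then unpack each sentence's tuples.
X = [[word for (word, pos, tag) in s] for s in sent]
Y = [[tag for (word, pos, tag) in s] for s in sent]

print(buggy_X)
print(X)
print(Y)
```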
