Comments (4)

ICLRandD commented on August 16, 2024

The following suggestions come courtesy of Pete Smith:

ENTITY TYPE: Counsel (Lawyer...?)
ENTITY DESCRIPTION: Detects mentions of legal representatives
LEGAL TOPIC: General
EXAMPLE: Rumpole, H

ENTITY TYPE: Command paper
ENTITY DESCRIPTION: Detects mentions of policy documents
LEGAL TOPIC: General
EXAMPLE: Students at the heart of the system Cm 8122

ENTITY TYPE: Book
ENTITY DESCRIPTION: Detects mentions of legal treatises / academic works
LEGAL TOPIC: General
EXAMPLE: Halsbury's Laws of England (5th edition) Volume 99 Taxation Law (2018)

ENTITY TYPE: Treaty International Organisations
ENTITY DESCRIPTION: Detects mentions of inter-state organisations
LEGAL TOPIC: General
EXAMPLE: United Nations

ENTITY TYPE: Private International Organisations
ENTITY DESCRIPTION: Detects mentions of international organisations of a private nature, but not businesses
LEGAL TOPIC: General
EXAMPLE: FIFA, IBA

ENTITY TYPE: Government department
ENTITY DESCRIPTION: Detects mentions of government departments
LEGAL TOPIC: General
EXAMPLE: Ministry of Justice

from blackstone.

DeNeutoy commented on August 16, 2024

Bit of advice:

It seems very unlikely to me that a model you train will be able to tell the difference between Private International Organisations and Treaty International Organisations.

Consider that the model has literally zero knowledge about the world and is essentially operating on features extracted from the text only. As an example, IBA is a Private International Org, the IMF is a Treaty International Org and IBM is neither. Generally speaking it is very difficult to distinguish these cases for a statistical model.
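To make the point concrete, here is a minimal sketch of a "word shape" feature, one of the surface cues a statistical tagger might extract from text. The `word_shape` helper is a hypothetical illustration written for this comment, not Blackstone's or spaCy's implementation. It shows that the three names above are indistinguishable on this feature alone.

```python
def word_shape(token: str) -> str:
    """Map each character to a coarse class (hypothetical helper for illustration)."""
    classes = []
    for ch in token:
        if ch.isupper():
            classes.append("X")
        elif ch.islower():
            classes.append("x")
        elif ch.isdigit():
            classes.append("d")
        else:
            classes.append(ch)  # keep punctuation as-is
    return "".join(classes)


# The three organisations from the comment above, with their intended categories.
examples = {
    "IBA": "Private International Organisation",
    "IMF": "Treaty International Organisation",
    "IBM": "neither",
}

shapes = {name: word_shape(name) for name in examples}
print(shapes)
```

All three names reduce to the same shape, so any signal separating them has to come from the surrounding context or from knowledge outside the text.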

Similar comments apply to the difference between Lawyer and Judge, although I can imagine that they are often referred to with different titles, so that distinction may be slightly more feasible.

In comparison, Book is a great example of a Named Entity which is likely to work well, because there are things common across mentions of books, such as references to pages, publishing dates, editions, and consistent title capitalisation. I don't know enough about what a Command Paper is to know whether it is mentioned in a way distinct from a book.
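As a rough sketch of why Book mentions carry consistent surface cues, the patterns below look for an edition, a volume, a parenthesised year, and a page reference. The pattern names and the `book_cues` helper are assumptions made up for illustration; a trained NER model would learn comparable cues from data rather than from hand-written rules.

```python
import re

# Hypothetical surface cues that often accompany book citations.
BOOK_CUES = {
    "edition": re.compile(r"\b\d+(?:st|nd|rd|th)\s+ed(?:ition|n)?\b", re.IGNORECASE),
    "volume": re.compile(r"\bvol(?:ume)?\.?\s+\d+\b", re.IGNORECASE),
    "year": re.compile(r"\((?:1[6-9]|20)\d{2}\)"),
    "page": re.compile(r"\bpp?\.\s*\d+"),
}


def book_cues(text: str) -> list[str]:
    """Return the sorted names of the cues found in the text."""
    return sorted(name for name, pattern in BOOK_CUES.items() if pattern.search(text))


book = "Halsbury's Laws of England (5th edition) Volume 99 Taxation Law (2018)"
department = "Ministry of Justice"

print(book_cues(book))
print(book_cues(department))
```

The book example from the first comment triggers several cues at once, while the government department triggers none, which is the kind of regularity that makes a label learnable.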

Government department also seems like a reasonable NER label, I think.

pommedeterresautee commented on August 16, 2024

At some point the model is able to memorize things, and even with zero world knowledge, seeing enough data points is often good enough. For instance, it can learn that the word "Treaty" denotes a text and the word "organization" denotes an organization, then learn how to use both cues together.
Moreover, a pretrained language model (spaCy has its own) is a way to bring in world knowledge.
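A toy sketch of the memorization idea: a lookup that records which label each mention received in training and replays the most frequent one. The label names (`TREATY_ORG`, `PRIVATE_ORG`), the training pairs, and the `train` / `predict` helpers are assumptions for illustration only; a real model generalizes beyond exact matches, which this sketch deliberately does not.

```python
from collections import Counter, defaultdict

# Hypothetical labelled mentions, using the organisations named in this thread.
TRAINING = [
    ("United Nations", "TREATY_ORG"),
    ("IMF", "TREATY_ORG"),
    ("FIFA", "PRIVATE_ORG"),
    ("IBA", "PRIVATE_ORG"),
]


def train(examples):
    """Count how often each (case-folded) mention was seen with each label."""
    counts = defaultdict(Counter)
    for mention, label in examples:
        counts[mention.lower()][label] += 1
    return counts


def predict(counts, mention):
    """Return the most frequent memorized label, or None for unseen mentions."""
    seen = counts.get(mention.lower())
    return seen.most_common(1)[0][0] if seen else None


model = train(TRAINING)
print(predict(model, "IMF"))
print(predict(model, "IBA"))
print(predict(model, "IBM"))
```

Seen mentions are recovered reliably, while an unseen one such as IBM gets no label, which is where context features or pretrained representations would have to help.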

akatie commented on August 16, 2024

I would be pleased to volunteer.
