
Comments (6)

johann-petrak commented on June 21, 2024

Could you please be more specific about what exactly breaks when and what your suggestion would be to change or add functionality?

from gateplugin-stringannotation.

ajdocherty commented on June 21, 2024

We cannot point the plugin to bundled resources because it expects a physical file, probably because of the binary lookup file that the plugin creates in order to run. For now we have taken the plugin out of our pipeline, since building the project with Maven and then running the service is problematic when a file has to be created on the file system. An option to load the gazetteer normally, without this physical file dependency, would be great.


johann-petrak commented on June 21, 2024

OK, so you mean some mechanism where the gazetteer list can be loaded from a JAR or some other URL?
The gazetteer builds a highly optimized trie data structure from the original lists, which takes a while. For this reason, the data structure is written to a .gazbin file as a cache. Without this cache, the optimization and compilation into the trie would have to be repeated every time the lists are loaded.
Could you give an example for how you would imagine specifying the gazetteer lists in a way that would be compatible with your deployment requirements?
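
To make the trade-off concrete, here is a minimal sketch of the build-or-load-from-cache pattern described above. It is not the plugin's actual code or its .gazbin format: the class `GazetteerCacheSketch`, its methods, and the use of plain Java serialization for the cache are all assumptions made for illustration. The list is read from an `InputStream`, so it could equally come from a file or from a resource inside a JAR; only the cache needs a writable path.

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;
import java.util.*;

// Hypothetical sketch; not the plugin's implementation.
public class GazetteerCacheSketch {

    /** Minimal trie node; Serializable so the whole trie can be written to a cache file. */
    public static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final Map<Character, Node> children = new HashMap<>();
        boolean terminal; // true if an entry ends at this node
    }

    /** Builds a trie from a list with one entry per line (UTF-8). */
    public static Node build(InputStream in) throws IOException {
        Node root = new Node();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = r.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) continue;
                Node n = root;
                for (char c : line.toCharArray()) {
                    n = n.children.computeIfAbsent(c, k -> new Node());
                }
                n.terminal = true;
            }
        }
        return root;
    }

    /** Returns the length of the longest entry starting at offset, or -1 if none matches. */
    public static int longestMatch(Node root, String text, int offset) {
        Node n = root;
        int best = -1;
        for (int i = offset; i < text.length(); i++) {
            n = n.children.get(text.charAt(i));
            if (n == null) break;
            if (n.terminal) best = i - offset + 1;
        }
        return best;
    }

    /** Reads the trie from the cache file if it exists; otherwise builds it and writes the cache. */
    public static Node loadOrBuild(Path cache, InputStream lists)
            throws IOException, ClassNotFoundException {
        if (Files.exists(cache)) {
            try (ObjectInputStream ois = new ObjectInputStream(Files.newInputStream(cache))) {
                return (Node) ois.readObject();
            }
        }
        Node root = build(lists);
        try (ObjectOutputStream oos = new ObjectOutputStream(Files.newOutputStream(cache))) {
            oos.writeObject(root);
        }
        return root;
    }
}
```

In a deployment without a writable file system, calling `build` directly on a stream obtained via `Class.getResourceAsStream` would skip the cache entirely, at the cost of rebuilding the trie on every start.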


ajdocherty commented on June 21, 2024

The default gazetteers suffice, so if the optimized trie cannot be built without this cache, please provide an option for a normal, unoptimized, GATE-style lookup where resources can be loaded from the JAR.

By the way, see https://github.com/npgall/concurrent-trees for an implementation of efficient in-memory tries that are thread safe, in case you want to explore an alternative to writing to a cache.


johann-petrak commented on June 21, 2024

The optimized trie created by the gazetteer PR is already thread-safe.
If you do not need the optimization (which increases both memory efficiency for huge gazetteer lists and lookup speed), then maybe the default gazetteer included in the standard GATE distribution is a better option?
There will eventually be some kind of support for loading resources from a JAR, but this will have to wait until after the first pre-release of the next GATE version.

For this issue, I need a concrete description of what you propose should be changed or implemented, so that developers can decide whether to take it on.


ajdocherty commented on June 21, 2024

No worries, we'll just use the default gazetteer. Thanks!

