Comments (10)

habeanf avatar habeanf commented on May 29, 2024

@kirillkh Unfortunately, I can't publish the licensing terms because they are unknown to me too. I was given the all-clear to put the BGU lexicon on GitHub by @rtsarfaty; you would have to contact her about the licensing terms, as this is a potentially thorny issue.

I will provide documentation for the format.

from yap.

kirillkh avatar kirillkh commented on May 29, 2024

I suppose these files are very hard to create if you had to bundle them under such terms as opposed to generating new ones from scratch?

matanox avatar matanox commented on May 29, 2024

@kirillkh of course they are hard to generate :-) Someone has curated over 500,000 words, each along with its morphological properties, or generated them with their correct morphological properties, while also accounting for a lot of homomorphisms. Especially if the lexicon was quality reviewed, it is not trivial to create an equivalent from scratch.

matanox avatar matanox commented on May 29, 2024

@habeanf I wonder what minor changes you made to the original BGU lexicon to better accommodate the tasks at hand... This might give some clues as to interesting properties of the original lexicon (which I believe is only available by request from the original authors). Would this be easy to comment on?

habeanf avatar habeanf commented on May 29, 2024

@matanster @kirillkh Earlier this week there was a meeting with Alon Itai, head of MILA and one of the curators of the original lexicon from which the BGU lexicon was generated. The issue of licensing for this file is under discussion. The problem is that funding for this resource, as well as other MILA resources, came from the Israeli Government's Ministry of Science (משרד המדע). At the time (circa 2003), the government required that any resources resulting from projects it funded would cost money for commercial entities but would be freely available for academic research.
These days there are discussions to "open up" the licensing such that they will be commercial-friendly, probably CC-BY-SA (like MIT/Apache). Honestly, if you use the resources I think no one will come looking for you, but I don't have the right to guarantee this.
Edited: Exact license names

habeanf avatar habeanf commented on May 29, 2024

@kirillkh If you can pay for licensing, you will have to reach out to Alon Itai at MILA. For a hefty sum, MILA will give you the right to use the lexicon for commercial purposes (like a parser). In any case you will not be granted the right to publish it with an open license.

matanox avatar matanox commented on May 29, 2024

I've looked at it again. It seems that the Hebrew treebank included in the repo may also carry license terms more restrictive than the library's own license.

Maybe I am not aware of some relaxation of the license terms that seem to apply to that treebank, or to the updated version of it embedded in this repo.

matanox avatar matanox commented on May 29, 2024

Apologies for waking up this dead thread.

habeanf avatar habeanf commented on May 29, 2024

@matanster I was instructed by my advisor at the time (@rtsarfaty) to publish the treebank and lexicon as part of the GitHub repository for the parser. If anyone wants to know the licensing particulars of the treebank and/or lexicon, they can reach out to her or MILA.

matanox avatar matanox commented on May 29, 2024

Of course!
