geni's People

Contributors

gabriella439, gwern, kowey, lauhaide, lorenzoale

geni's Issues

geni accepts malformed rootfeat

The square brackets should either be optional, or mandatory but enforced. Right now they are silently accepted and ignored.

outsource to graphviz library

Note that Graphviz is licensed under the EPL (sigh!), so we would have to add an exception to our GPL licence to allow people to distribute the two as one program.

geniserver should be able to output morph features

Right now you can only do morphological realisation on the server side.

It would be nice if geniserver could somehow return feature structures so that clients can do morphological realisation themselves.

percolate features during morphological realisation

The built-in morphological realisation is naive: it unifies each pre-terminal node of the derived tree with the morphological lexicon independently of the other nodes.

This is not good, because it does not allow for mutually exclusive realisations:
he hold_s_ the apple vs you hold the apple

Right now, the workaround is to supply the necessary features via the input semantics (morphinfo file), but ideally you should be able to just make it work automatically.

simple indexing mechanism

This should reduce generation time somewhat. It occurred to me that there's actually a very simple way to index the generation chart: just use the root node's semantic index, its category, or even a tuple of the two. Items whose key is an atomic disjunction or a variable just get dumped into a catch-all variable slot that we always have to look up.

Note that substitution would have to be changed so that items with open substitution sites go back at the end of the agenda instead of on the chart.

Also note that substitution sites with disjunctive or variable indices would just have to look at all chart items.
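
The indexing idea above could be sketched roughly as follows. This is only an illustration, not GenI's actual chart: the `Key`, `Item`, `Chart`, `insertItem` and `candidates` names are hypothetical, and the key is assumed to be a single string (for example the root node's semantic index), with `Open` standing in for the "variable slot".

```haskell
import qualified Data.Map as Map

-- Hypothetical index key: either a known atomic value (e.g. the root node's
-- semantic index) or "open" for variables and atomic disjunctions.
data Key = Known String | Open
  deriving (Eq, Ord, Show)

-- Hypothetical chart item: just a name and its index key.
data Item = Item { itemName :: String, itemKey :: Key }
  deriving (Eq, Show)

type Chart = Map.Map Key [Item]

-- File an item under its key; open items all land in the same slot.
insertItem :: Item -> Chart -> Chart
insertItem it = Map.insertWith (++) (itemKey it) [it]

-- Items worth trying at a site with the given key.
-- A known key needs its own bucket plus the always-checked open slot;
-- an open key has to look at every chart item.
candidates :: Key -> Chart -> [Item]
candidates Open      chart = concat (Map.elems chart)
candidates k@(Known _) chart =
  Map.findWithDefault [] k chart ++ Map.findWithDefault [] Open chart
```

For a site with a known key, only two buckets are consulted instead of the whole chart, which is where any speed-up would come from.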

disjunctions of paraphrase selectors

Luciana Benotti asked:

We know that in Geni you can select tree properties in the input
semantics by using square brackets next to a literal, for example,
runs(e,j)[Active]. Is there a way of indicating a disjunction of the
tree properties? Such as runs(e,j)[Active|Passive] in order to obtain
the active and passive realizations of runs(e,j).
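
One possible reading of the requested behaviour, as a rough sketch: split the selector on `|` and keep any tree that carries at least one of the listed properties. The `[Active|Passive]` syntax is the one proposed in the question, not something GenI is known to support, and `splitOnChar` and `selects` are hypothetical helper names.

```haskell
-- Split a string on a separator character, e.g. "Active|Passive" on '|'.
splitOnChar :: Char -> String -> [String]
splitOnChar sep s =
  case break (== sep) s of
    (a, [])       -> [a]
    (a, _ : rest) -> a : splitOnChar sep rest

-- A tree is selected if it has at least one of the alternatives
-- named in the (hypothetical) disjunctive selector.
selects :: String -> [String] -> Bool
selects selector treeProps =
  any (`elem` treeProps) (splitOnChar '|' selector)
```

With this reading, `selects "Active|Passive"` would accept both the active and the passive trees for `runs(e,j)`, while a plain `selects "Active"` keeps the existing single-property behaviour.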

remove IAF code

The index accessibility filtering code is not being maintained.
It should just be removed to reduce cruft.

possible polarity filtering bug

{{{
dist/build/geni/geni -m examples/chatnoir/macros -l examples/chatnoir/lexicon -s examples/chatnoir/suite --verbose --testcase="le_mechant_chat_noir_chasser_le_souris" --opts='pol' --rootfeat='cat:p'
Loading test suite examples/chatnoir/suite... 4 entries
Loading trees examples/chatnoir/macros... 11 entries
Loading lexicon examples/chatnoir/lexicon... 15 entries
Loading test suite examples/chatnoir/suite... 4 entries
Lexical items selected:
noir
chat
chasser
le
le
mechant
souris

Trees anchored (family) :
noir:adj_post
chat:nC
chasser:vArity2:rel1vn0
chasser:vArity2:rel0vn1
chasser:vArity2:qu1vn0
chasser:vArity2:qu0vn1
chasser:vArity2:n0vn1
chasser:vArity2:vinfn1
le:Det
le:Det
mechant:adj_pre
souris:nC

geni: [polarities] No instances of cat in [].
}}}

support and require UTF-8 input/output

We don't have the resources to deal with multiple encodings, so let's just standardise on a Unicode-friendly one and hope for the best.

This may induce some pain though, because we have legacy resources that are in ISO-8859-1.

clicking 'load' in config gui has no effect

If you run GenI and just naively use the GUI to try and load a grammar or lexicon, nothing happens.

This was because we were using the old version of the state on reload.

The bigger problem is that my GUI code is a big pile of spaghetti :-(

unification bug

The code
{{{
main =
  do print $ head $ unify left right
     print $ head $ unify right left
  where
    left  = map (GVar . show) [1..3]
    right = drop 1 left ++ [GConst ["X"]]
}}}

The output
{{{
([?1,?1,X],fromList [("1",X),("2",X),("3",X)])
([?2,?3,X],fromList [("1",X),("2",X),("3",X)])
}}}

I think the output should be
{{{
([X,X,X],fromList [("1",X),("2",X),("3",X)])
([X,X,X],fromList [("1",X),("2",X),("3",X)])
}}}

The fix is probably simple: just apply the resulting substitution to the unified list (replace) after unify.
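
To illustrate the idea of substituting after unification, here is a minimal, self-contained sketch. It does not use GenI's own types: `Val`, `Subst`, `resolve`, `unifyOne` and `unifyList` are simplified stand-ins invented for this example (constants are single strings rather than lists, and failure is a `Maybe` rather than a list of results).

```haskell
import Control.Monad (foldM)
import qualified Data.Map as Map

-- Simplified stand-ins for GenI values: a variable or a constant.
data Val = Var String | Const String
  deriving (Eq, Show)

type Subst = Map.Map String Val

-- Follow variable bindings until reaching a constant or an unbound variable.
resolve :: Subst -> Val -> Val
resolve s v@(Var n) = maybe v (resolve s) (Map.lookup n s)
resolve _ c         = c

-- Unify two values under the bindings collected so far.
unifyOne :: Subst -> Val -> Val -> Maybe Subst
unifyOne s a b =
  case (resolve s a, resolve s b) of
    (Var x, Var y) | x == y -> Just s
    (Var x, t)              -> Just (Map.insert x t s)
    (t, Var y)              -> Just (Map.insert y t s)
    (Const x, Const y)
      | x == y              -> Just s
      | otherwise           -> Nothing

-- Unify two lists pairwise, then apply the final substitution to the
-- whole result, so bindings found late also reach earlier positions.
unifyList :: [Val] -> [Val] -> Maybe ([Val], Subst)
unifyList xs ys
  | length xs /= length ys = Nothing
  | otherwise = do
      s <- foldM (\acc (a, b) -> unifyOne acc a b) Map.empty (zip xs ys)
      let s' = Map.map (resolve s) s
      return (map (resolve s') xs, s')
```

The last two lines of `unifyList` are the point of the sketch: without the final `map (resolve s')`, positions unified early would still show variables that only got bound later, which is the symptom reported above.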

listing of items seems incorrect in debugger

I could have sworn I kept seeing something like this happen.

See geni -m dist/build/grammar/valuation-sem.geni -l dist/build/lexicon/lemmas.glex -s suite/verbs --testcase=t110 and skip 140 steps

JSON format where possible

The lexicon and test suite have no business using a custom ad-hoc format. Now that I know a perfectly acceptable lightweight standard language exists for this sort of thing (JSON), I should just use it to make life simpler for everybody.

Backward compatibility would be nice. I guess we'll have to support both, maybe even extend geniconvert?

The macros file probably stays unchanged. No real gain from JSON-ifying that.
Feature structures may be tricky. We'll have to think about this a bit.

bracketed output (new command line argument)

The bracketed output is a compromise between a full parse tree and a plain string.

Parse tree:
{{{
S(NP(somebody),VP(VP(V(saw),NP(something)),PP(somewhere)))
}}}

Bracketed output:
{{{
somebody ((saw something) somewhere)
}}}

Notice that we try very hard to avoid excess parentheses. The point is to make it easier for grammar hackers to understand, for example, why we get different instances of the same output. So we want to keep things as readable as possible.
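
A rough sketch of how the bracketing could work, assuming a simple rose tree in which a node without children is a word. The `Tree` type and the `bracketed` and `render` functions are made up for this illustration and are not GenI's derived-tree representation. The rule used here is one guess at "avoiding excess parentheses": unary nodes never add brackets, and neither does the outermost group.

```haskell
-- Hypothetical parse tree: a label plus children; a node with no
-- children is a word (so S(NP(somebody),...) maps onto this directly).
data Tree = Node String [Tree]

-- Render a tree as a bracketed string with no brackets at the top level.
bracketed :: Tree -> String
bracketed = render False

-- The Bool says whether we are inside some enclosing group.
render :: Bool -> Tree -> String
render _      (Node word []) = word             -- a word: print it as is
render nested (Node _ [t])   = render nested t  -- unary node: no brackets
render nested (Node _ ts)    = wrap (unwords (map (render True) ts))
  where
    wrap s = if nested then "(" ++ s ++ ")" else s

-- The parse tree from the example above.
example :: Tree
example =
  Node "S"
    [ Node "NP" [Node "somebody" []]
    , Node "VP"
        [ Node "VP" [Node "V" [Node "saw" []], Node "NP" [Node "something" []]]
        , Node "PP" [Node "somewhere" []]
        ]
    ]
```

Here `bracketed example` yields `somebody ((saw something) somewhere)`, matching the bracketed output shown above.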

command line arguments in MacOS X

Right now in Leopard, I have to run geni with open geni.app, which means no command line arguments, ARGH!

At least some sort of workaround would be good.

root feature in main window

And not just in configuration window...

My idea for how this should work is that it be another field in the input semantics area. I think this would take some refactoring, some kind of function that goes from config to the input semantics area and back.

unification environment

OK, it doesn't have to be a monad. But I want to have some sort of abstraction that guarantees that when I do unification on something, the results from previous unification will be automatically propagated to that thing. Seems like it should be fairly straightforward. You could just model this as a state monad for example, and have the unification function get/put the substitutions state.

What may be annoying is having to write a monad transformer and slip it into our current MT stack.

The goal is to have something that makes our code easier to write, and less error-prone, while also staying cheap (we shouldn't be doing any needless traversals).
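
As a rough sketch of the state-monad option mentioned above: the substitution lives in the state, unification reads and updates it, and values are resolved against it on demand rather than by eagerly traversing everything. None of this is GenI code. `Val`, `Subst`, `UnifyEnv`, `unifyM`, `current` and `demo` are hypothetical names, and the sketch uses `State` from the `transformers` package (a `StateT` would be the variant that slots into an existing transformer stack).

```haskell
import Control.Monad.Trans.State.Strict (State, evalState, get, gets, put)
import qualified Data.Map as Map

-- Simplified stand-ins for GenI values.
data Val = Var String | Const String
  deriving (Eq, Show)

type Subst    = Map.Map String Val
type UnifyEnv = State Subst

-- Follow bindings until a constant or an unbound variable is reached.
resolve :: Subst -> Val -> Val
resolve s v@(Var n) = maybe v (resolve s) (Map.lookup n s)
resolve _ c         = c

-- View a value through everything unified so far.
current :: Val -> UnifyEnv Val
current v = gets (\s -> resolve s v)

-- Unify two values; earlier results are taken into account automatically
-- because both sides are resolved against the shared state first.
unifyM :: Val -> Val -> UnifyEnv Bool
unifyM a b = do
  s <- get
  case (resolve s a, resolve s b) of
    (Var x, Var y) | x == y -> return True
    (Var x, t)              -> put (Map.insert x t s) >> return True
    (t, Var y)              -> put (Map.insert y t s) >> return True
    (Const x, Const y)      -> return (x == y)

-- Two unifications in sequence, then a look at three variables.
demo :: (Bool, [Val])
demo = evalState go Map.empty
  where
    go = do
      ok1 <- unifyM (Var "1") (Var "2")
      ok2 <- unifyM (Var "2") (Const "X")
      vs  <- mapM current [Var "1", Var "2", Var "3"]
      return (ok1 && ok2, vs)
```

In `demo`, `?1` ends up as `X` even though it was never unified with a constant directly, and `?3` stays unbound. Resolution only happens when `current` or `unifyM` is called, which is one way of keeping the abstraction cheap.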
