Comments (3)
Another issue: attachment of modifiers. Suppose a phrase like
"each portion of a building separated by walls"
In dt, I get these two options:
#1
AdjCN
  ( AdvCN ( UseN portion_N )
      ( PrepNP of_Prep
          ( DetCN ( DetQuant IndefArt NumSg ) ( UseN building_N ) )
      )
  )
  ( PassVAgent separate_V
      ( DetCN ( DetQuant IndefArt NumPl ) ( UseN wall_N ) )
  ): CN[2,3,4,5,6,8]
#LIN: "portion of a building separated by walls"
#2
AdvNP
  ( DetCN each_Det
      ( AdjCN ( UseN portion_N )
          ( PassVAgent separate_V
              ( DetCN ( DetQuant IndefArt NumPl ) ( UseN wall_N ) )
          )
      )
  )
  ( PrepNP of_Prep
      ( DetCN ( DetQuant IndefArt NumSg ) ( UseN building_N ) )
  ): NP[1,2,3,4,5,6,8]
#LIN: "portion separated by walls of a building"
However, dt doesn't contain the NP version of 1, which would just be DetCN each_Det applied to that tree. I wonder if some pruning step removes the NP version of 1, because it covers as many words as 2? (I tried to run the example without pruneDevTree, but the particular sentence is very long and the program was taking a long time. If you think that might be the reason, I can produce a shorter version of the sentence and try again.)
In any case, I can only imagine that the NP version of 1 is also constructed, but thrown away before it can be prioritised. And I would like to prioritise it, because the attachment matches the word order: both "of a building" and "separated by walls" attach to "portion", but in 1, "building" is attached first, just as it is closer to "portion" in the string.
from gf-ud.
I can solve the particular case with an #auxfun that says: every time a NOUN has both an acl and an nmod child, put nmod before acl. But this is not ideal for scalability.
With an explicit DISTANCE=-1* or similar, I could duplicate that rule to say that whatever is closer to the head in the original word order gets attached first in the tree. This is tedious, but finite: there is a finite number of relations, and a finite number of combinations that appear together in real-life texts.
Could one make a more fundamental change in the algorithm that wouldn't require explicit instructions about word order? For example, ranking higher those trees whose subtrees are attached according to distance in the original string. I don't know if this is feasible at all, or whether it would require too much rewriting. I can get by with auxfuns, just thinking aloud here.
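To make the ranking idea concrete, here is a minimal sketch in Python. It is not gf-ud code and does not reflect how ud2gf actually ranks trees; the function name `attachment_penalty` and the representation of a candidate as "the order in which dependents are attached to the head" are assumptions made only for illustration. The penalty counts pairs of dependents attached in the opposite order to their distance from the head, so a lower penalty means the tree follows the word order more closely.

```python
from typing import Dict, List


def attachment_penalty(head: int, attach_order: List[int]) -> int:
    """Count dependent pairs attached out of distance order.

    `attach_order` lists token positions of the head's dependents,
    innermost attachment first. A pair is penalised when the dependent
    attached earlier is farther from the head than one attached later.
    """
    dists = [abs(dep - head) for dep in attach_order]
    return sum(
        1
        for i in range(len(dists))
        for j in range(i + 1, len(dists))
        if dists[i] > dists[j]
    )


# Hypothetical encoding of the two candidates for head "portion" (token 2):
# tree #1 attaches nmod "building" (5) first, then acl "separated" (6);
# tree #2 attaches acl first, then nmod.
candidates: Dict[str, List[int]] = {"tree1": [5, 6], "tree2": [6, 5]}
head = 2

ranked = sorted(candidates, key=lambda name: attachment_penalty(head, candidates[name]))
print(ranked)
```

Under this scoring, tree #1 gets penalty 0 and tree #2 gets penalty 1, so #1 would be preferred without any relation-specific rule.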
Here's a CoNLL-U file to test with:
1 Each each DET DT _ 2 det _ _
2 portion portion NOUN NN Number=Sing 10 nsubj _ _
3 of of ADP IN _ 5 case _ _
4 a a DET DT Definite=Ind|PronType=Art 5 det _ _
5 building building NOUN NN Number=Sing 2 nmod _ _
6 separated separate VERB VBN Tense=Past|VerbForm=Part 2 acl _ _
7 by by ADP IN _ 8 case _ _
8 walls wall NOUN NNS Number=Plur 6 obl _ _
9 is be AUX VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 10 cop _ _
10 separate separate ADJ JJ Degree=Pos 0 root _ SpacesAfter=\n
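For anyone who wants to check the word-order distances in this sample, the following Python sketch groups each head's dependents and sorts them by distance from the head. It is an illustration only, not part of gf-ud; the helper name `dependents_by_distance` is made up, and the parser assumes whitespace-separated columns with plain integer token IDs (no multi-word tokens or empty nodes), which holds for this sample.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# The sample above; a raw string keeps "SpacesAfter=\n" literal.
SAMPLE = r"""
1 Each each DET DT _ 2 det _ _
2 portion portion NOUN NN Number=Sing 10 nsubj _ _
3 of of ADP IN _ 5 case _ _
4 a a DET DT Definite=Ind|PronType=Art 5 det _ _
5 building building NOUN NN Number=Sing 2 nmod _ _
6 separated separate VERB VBN Tense=Past|VerbForm=Part 2 acl _ _
7 by by ADP IN _ 8 case _ _
8 walls wall NOUN NNS Number=Plur 6 obl _ _
9 is be AUX VBZ Mood=Ind|Number=Sing|Person=3|Tense=Pres|VerbForm=Fin 10 cop _ _
10 separate separate ADJ JJ Degree=Pos 0 root _ SpacesAfter=\n
"""


def dependents_by_distance(conllu: str) -> Dict[int, List[Tuple[str, str]]]:
    """Map each head ID to its (deprel, form) dependents, nearest first."""
    children = defaultdict(list)
    for line in conllu.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split()
        idx, form, head, deprel = int(cols[0]), cols[1], int(cols[6]), cols[7]
        children[head].append((abs(idx - head), deprel, form))
    return {
        h: [(rel, form) for _, rel, form in sorted(deps)]
        for h, deps in children.items()
    }


order = dependents_by_distance(SAMPLE)
print(order[2])
```

For "portion" (token 2), the nmod dependent "building" sorts before the acl dependent "separated", which is the attachment order of tree #1.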