preprocess-conll05's Introduction

preprocess-conll05

Scripts for preprocessing the CoNLL-2005 SRL dataset.

Requirements:

Basic CoNLL-2005 pre-processing

These pre-processing steps download the CoNLL-2005 data and gather gold part-of-speech and parse info from your copy of the PTB. The output will look like:

The         DT    (S(NP-SBJ-1(NP*  *    -   -      (A1*      
economy     NN    *                *    -   -      *      
's          POS   *)               *    -   -      *      
temperature NN    *)               *    -   -      *)     
will        MD    (VP*             *    -   -      (AM-MOD*)     
be          VB    (VP*             *    -   -      *      
taken       VBN   (VP*             *    01  take   (V*) 
  • Field 1: word form
  • Field 2: gold part-of-speech tag
  • Field 3: gold syntax
  • Field 4: placeholder
  • Field 5: verb sense
  • Field 6: predicate (infinitive form)
  • Field 7+: for each predicate, a column representing the labeled arguments of the predicate.
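
As an illustration of the field layout above, here is a minimal Python sketch that splits one token line into named fields. The function name `parse_basic_line` and the dictionary keys are hypothetical (they are not part of this repository), and the sketch assumes that fields are separated by whitespace and never contain spaces themselves.

```python
def parse_basic_line(line):
    """Split one token line of the basic output into named fields.

    Assumption: columns are whitespace-separated and contain no spaces.
    """
    cols = line.split()
    return {
        "word": cols[0],         # Field 1: word form
        "pos": cols[1],          # Field 2: gold part-of-speech tag
        "syntax": cols[2],       # Field 3: gold syntax
        "placeholder": cols[3],  # Field 4: placeholder
        "sense": cols[4],        # Field 5: verb sense
        "predicate": cols[5],    # Field 6: predicate (infinitive form)
        "args": cols[6:],        # Field 7+: one column per predicate
    }


row = parse_basic_line("taken       VBN   (VP*             *    01  take   (V*)")
print(row["word"], row["pos"], row["predicate"], row["args"])
```

Sentences with several predicates have several trailing columns, so `args` is kept as a list rather than a single value.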

First, set up paths to existing data:

export WSJ="/your/path/to/wsj/"
export BROWN="/your/path/to/brown"

Download CoNLL-2005 data and scripts:

./bin/basic/get_data.sh

Extract pos/parse info from gold data:

./bin/basic/extract_train_from_ptb.sh
./bin/basic/extract_dev_from_ptb.sh
./bin/basic/extract_test_from_ptb.sh
./bin/basic/extract_test_from_brown.sh

Format into combined output files:

./bin/basic/make-trainset.sh
./bin/basic/make-devset.sh 
./bin/basic/make-wsj-test.sh
./bin/basic/make-brown-test.sh 

Further pre-processing (e.g. for LISA)

Sometimes it's nice to convert constituencies to dependency parses and provide automatic part-of-speech tags, e.g. if you wish to train a parsing model. BIO format is also a more standard way of representing spans than the default CoNLL-2005 format. This pre-processing converts the constituency parses to Stanford dependencies (v3.5), assigns automatic part-of-speech tags from the Stanford left3words tagger, and converts SRL spans to BIO format. The output will look like:

conll05 0       0       The         DT      DT      2       det         _       -       -       -       -       O       B-A1
conll05 0       1       economy     NN      NN      4       poss        _       -       -       -       -       O       I-A1
conll05 0       2       's          POS     POS     2       possessive  _       -       -       -       -       O       I-A1
conll05 0       3       temperature NN      NN      7       nsubjpass   _       -       -       -       -       O       I-A1
conll05 0       4       will        MD      MD      7       aux         _       -       -       -       -       O       B-AM-MOD
conll05 0       5       be          VB      VB      7       auxpass     _       -       -       -       -       O       O
conll05 0       6       taken       VBN     VBN     0       root        _       01      take    -       -       O       B-V
  • Field 1: domain placeholder
  • Field 2: sentence id
  • Field 3: token id
  • Field 4: word form
  • Field 5: gold part-of-speech tag
  • Field 6: auto part-of-speech tag
  • Field 7: dependency parse head
  • Field 8: dependency parse label
  • Field 9: placeholder
  • Field 10: verb sense
  • Field 11: predicate (infinitive form)
  • Field 12: placeholder
  • Field 13: placeholder
  • Field 14: NER placeholder
  • Field 15+: for each predicate, a column representing the labeled arguments of the predicate.
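
The 15-column layout above can be read with a few lines of Python. This is only a sketch: `FIELDS` and `parse_combined_line` are hypothetical names chosen for illustration, and whitespace-separated columns are assumed.

```python
# Names for Fields 1-14 as listed above; Field 15+ holds the SRL columns.
FIELDS = [
    "domain", "sent_id", "token_id", "word", "gold_pos", "auto_pos",
    "head", "dep_label", "placeholder_9", "sense", "predicate",
    "placeholder_12", "placeholder_13", "ner",
]


def parse_combined_line(line):
    """Map one token line of the combined output to a dict.

    Assumption: columns are whitespace-separated and contain no spaces.
    """
    cols = line.split()
    row = dict(zip(FIELDS, cols))
    row["head"] = int(row["head"])    # 0 marks the root in the sample output
    row["srl"] = cols[len(FIELDS):]   # one BIO column per predicate
    return row


sample = ("conll05 0       6       taken       VBN     VBN     0       "
          "root        _       01      take    -       -       O       B-V")
row = parse_combined_line(sample)
print(row["word"], row["head"], row["dep_label"], row["srl"])
```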

First, set up paths to Stanford parser and part-of-speech tagger:

export STANFORD_PARSER="/your/path/to/stanford-parser-full-2017-06-09"
export STANFORD_POS="/your/path/to/stanford-postagger-full-2017-06-09"

The following script will then convert dependencies, tag, and reformat the data. This will create a new file in the $CONLL05 directory with the same name as the input and suffix .parse.sdeps.combined. If $CONLL05 is not set, you should set it to the conll05st-release directory.

./bin/preprocess_conll05_sdeps.sh $CONLL05/train-set.gz
./bin/preprocess_conll05_sdeps.sh $CONLL05/dev-set.gz
./bin/preprocess_conll05_sdeps.sh $CONLL05/test.wsj.gz
./bin/preprocess_conll05_sdeps.sh $CONLL05/test.brown.gz

Now all that remains is to convert fields to BIO format. The following script will create a new file in the same directory as the old file with the suffix .bio:

./bin/convert-bio.sh $CONLL05/train-set.gz.parse.sdeps.combined
./bin/convert-bio.sh $CONLL05/dev-set.gz.parse.sdeps.combined
./bin/convert-bio.sh $CONLL05/test.wsj.gz.parse.sdeps.combined
./bin/convert-bio.sh $CONLL05/test.brown.gz.parse.sdeps.combined
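
To make the conversion concrete, the following Python sketch turns one CoNLL-2005 bracketed argument column into BIO tags. It is not the code behind convert-bio.sh; the function name `brackets_to_bio` is hypothetical, and the sketch assumes that spans within a single predicate column do not nest.

```python
def brackets_to_bio(column):
    """Convert one bracketed argument column to BIO tags.

    Assumption: spans in a single predicate column are flat (not nested).
    """
    bio = []
    current = None  # label of the span that is currently open, if any
    for tag in column:
        if tag.startswith("("):
            # "(A1*" or "(V*)": the label sits between "(" and "*"
            current = tag[1:tag.index("*")]
            bio.append("B-" + current)
        elif current is not None:
            bio.append("I-" + current)
        else:
            bio.append("O")
        if tag.endswith(")"):
            current = None  # the span closes on this token
    return bio


# Argument column for "The economy 's temperature will be taken"
column = ["(A1*", "*", "*", "*)", "(AM-MOD*)", "*", "(V*)"]
print(brackets_to_bio(column))
```

Applied to the example sentence, this yields the same tags as the last column of the sample output shown earlier.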

You may also want to generate a matrix of transition probabilities for performing Viterbi inference at test time. You can use the following to do so:

python3 bin/compute_transition_probs.py --in_file_name $CONLL05/train-set.gz.parse.sdeps.combined.bio > $CONLL05/transition_probs.tsv
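
For intuition about what such a matrix contains, here is a minimal Python sketch that estimates tag-to-tag transition probabilities from BIO sequences by simple counting. It does not reproduce compute_transition_probs.py: the function name `transition_probs` is hypothetical, no smoothing is applied, and the output is a nested dict rather than a TSV matrix.

```python
from collections import Counter, defaultdict


def transition_probs(sequences):
    """Unsmoothed maximum-likelihood estimate of P(next tag | previous tag)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count every adjacent (previous, next) tag pair in the sequence.
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    probs = {}
    for prev, nxt_counts in counts.items():
        total = sum(nxt_counts.values())
        probs[prev] = {nxt: c / total for nxt, c in nxt_counts.items()}
    return probs


sequences = [
    ["B-A1", "I-A1", "I-A1", "O"],
    ["B-A1", "I-A1", "O"],
]
for prev, row in transition_probs(sequences).items():
    print(prev, row)
```

Transitions that never occur in the training data (for example `O` followed by `I-A1`) are simply absent from the result, which is the property Viterbi decoding can use to rule out invalid BIO sequences.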

Pre-processing for evaluation scripts

To evaluate using the CoNLL eval.pl and srl-eval.pl scripts, you'll need files in a different format to evaluate against. To generate files for parse evaluation (eval.pl), use the following script:

python3 bin/eval/extract_conll_parse_file.py --input_file $CONLL05/dev-set.gz.parse.sdeps.combined --id_field 2 --word_field 3 --pos_field 4 --head_field 6 --label_field 7 > $CONLL05/conll2005-dev-gold-parse.txt

For SRL evaluation, use the following:

python3 bin/eval/extract_conll_prop_file.py --input_file $CONLL05/dev-set.gz.parse.sdeps.combined --take_last --word_field 3 --pred_field 10 --first_prop_field 14 > $CONLL05/conll2005-dev-gold-props.txt
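
Judging from the field list for the combined format, the field arguments in the two commands above look like zero-based column indices (for example, `--word_field 3` would be Field 4, the word form). That reading is an inference from the README, not something the scripts confirm here. The Python sketch below, with a hypothetical helper `pick_fields`, shows which columns those indices select on one sample line.

```python
def pick_fields(line, indices):
    """Return the whitespace-separated columns at the given zero-based indices."""
    cols = line.split()
    return [cols[i] for i in indices]


sample = ("conll05 0       3       temperature NN      NN      7       "
          "nsubjpass   _       -       -       -       -       O       I-A1")

# id, word, POS, head, label: the indices passed for the parse file above
print(pick_fields(sample, [2, 3, 4, 6, 7]))
```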

preprocess-conll05's People

Contributors

galtay, strubell

preprocess-conll05's Issues

Minor Bugfixes / Suggestions for the README

Thanks for setting this up; it really makes preprocessing easier.

It might be useful to say that the WSJ path should point to the mrg part of the parsed corpus in the PTB, something like

WSJ=.../LDC99T42/treebank_3/parsed/mrg/wsj/

There are other options like parsed/prd/wsj, which do not work with the script provided.
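
A quick sanity check along these lines can catch a wrong path before running the extraction scripts. The Python sketch below uses a hypothetical helper, `has_mrg_files`, and only tests whether any file with a `.mrg` extension (in either case) exists under the given directory; it does not verify that the files are the ones the scripts expect.

```python
from pathlib import Path


def has_mrg_files(root):
    """Return True if any file under root has a .mrg extension (any case)."""
    return any(
        p.is_file() and p.suffix.lower() == ".mrg"
        for p in Path(root).rglob("*")
    )


# Example use (the path is a placeholder, as in the README):
# print(has_mrg_files("/your/path/to/wsj/"))
```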

The exports below, in ./bin/basic/get_data.sh, have no effect once the script terminates. They should be done outside the script, like the WSJ and BROWN exports, so that the other scripts still see these variables.

export SRLCONLL="`pwd`/srlconll-1.1"
export CONLL05="`pwd`/conll05st-release"
export PERL5LIB=$SRLCONLL/lib:$PERL5LIB
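
One way to make these variables visible to later steps is to set them in the process that launches the scripts. The Python sketch below builds such an environment; `srl_env` is a hypothetical helper, and the directory names are taken from the exports above, assuming both directories live under the same base directory.

```python
import os


def srl_env(base_dir, env=None):
    """Return a copy of the environment with the three variables set."""
    env = dict(os.environ if env is None else env)
    srlconll = os.path.join(base_dir, "srlconll-1.1")
    env["SRLCONLL"] = srlconll
    env["CONLL05"] = os.path.join(base_dir, "conll05st-release")
    lib = os.path.join(srlconll, "lib")
    old = env.get("PERL5LIB")
    # Prepend to an existing PERL5LIB, or set it if there was none.
    env["PERL5LIB"] = lib + os.pathsep + old if old else lib
    return env


env = srl_env(os.getcwd())
print(env["SRLCONLL"], env["CONLL05"], env["PERL5LIB"])

# A later step could then be launched with this environment, e.g.:
# import subprocess
# subprocess.run(["./bin/basic/make-trainset.sh"], env=env, check=True)
```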

Also, where should I get the Brown corpus from? I used the one that comes with NLTK, but it is missing some files (see the error below).

cat: .../corpora/brown/CK/CK01.MRG: No such file or directory
cat: .../corpora/brown/CK/CK02.MRG: No such file or directory
cat: .../corpora/brown/CK/CK03.MRG: No such file or directory

Thanks for the help!

got java.lang.IllegalArgumentException when generating parse.sdeps.combined file

Hi, I am following the steps in this repository. Everything goes fine except that, when generating the parse.sdeps.combined file, I ran into a Java exception thrown by the Stanford package. The parse.sdeps.combined file is generated, though.


Exception in thread "main" java.lang.IllegalArgumentException: No head rule defined for NP* using class edu.stanford.nlp.trees.SemanticHeadFinder in (NP*
.
(NNP Oc)
CD
19
*
NN
review
*)

at edu.stanford.nlp.trees.AbstractCollinsHeadFinder.determineNonTrivialHead(AbstractCollinsHeadFinder.java:246)
at edu.stanford.nlp.trees.SemanticHeadFinder.determineNonTrivialHead(SemanticHeadFinder.java:452)
at edu.stanford.nlp.trees.AbstractCollinsHeadFinder.determineHead(AbstractCollinsHeadFinder.java:193)
at edu.stanford.nlp.trees.TreeGraphNode.percolateHeads(TreeGraphNode.java:319)
at edu.stanford.nlp.trees.TreeGraphNode.percolateHeads(TreeGraphNode.java:317)
at edu.stanford.nlp.trees.TreeGraphNode.percolateHeads(TreeGraphNode.java:317)
at edu.stanford.nlp.trees.TreeGraphNode.percolateHeads(TreeGraphNode.java:317)
at edu.stanford.nlp.trees.TreeGraphNode.percolateHeads(TreeGraphNode.java:317)
at edu.stanford.nlp.trees.GrammaticalStructure.<init>(GrammaticalStructure.java:184)
at edu.stanford.nlp.trees.EnglishGrammaticalStructure.<init>(EnglishGrammaticalStructure.java:86)
at edu.stanford.nlp.parser.lexparser.EnglishTreebankParserParams.getGrammaticalStructure(EnglishTreebankParserParams.java:2355)
at edu.stanford.nlp.trees.GrammaticalStructureConversionUtils$TreeBankGrammaticalStructureWrapper$GsIterator.primeGs(GrammaticalStructureConversionUtils.java:410)
at edu.stanford.nlp.trees.GrammaticalStructureConversionUtils$TreeBankGrammaticalStructureWrapper$GsIterator.<init>(GrammaticalStructureConversionUtils.java:397)
at edu.stanford.nlp.trees.GrammaticalStructureConversionUtils$TreeBankGrammaticalStructureWrapper.iterator(GrammaticalStructureConversionUtils.java:373)
at edu.stanford.nlp.trees.GrammaticalStructureConversionUtils.convertTrees(GrammaticalStructureConversionUtils.java:739)
at edu.stanford.nlp.trees.EnglishGrammaticalStructure.main(EnglishGrammaticalStructure.java:2219)

POS tagging: /Users/work/Workspace/git/preprocess-conll05/conll05st-release/train-set.gz.parse.sdeps
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.


Above is the error message I got. I am wondering whether it's OK to ignore this message, since the parse.sdeps.combined file is generated. If not, how can I solve it? Thanks in advance for your help!

Where do test dirs props, null, and ne come from?

Hi!

I noticed in make-wsj-test.sh and make-brown-test.sh that we try to zcat a props, null, and ne file from test.wsj. However, in the extract_test_from_ptb.sh and extract_test_from_brown.sh scripts, none of these dirs/files are generated. Where are these supposed to come from?

Thanks!

Problems building Brown test set

The preprocessing steps seem to work for all of the WSJ data, but I'm running into some issues with the Brown test set. It might be a version issue with my Penn Treebank data and/or Stanford parser, but I'm curious whether anyone else has had the same issue. A specific example is available starting on line 743 of the file,

LDC99T42/treebank_3/parsed/mrg/brown/ck/ck02.mrg

( (SQ                                                                                                                                                                                                                                           
    (NP-SBJ (-NONE- *) )                                                                                                                                                                                                                        
    (VP (VB Remember)                                                                                                                                                                                                                           
      (SBAR                                                                                                                                                                                                                                     
        (WHNP-1 (WP what) )                                                                                                                                                                                                                     
        (S                                                                                                                                                                                                                                      
          (NP-SBJ (PRP I) )                                                                                                                                                                                                                     
          (VP (VBD said)                                                                                                                                                                                                                        
            (NP (-NONE- *T*-1) )                                                                                                                                                                                                                
            (PP (IN about)                                                                                                                                                                                                                      
              (S-NOM                                                                                                                                                                                                                            
                (NP-SBJ (-NONE- *) )                                                                                                                                                                                                            
                (VP (VBG going)                                                                                                                                                                                                                 
                  (ADVP-DIR (RP out) )                                                                                                                                                                                                          
                  (S-PRP                                                                                                                                                                                                                        
                    (NP-SBJ (-NONE- *) )                                                                                                                                                                                                        
                    (VP (TO to)                                                                                                                                                                                                                 
                      (VP (VB get)                                                                                                                                                                                                              
                        (NP                                                                                                                                                                                                                     
                          (NP (NN anybody) )                                                                                                                                                                                                    
                          (VP (VBN left)                                                                                                                                                                                                        
                            (ADVP (IN behind) ))))))))))))))                                                                                                                                                                                    
  (. ?) (. ?) )
( (S                                                                                                                                                                         
    (NP-SBJ (DT That) )                                                                                                                                                      
    (ADVP-TMP (RB still) )                                                                                                                                                   
    (VP (VBZ holds) )                                                                                                                                                        
    (. .) ))                                                                                                                                                                 
( (S                                                                                                                                                                         
    (NP-SBJ (PRP We) )                                                                                                                                                       
    (VP (VBP bring)                                                                                                                                                          
      (ADVP-DIR (RB back) )                                                                                                                                                  
      (NP                                                                                                                                                                    
        (NP (DT all) )                                                                                                                                                       
        (ADJP (JJ dead)                                                                                                                                                      
          (CC and)                                                                                                                                                           
          (VBN wounded) )))                                                                                                                                                  
    ('' '') (. .) ))

Note that in the first sentence (which ends in double question marks) the outermost parentheses enclose the entire sentence.

The syntax parse here (I replaced the awk statement with awk '!/^\*x\*/ {print}'),

produces

$CONLL05/test.brown/synt/test.brown.synt.gz

VB                         (SQ(VP*                                          
WP                   (SBAR(WHNP-1*)                                         
PRP                     (S(NP-SBJ*)                                         
VBD                           (VP*                                          
IN                            (PP*                                          
VBG                     (S-NOM(VP*                                          
RP                      (ADVP-DIR*)                                         
TO                      (S-PRP(VP*                                          
VB                            (VP*                                          
NN                         (NP(NP*)                                         
VBN                           (VP*                                          
IN                          (ADVP*))))))))))))))                            
.                                *                                          
.                                * 

DT                      (S(NP-SBJ*)                                                                                                                                          
RB                      (ADVP-TMP*)                                                                                                                                          
VBZ                           (VP*)                                                                                                                                          
.                                *)                                                                                                                                          
                                                                                                                                                                             
PRP                     (S(NP-SBJ*)                                                                                                                                          
VBP                           (VP*                                                                                                                                           
RB                      (ADVP-DIR*)                                                                                                                                          
DT                         (NP(NP*)                                                                                                                                          
JJ                          (ADJP*                                                                                                                                           
CC                               *                                                                                                                                           
VBN                              *)))                                                                                                                                        
''                               *                                                                                                                                           
.                                *)

Note that the elements representing the question marks are no longer contained within the parentheses. Next we run,

$CONLL05/test.brown.gz

Remember  VB                         (SQ(VP*                            *    -  remember    (V*)         *            *        *            *                   
what      WP                   (SBAR(WHNP-1*)                           *    -  -          (A1*     (R-A1*)           *        *            *                   
I         PRP                     (S(NP-SBJ*)                           *    -  -             *       (A0*)           *        *            *                   
said      VBD                           (VP*                            *    -  say           *        (V*)           *        *            *                   
about     IN                            (PP*                            *    -  -             *       (A3*            *        *            *                   
going     VBG                     (S-NOM(VP*                            *    -  go            *          *          (V*)       *            *                   
out       RP                      (ADVP-DIR*)                           *    -  -             *          *     (AM-DIR*)       *            *                   
to        TO                      (S-PRP(VP*                            *    -  -             *          *     (AM-PNC*        *            *                   
get       VB                            (VP*                            *    -  get           *          *            *      (V*)           *                   
anybody   NN                         (NP(NP*)                           *    -  -             *          *            *     (A1*         (A0*)                  
left      VBN                           (VP*                            *    -  leave         *          *            *        *          (V*)                  
behind    IN                          (ADVP*))))))))))))))              *    -  -             *)         *)           *)       *)    (AM-ADV*)                  
?         .                                *                            *    -  -             *          *            *        *            *                   
?         .                                *                            *    -  -             *          *            *        *            *

That   DT                      (S(NP-SBJ*)                           *    -  -          (A1*)                                                                                
still  RB                      (ADVP-TMP*)                           *    -  -      (AM-TMP*)                                                                                
holds  VBZ                           (VP*)                           *    -  hold        (V*)                                                                                
.      .                                *)                           *    -  -             *                                                                                 
                                                                                                                                                                             
We       PRP                     (S(NP-SBJ*)                           *    -  -           (A0*)                                                                             
bring    VBP                           (VP*                            *    -  bring        (V*)                                                                             
back     RB                      (ADVP-DIR*)                           *    -  -       (AM-DIR*)                                                                             
all      DT                         (NP(NP*)                           *    -  -           (A1*                                                                              
dead     JJ                          (ADJP*                            *    -  -              *                                                                              
and      CC                               *                            *    -  -              *                                                                              
wounded  VBN                              *)))                         *    -  -              *)                                                                             
''       ''                               *                            *    -  -              *                                                                              
.        .                                *)                           *    -  -              * 
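
The symptom described above (tokens left outside the outermost bracket) can be detected mechanically by tracking bracket depth down the syntax column. The Python sketch below uses a hypothetical helper, `count_top_level_trees`, and assumes the column uses only `(` and `)` for structure; a well-formed sentence should yield exactly one top-level piece.

```python
def count_top_level_trees(synt_column):
    """Count how many top-level pieces a syntax column splits into."""
    depth = 0
    pieces = 0
    for tag in synt_column:
        if depth == 0:
            pieces += 1  # this token starts a new top-level piece
        depth += tag.count("(") - tag.count(")")
    return pieces


# Shortened stand-ins for the two patterns seen in the output above
broken = ["(S(NP*)", "(VP*))", "*", "*"]  # trailing punctuation outside the tree
well_formed = ["(S(NP*)", "(VP*)", "*)"]  # punctuation inside the tree

print(count_top_level_trees(broken))
print(count_top_level_trees(well_formed))
```

A count greater than one means a downstream converter would see several separate trees for what should be a single sentence.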

When we continue and run the,

script with $CONLL05/test.brown.gz as input we get a series of outputs like this,

$CONLL05/test.brown.gz.parse

(from applying awk and sed commands to the input file $CONLL05/test.brown.gz)

(SQ(VP(VB Remember)                                                                                                                                                           
(SBAR(WHNP-1(WP what))                                                                                                                                                        
(S(NP-SBJ(PRP I))                                                                                                                                                             
(VP(VBD said)                                                                                                                                                                 
(PP(IN about)                                                                                                                                                                 
(S-NOM(VP(VBG going)                                                                                                                                                          
(ADVP-DIR(RP out))                                                                                                                                                            
(S-PRP(VP(TO to)                                                                                                                                                              
(VP(VB get)                                                                                                                                                                   
(NP(NP(NN anybody))                                                                                                                                                           
(VP(VBN left)                                                                                                                                                                 
(ADVP(IN behind)))))))))))))))                                                                                                                                                
(. ?)                                                                                                                                                                         
(. ?)

(S(NP-SBJ(DT That))                                                                                                                                                          
(ADVP-TMP(RB still))                                                                                                                                                         
(VP(VBZ holds))                                                                                                                                                              
(. .))                                                                                                                                                                       
                                                                                                                                                                             
(S(NP-SBJ(PRP We))                                                                                                                                                           
(VP(VBP bring)                                                                                                                                                               
(ADVP-DIR(RB back))                                                                                                                                                          
(NP(NP(DT all))                                                                                                                                                              
(ADJP(JJ dead)                                                                                                                                                               
(CC and)                                                                                                                                                                     
(VBN wounded))))                                                                                                                                                             
('' '')                                                                                                                                                                      
(. .))

$CONLL05/test.brown.gz.parse.sdeps

(from applying the Stanford parser to $CONLL05/test.brown.gz.parse)

1       Remember        _       VERB    VB      _       0       root    _       _                                                                               
2       what    _       PRON    WP      _       4       dobj    _       _                                                                                       
3       I       _       PRON    PRP     _       4       nsubj   _       _                                                                                       
4       said    _       VERB    VBD     _       1       ccomp   _       _                                                                                       
5       about   _       SCONJ   IN      _       4       prep    _       _                                                                                       
6       going   _       VERB    VBG     _       5       pcomp   _       _                                                                                       
7       out     _       ADP     RP      _       6       advmod  _       _                                                                                       
8       to      _       PART    TO      _       9       aux     _       _                                                                                       
9       get     _       VERB    VB      _       6       xcomp   _       _                                                                                       
10      anybody _       PRON    NN      _       9       dobj    _       _                                                                                       
11      left    _       VERB    VBN     _       10      vmod    _       _                                                                                       
12      behind  _       ADP     IN      _       11      advmod  _       _                                                                                       
                                                                                                                                                                
1       ?       _       PUNCT   .       _       0       root    _       _                                                                                       
                                                                                                                                                                
1       ?       _       PUNCT   .       _       0       root    _       _

1       That    _       PRON    DT      _       3       nsubj   _       _                                                                                                    
2       still   _       ADV     RB      _       3       advmod  _       _                                                                                                    
3       holds   _       VERB    VBZ     _       0       root    _       _                                                                                                    
4       .       _       PUNCT   .       _       3       punct   _       _                                                                                                    
                                                                                                                                                                             
1       We      _       PRON    PRP     _       2       nsubj   _       _                                                                                                    
2       bring   _       VERB    VBP     _       0       root    _       _                                                                                                    
3       back    _       ADV     RB      _       2       advmod  _       _                                                                                                    
4       all     _       DET     DT      _       2       dobj    _       _                                                                                                    
5       dead    _       ADJ     JJ      _       4       amod    _       _                                                                                                    
6       and     _       CONJ    CC      _       5       cc      _       _                                                                                                    
7       wounded _       VERB    VBN     _       5       conj    _       _                                                                                                    
8       ''      _       PUNCT   ''      _       2       punct   _       _                                                                                                    
9       .       _       PUNCT   .       _       2       punct   _       _ 

Note that each question mark has been placed in its own one-token sentence here.

$CONLL05/test.brown.gz.parse.sdeps.posonly

(from applying awk to $CONLL05/test.brown.gz.parse.sdeps)

Remember what I said about going out to get anybody left behind                                                                                                              
?                                                                                                                                                                            
?                                                                                                                                                                            
That still holds .                                                                                                                                                           
We bring back all dead and wounded '' .
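
The exact awk command is not shown in this post, so the following is only a rough Python sketch of the same step, assuming the word form sits in the second whitespace-separated column and that blank lines separate sentences. The function name `conll_to_sentences` is made up for illustration.

```python
def conll_to_sentences(lines, word_col=1):
    """Join the word column of each blank-line-delimited sentence into one line."""
    sentences, current = [], []
    for line in lines:
        fields = line.split()
        if not fields:
            # A blank (or whitespace-only) line closes the current sentence.
            if current:
                sentences.append(" ".join(current))
                current = []
        else:
            current.append(fields[word_col])
    if current:
        # Flush the last sentence if the input does not end with a blank line.
        sentences.append(" ".join(current))
    return sentences


# Toy input in the same column layout as the .sdeps excerpt above.
sample = [
    "1\tThat\t_\tPRON\tDT\t_\t3\tnsubj\t_\t_",
    "2\tstill\t_\tADV\tRB\t_\t3\tadvmod\t_\t_",
    "3\tholds\t_\tVERB\tVBZ\t_\t0\troot\t_\t_",
    "4\t.\t_\tPUNCT\t.\t_\t3\tpunct\t_\t_",
    "",
    "1\t?\t_\tPUNCT\t.\t_\t0\troot\t_\t_",
]

for sentence in conll_to_sentences(sample):
    print(sentence)
```

Every sentence in the input becomes one output line, so a one-token sentence such as a lone `?` still gets a line of its own.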

$CONLL05/test.brown.gz.parse.sdeps.pos

(from applying edu.stanford.nlp.tagger.maxent.MaxentTagger to $CONLL05/test.brown.gz.parse.sdeps.posonly)

Remember        VB                                                                                                                                                           
what    WP                                                                                                                                                                   
I       PRP                                                                                                                                                                  
said    VBD                                                                                                                                                                  
about   IN                                                                                                                                                                   
going   VBG                                                                                                                                                                  
out     RP                                                                                                                                                                   
to      TO                                                                                                                                                                   
get     VB                                                                                                                                                                   
anybody NN                                                                                                                                                                   
left    VBD                                                                                                                                                                  
behind  IN                                                                                                                                                                   
                                                                                                                                                                             
?       .                                                                                                                                                                    
                                                                                                                                                                             
?       .                                                                                                                                                                    
                                                                                                                                                                             
That    DT                                                                                                                                                                   
still   RB                                                                                                                                                                   
holds   VBZ                                                                                                                                                                  
.       .                                                                                                                                                                    
                                                                                                                                                                             
We      PRP                                                                                                                                                                  
bring   VBP                                                                                                                                                                  
back    RP                                                                                                                                                                   
all     DT                                                                                                                                                                   
dead    JJ                                                                                                                                                                   
and     CC                                                                                                                                                                   
wounded VBN                                                                                                                                                                  
''      ''                                                                                                                                                                   
.       .                                                                                                                                                                    

$CONLL05/test.brown.gz.parse.sdeps.combined

(from applying the paste command to:

  • f_converted = $CONLL05/test.brown.gz.parse.sdeps
  • f_pos = $CONLL05/test.brown.gz.parse.sdeps.pos

)

conll05 200     0       Remember        VB      VB      0       root    _       -       remember        -       -       *       (V*)    *       *       *       *            
conll05 200     1       what    WP      WP      4       dobj    _       -       -       -       -       *       (A1*    (R-A1*) *       *       *                            
conll05 200     2       I       PRP     PRP     4       nsubj   _       -       -       -       -       *       *       (A0*)   *       *       *                            
conll05 200     3       said    VBD     VBD     1       ccomp   _       -       say     -       -       *       *       (V*)    *       *       *                            
conll05 200     4       about   IN      IN      4       prep    _       -       -       -       -       *       *       (A3*    *       *       *                            
conll05 200     5       going   VBG     VBG     5       pcomp   _       -       go      -       -       *       *       *       (V*)    *       *                            
conll05 200     6       out     RP      RP      6       advmod  _       -       -       -       -       *       *       *       (AM-DIR*)       *       *                    
conll05 200     7       to      TO      TO      9       aux     _       -       -       -       -       *       *       *       (AM-PNC*        *       *                    
conll05 200     8       get     VB      VB      6       xcomp   _       -       get     -       -       *       *       *       *       (V*)    *                            
conll05 200     9       anybody NN      NN      9       dobj    _       -       -       -       -       *       *       *       *       (A1*    (A0*)                        
conll05 200     10      left    VBN     VBD     10      vmod    _       -       leave   -       -       *       *       *       *       *       (V*)                         
conll05 200     11      behind  IN      IN      11      advmod  _       -       -       -       -       *       *)      *)      *)      *)      (AM-ADV*)                    
conll05 200     12      ?                               -       -       -       -       *       *       *       *       *       *                                            
conll05 200     13      ?       .       .       0       root    _       -       -       -       -       *       *       *       *       *       *                            
                                                                                                                                                                             
conll05 201     0       That    .       .       0       root    _       -       -       -       -       *       (A1*)                                                        
conll05 201     1       still                           -       -       -       -       *       (AM-TMP*)                                                                    
conll05 201     2       holds   DT      DT      3       nsubj   _       -       hold    -       -       *       (V*)                                                         
conll05 201     3       .       RB      RB      3       advmod  _       -       -       -       -       *       *                                                            
        VBZ     VBZ     0       root    _                                                                                                                                    
conll05 202     0       We      .       .       3       punct   _       -       -       -       -       *       (A0*)                                                        
conll05 202     1       bring                           -       bring   -       -       *       (V*)                                                                         
conll05 202     2       back    PRP     PRP     2       nsubj   _       -       -       -       -       *       (AM-DIR*)                                                    
conll05 202     3       all     VBP     VBP     0       root    _       -       -       -       -       *       (A1*                                                         
conll05 202     4       dead    RB      RP      2       advmod  _       -       -       -       -       *       *                                                            
conll05 202     5       and     DT      DT      2       dobj    _       -       -       -       -       *       *                                                            
conll05 202     6       wounded JJ      JJ      4       amod    _       -       -       -       -       *       *)                                                           
conll05 202     7       ''      CC      CC      5       cc      _       -       -       -       -       *       *                                                            
conll05 202     8       .       VBN     VBN     5       conj    _       -       -       -       -       *       *                                                            
        ''      ''      2       punct   _

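That was a long read! The problem shows up in the two lines that contain only

VBZ     VBZ     0       root    _

and

''      ''      2       punct   _ 

In both places the POS and dependency columns have slipped out of step with the word column: in sentence 201, for example, "That" is paired with the tag "." and "holds" with "DT". The offset seems to begin at the two question marks, which are separate one-token sentences in the .sdeps file but tokens 12 and 13 of sentence 200 in the combined output. The same pattern keeps causing problems further down the file. Has anyone else run into this and found a solution?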

@strubell ?
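
To narrow down where the two inputs stop lining up, one option is to compare the token count of each sentence in both files before pasting them. This is only a diagnostic sketch, not part of the repository's scripts. The helper names `sentence_lengths` and `first_mismatch` are made up, and it assumes both files use blank lines as sentence separators.

```python
def sentence_lengths(lines):
    """Return the token count of each blank-line-delimited sentence."""
    lengths, count = [], 0
    for line in lines:
        if line.strip():
            count += 1
        elif count:
            lengths.append(count)
            count = 0
    if count:
        lengths.append(count)
    return lengths


def first_mismatch(lines_a, lines_b):
    """Return (sentence_index, len_a, len_b) for the first differing sentence, else None."""
    a, b = sentence_lengths(lines_a), sentence_lengths(lines_b)
    for i, (len_a, len_b) in enumerate(zip(a, b)):
        if len_a != len_b:
            return i, len_a, len_b
    if len(a) != len(b):
        # One file simply has extra sentences at the end.
        i = min(len(a), len(b))
        return i, (a[i] if i < len(a) else 0), (b[i] if i < len(b) else 0)
    return None


# Toy data: file A keeps the two "?" inside the first sentence,
# file B splits each "?" into its own sentence.
file_a = ["left", "behind", "?", "?", "", "That", "still"]
file_b = ["left", "behind", "", "?", "", "?", "", "That", "still"]

print(first_mismatch(file_a, file_b))
```

With real data, the lines would come from the files themselves, for example `open(path).read().splitlines()`. The first reported index is the sentence where a line-by-line paste starts to drift.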
