syllog1sm / redshift
Transition-based statistical parser
Hi,
I'm trying to use redshift with non-English languages. Is there any way to regenerate the case file (index/english.case) and the vocab cluster file (index/bllip-clusters)?
Thanks.
Hi
I was training redshift on input containing some non-ASCII characters and ran into errors.
I worked around them by replacing the offending characters, but my goal is to train on Persian data, which will certainly trigger the same errors.
I have heard of solutions such as transliteration, but I know nothing about them.
Is that the best approach, or would you suggest something better?
thanks
Has anyone seen this compilation error before? I'm getting a handful of these with Cython 0.21.1.
redshift/parser.pyx:263:40: Cannot assign type 'void *(Pool, int, void *)' to 'init_funct_t'
Hi, I was trying to run the code, but it relies on some (model?) files. What does /tmp/stanford_beam8 mean?
Also, the script parser.py asks for a parser location directory. Which one should I specify? Apparently, the loading module looks for parser.cfg, but I couldn't find it in the distribution.
Could you improve the README? Thanks!
Running fab make test gives me a bunch of errors.
I am running on Ubuntu 14.04. How can I fix this?
Hi there. I'm trying to use your POS tagger and I'm getting the following error when I attempt to train on a very small sample (10 sentences) from the Penn Treebank WSJ dataset. Any thoughts as to what I'm doing wrong?
In [2]: from redshift.tagger import train
In [3]: train(open('wsj.10.txt', 'r').read(), 'redshift_model')
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-4-16d6fd520844> in <module>()
----> 1 train(open('wsj.10.txt', 'r').read(), 'redshift_model')
/Library/Python/2.7/site-packages/redshift/tagger.so in redshift.tagger.train (redshift/tagger.cpp:2391)()
/Library/Python/2.7/site-packages/redshift/tagger.so in redshift.tagger.Tagger.train_sent (redshift/tagger.cpp:4013)()
/Library/Python/2.7/site-packages/thinc/learner.so in thinc.learner.LinearModel.update (thinc/learner.cpp:2395)()
AssertionError:
I have been struggling to find freely available CoNLL training data for Redshift. I finally found that http://www.anc.org:8080/ANC2Go/ can export Treebank data in CoNLL format. However, the trainer fails with the following error:
Traceback (most recent call last):
File "./scripts/train.py", line 54, in
plac.call(main)
File "/home/3TOP/fscharf/virt_env/3top_dev/lib/python2.6/site-packages/plac_core.py", line 309, in call
cmd, result = parser_from(obj).consume(arglist)
File "/home/3TOP/fscharf/virt_env/3top_dev/lib/python2.6/site-packages/plac_core.py", line 195, in consume
return cmd, self.func(_(args + varargs + extraopts), *_kwargs)
File "./scripts/train.py", line 48, in main
train_data = redshift.io_parse.read_conll(train_str, unlabelled=unlabelled)
File "io_parse.pyx", line 129, in redshift.io_parse.read_conll (redshift/io_parse.cpp:2860)
ValueError: too many values to unpack (expected 4)
It looks like a format problem...
Also, is there a way to pass a folder as an argument to the trainer, so that all the files in it are used?
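The "too many values to unpack (expected 4)" error suggests the reader splits each token line into fewer fields than a standard 10-column CoNLL-X export provides. Here is a hedged pre-processing sketch; it assumes redshift's read_conll wants 4 tab-separated fields (id, word, pos, head), which is an assumption on my part — check io_parse.pyx for the real columns. It also includes a helper for the folder question:

```python
import glob
import os

# Assumption: redshift's read_conll unpacks 4 tab-separated fields per
# token, while a standard CoNLL-X export has 10 columns (ID, FORM,
# LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL) --
# hence "too many values to unpack (expected 4)". Which 4 columns
# redshift actually wants is a guess; verify against io_parse.pyx.

def to_four_fields(conll_x_line):
    """Reduce a 10-column CoNLL-X token line to 4 fields (id, word, pos, head)."""
    cols = conll_x_line.rstrip("\n").split("\t")
    return "\t".join([cols[0], cols[1], cols[3], cols[6]])

def read_conll_dir(dirname, pattern="*.conll"):
    """Concatenate every matching file in a folder into one training
    string, so a whole directory can be fed to a trainer that takes a string."""
    parts = []
    for path in sorted(glob.glob(os.path.join(dirname, pattern))):
        with open(path) as f:
            parts.append(f.read().strip())
    return "\n\n".join(parts)

line = "1\tPierre\tPierre\tNNP\tNNP\t_\t2\tNMOD\t_\t_"
print(to_four_fields(line))  # 1	Pierre	NNP	2
```

If the 4-field guess is wrong, the same reshaping approach applies with whichever columns the reader actually unpacks.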
So I decided to try redshift on OS X and had problems running it. It works fine on Linux.
>>> import lexicon
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named lexicon
>>> import index.lexicon
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "index/__init__.py", line 1, in <module>
import index.lexicon
ImportError: dlopen(index/lexicon.so, 2): Symbol not found: __ZNSs4_Rep20_S_empty_rep_storageE
Referenced from: index/lexicon.so
Expected in: dynamic lookup
After some quick googling, I found out that I need to link against libstdc++. Apply the following patch and rebuild:
diff --git a/setup.py b/setup.py
index 3660b56..cf91b5a 100755
--- a/setup.py
+++ b/setup.py
@@ -61,6 +61,7 @@ exts = [
include_dirs=includes),
Extension("index.lexicon", ["index/lexicon.pyx", "ext/MurmurHash2.cpp",
"ext/MurmurHash3.cpp"], language="c++",
+ extra_link_args=['-lstdc++'],
include_dirs=includes),
Extension("features.extractor", ["features/extractor.pyx", "ext/MurmurHash2.cpp",
Greetings!
I'm really excited to start using this, however I'm not able to install it on my Mac. Here is a link to the message I'm getting: http://pastebin.com/j9cNzdv7. I have followed your installation instructions, but I'm still not able to get it to work. Thanks so much for developing this. I can't wait to get to use it!
Hello! I am using redshift for the first time, and I am unable to find any documentation regarding the functionalities of the libraries. Can you please tell me where I can start with redshift?
Also, can you tell me what /tmp/stanford_beam8 is?
Requires figuring out what to do about the dependency on sparsehash.
Hello, I am interested in your joint dependency parser with disfluency detection.
I want to try running it, but it fails, even though I read the previous issue and
paid attention to the version pinning while installing.
I am trying to install it on a machine running Ubuntu 12.04.
My installation steps are almost the same as the recipe in the installation section of README.rst,
except that I ran "git checkout develop" not right after cloning, but near the end of the
installation, just before "fab make test", because running "git checkout develop" earlier
strips away the version-pinning information.
But this ends with the following error:
index/lexicon.cpp:249:36: fatal error: murmurhash/MurmurHash3.h: No such file or directory
#include "murmurhash/MurmurHash3.h"
^
compilation terminated.
error: command 'gcc' failed with exit status 1
Fatal error: local() encountered an error (return code 1) while executing 'python setup.py build_ext --inplace'
Aborting.
Even when I ran "git checkout" as explained, and replaced requirements.txt with
the version-pinned one before "pip install -r requirements.txt", I got the same error.
How should I solve this problem?
Hi there! This isn't an issue so much as a question.
I came across your neat POS tagger implementation by way of this blog post:
I'm curious... In the post, you describe the greedy implementation and argue that it's plenty accurate, but the implementation here in your repo actually uses a beam search. Do you have accuracy numbers for this implementation?
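For readers unfamiliar with the distinction the question draws, here is a toy sketch (not redshift's actual code; the tag set and scorer are invented) of greedy decoding versus beam search for tagging: greedy keeps one partial tag sequence per step, while a beam keeps the top-k and can recover from an early mistake.

```python
import heapq

def decode(words, score, beam_width=1):
    """Return the best tag sequence under `score`, keeping
    `beam_width` partial hypotheses per step (1 == greedy)."""
    beam = [(0.0, [])]  # (total score, tags so far)
    for word in words:
        candidates = []
        for total, tags in beam:
            for tag in ("NOUN", "VERB"):
                s = total + score(word, tag, tags)
                candidates.append((s, tags + [tag]))
        beam = heapq.nlargest(beam_width, candidates)
    return max(beam)[1]

# Toy scorer: a word-level bias plus a NOUN->VERB transition bonus.
def score(word, tag, prev):
    s = 1.0 if (word.endswith("s") and tag == "NOUN") else 0.0
    if prev and prev[-1] == "NOUN" and tag == "VERB":
        s += 0.5
    return s

print(decode(["dogs", "run"], score, beam_width=4))  # ['NOUN', 'VERB']
```

With a real feature-based scorer the beam can overturn a locally attractive but globally wrong early tag, which is exactly the accuracy-vs-speed trade-off the question is asking about.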
Add raw text interface.
I have tried every possible way to install redshift, but no luck.
I get this error every time:
"error: command 'gcc' failed with exit status 1
Fatal error: local() encountered an error (return code 1) while executing 'python setup.py build_ext --inplace'
"
I guess the problem is with Cython.
Hello, thank you for your amazing work on redshift; it's a glass of water in NLP hell. I'm planning on using it in WSD research, but I keep failing to install it on my machine, with the errors logged in the pastebin dump here. I hope you will find the time to help me.
Thanks in advance,
Amine
Hi mate,
I couldn't find a way to set or replace a sentence.Input token's label, so I made a workaround in sentence.pyx:

def set_label(self, i, label):
    self.c_sent.tokens[i].label = index.hashes.encode_label(label)

I wonder, is there a better way to do that?
Currently the parser initialisation starts up index.hashes, so if you read input before you create the parser, the indexing is off.
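The ordering bug described here is the classic first-seen interner problem. A self-contained illustration (not redshift's actual code; the class and names are invented) of why reading input before parser initialisation makes the ids disagree:

```python
# A typical string interner assigns ids in first-seen order, so two
# interners fed the same labels in a different order disagree on ids.
class LabelIndex:
    def __init__(self):
        self.ids = {}

    def encode(self, label):
        # Assign the next id the first time a label is seen.
        if label not in self.ids:
            self.ids[label] = len(self.ids)
        return self.ids[label]

reader = LabelIndex()
parser = LabelIndex()
reader.encode("nsubj"); reader.encode("dobj")  # labels seen while reading input
parser.encode("dobj"); parser.encode("nsubj")  # parser initialised afterwards
print(reader.encode("dobj"), parser.encode("dobj"))  # 1 0 -> ids disagree
```

This is why creating the parser first (so one shared index sees all labels in one order) avoids the off-by-one indexing.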
I just noticed that there still isn't a license listed with your code, which really prevents a lot of people from using it.
Also, you list some conflicting goals (FOSS, but no commercial use). There's a great Quora answer on how you could make it free-for-some, but that would inherently not be open source.
If you want to go for something truly open source, but can't pick a license, choosealicense.com will help you out.
Should be faster and require less code.