Comments (28)
The request works, but it does not return data.
It does return data: it says that the word "пёс" is not present in the model vocabulary. This is entirely true, because the ruwikiruscorpora_upos_skipgram_300_2_2019 model contains words with PoS tags. Thus, you should query for "пёс_NOUN", not for plain "пёс".
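To illustrate the point about tagged vocabularies, here is a minimal sketch. It uses a toy set in place of the model's real vocabulary; `toy_vocab`, `with_pos` and `in_vocab` are hypothetical names for illustration, not part of WebVectors.

```python
# Toy stand-in for the vocabulary of a PoS-tagged model (hypothetical data).
toy_vocab = {"пёс_NOUN", "собака_NOUN", "бежать_VERB"}


def with_pos(word, pos):
    """Join a lemma and a PoS tag in the word_TAG form used by tagged models."""
    return f"{word}_{pos}"


def in_vocab(query, vocab=toy_vocab):
    """Return True if the query string is an exact key in the vocabulary."""
    return query in vocab


if __name__ == "__main__":
    # The bare word is missing, the tagged form is present.
    print(in_vocab("пёс"))
    print(in_vocab(with_pos("пёс", "NOUN")))
```

The lookup is an exact string match, so a query without the tag misses even though the lemma itself is covered by the model.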
from webvectors.
Ah, I see now.
The model format was incorrectly recognized. Update your WebVectors version and try again; I've just fixed this in 6a558ff.
Please report whether the problem is gone.
from webvectors.
There seem to be two different issues.
Are you getting errors right after you open the service in a web browser, even before you actually send a query word?
What errors?
from webvectors.
Why are you trying to send HTTP requests from a browser to the word2vec_server port?
You should run a proper HTTP server (either apache or gunicorn), as described in the WebVectors readme ('Running WebVectors' section).
We generally recommend gunicorn.
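The JSONDecodeError reported in this thread is consistent with this: judging by the traceback, word2vec_server.py passes the raw bytes it receives to json.loads, and a browser sends an HTTP request line rather than JSON. A minimal sketch of that failure follows; the JSON field name `query` is a hypothetical placeholder, not the actual WebVectors message format.

```python
import json

# Roughly what a browser sends when it opens http://host:8088/ (simplified).
http_request = b"GET / HTTP/1.1\r\nHost: example.invalid:8088\r\n\r\n"

# A JSON payload of the kind a JSON-over-socket server can parse.
# The "query" field name is an assumption made for this example only.
json_request = json.dumps({"query": "пёс_NOUN"}).encode("utf-8")


def try_parse(data):
    """Decode bytes as UTF-8 and parse them as JSON, reporting any failure."""
    try:
        return json.loads(data.decode("utf-8"))
    except json.JSONDecodeError as err:
        return f"JSONDecodeError: {err}"


if __name__ == "__main__":
    print(try_parse(http_request))
    print(try_parse(json_request))
```

The first call fails at the very first character because "GET" is not a valid start of a JSON value, which matches the "line 1 column 1 (char 0)" position in the traceback.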
from webvectors.
My configs are here: https://github.com/ahvahsky2008/test
I see that you changed detect_tag to True. If you want the service to automatically detect the query part of speech, you should make sure to run a properly configured UDPipe or Stanford CoreNLP server (more details in https://github.com/akutuzov/webvectors/blob/master/lemmatizer.py).
It looks like you did not do this. That's the reason your WebVectors instance attempts to access the (non-existent) tagger service and eventually times out.
If you don't want to set up a tagger service, simply return the detect_tag field in the config file to its default False state.
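A quick way to check for this situation before starting the web service is sketched below. Only the detect_tag field comes from this thread; the `[Tags]` section name and the `tagger_host` and `tagger_port` keys are hypothetical, so adjust them to whatever your config file actually uses.

```python
import configparser
import socket


def tagger_reachable(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def detect_tag_is_safe(config_text, section="Tags"):
    """Return True if detect_tag is off, or if it is on and the tagger answers.

    The section and the tagger_host/tagger_port keys are assumptions
    made for this sketch, not the real WebVectors config layout.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    if not parser.getboolean(section, "detect_tag", fallback=False):
        return True
    host = parser.get(section, "tagger_host", fallback="localhost")
    port = parser.getint(section, "tagger_port", fallback=0)
    return tagger_reachable(host, port)


if __name__ == "__main__":
    print(detect_tag_is_safe("[Tags]\ndetect_tag = False\n"))
```

If the check returns False, either start the tagger service or switch detect_tag back to False.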
from webvectors.
What's in your models.tsv?
from webvectors.
identifier	name	path	string	default	tags	algo	size
ruscorpora_upos_skipgram_300_5_2018	Russian National Corpus	/var/www/model/model.bin	similar4	False	True	word2vec	250000000
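For reference, a file like this can be inspected with a few lines of Python. This is only a sketch: it assumes the fields are tab-separated, that "Russian National Corpus" is a single field, and that the tags column marks models whose vocabulary carries PoS tags. `tagged_models` is a hypothetical helper, not a WebVectors function.

```python
import csv
import io

# Header and row as quoted in this thread, with tabs restored (an assumption).
MODELS_TSV = (
    "identifier\tname\tpath\tstring\tdefault\ttags\talgo\tsize\n"
    "ruscorpora_upos_skipgram_300_5_2018\tRussian National Corpus\t"
    "/var/www/model/model.bin\tsimilar4\tFalse\tTrue\tword2vec\t250000000\n"
)


def tagged_models(tsv_text):
    """Return identifiers of the models whose 'tags' column is 'True'."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row["identifier"] for row in reader if row["tags"].strip() == "True"]


if __name__ == "__main__":
    print(tagged_models(MODELS_TSV))
```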
from webvectors.
I tried models.bin and models.txt: same issue.
from webvectors.
@akutuzov thanks, one problem is solved. But the server does not start fully:
when I open the URL http://xxx.xxx.xxx.xxx:8088/ it shows errors.
from webvectors.
First I run the service:
python3.7 word2vec_server.py
When I open the URL x.x.x.x:8088 in a browser,
I see these errors in the console:
Model ruscorpora_upos_skipgram_300_5_2018 from file /var/www/model/model.bin loaded successfully.
Socket created
Socket bind complete
Socket now listening on port 8088
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "word2vec_server.py", line 22, in run
    clientthread(self.connect, self.address)
  File "word2vec_server.py", line 35, in clientthread
    query = json.loads(data.decode('utf-8'))
  File "/usr/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
from webvectors.
@akutuzov sorry for being slow))
When I click "Найти похожие слова" ("Find similar words"), it freezes and gunicorn shows an error.
from webvectors.
My configs are here: https://github.com/ahvahsky2008/test
from webvectors.
[2020-02-27 08:31:34 +0100] [22399] [DEBUG] Current configuration:
config: None
bind: ['0.0.0.0:9999']
backlog: 2048
workers: 1
worker_class: sync
threads: 1
worker_connections: 1000
max_requests: 0
max_requests_jitter: 0
timeout: 30
graceful_timeout: 30
keepalive: 2
limit_request_line: 4094
limit_request_fields: 100
limit_request_field_size: 8190
reload: False
reload_engine: auto
reload_extra_files: []
spew: False
check_config: False
preload_app: False
sendfile: None
reuse_port: False
chdir: /var/www/webvectors
daemon: False
raw_env: []
pidfile: None
worker_tmp_dir: None
user: 0
group: 0
umask: 0
initgroups: False
tmp_upload_dir: None
secure_scheme_headers: {'X-FORWARDED-PROTOCOL': 'ssl', 'X-FORWARDED-PROTO': 'https', 'X-FORWARDED-SSL': 'on'}
forwarded_allow_ips: ['127.0.0.1']
accesslog: gunicorn.log
disable_redirect_access_to_syslog: False
access_log_format: %(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s"
errorlog: gunicorn.error.log
loglevel: debug
capture_output: True
logger_class: gunicorn.glogging.Logger
logconfig: None
logconfig_dict: {}
syslog_addr: udp://localhost:514
syslog: False
syslog_prefix: None
syslog_facility: user
enable_stdio_inheritance: False
statsd_host: None
dogstatsd_tags:
statsd_prefix:
proc_name: None
default_proc_name: run_syn:app_syn
pythonpath: None
paste: None
on_starting: <function OnStarting.on_starting at 0x7ff4231860e0>
on_reload: <function OnReload.on_reload at 0x7ff423186200>
when_ready: <function WhenReady.when_ready at 0x7ff423186320>
pre_fork: <function Prefork.pre_fork at 0x7ff423186440>
post_fork: <function Postfork.post_fork at 0x7ff423186560>
post_worker_init: <function PostWorkerInit.post_worker_init at 0x7ff423186680>
worker_int: <function WorkerInt.worker_int at 0x7ff4231867a0>
worker_abort: <function WorkerAbort.worker_abort at 0x7ff4231868c0>
pre_exec: <function PreExec.pre_exec at 0x7ff4231869e0>
pre_request: <function PreRequest.pre_request at 0x7ff423186b00>
post_request: <function PostRequest.post_request at 0x7ff423186b90>
child_exit: <function ChildExit.child_exit at 0x7ff423186cb0>
worker_exit: <function WorkerExit.worker_exit at 0x7ff423186dd0>
nworkers_changed: <function NumWorkersChanged.nworkers_changed at 0x7ff423186ef0>
on_exit: <function OnExit.on_exit at 0x7ff42318b050>
proxy_protocol: False
proxy_allow_ips: ['127.0.0.1']
keyfile: None
certfile: None
ssl_version: 2
cert_reqs: 0
ca_certs: None
suppress_ragged_eofs: True
do_handshake_on_connect: False
ciphers: None
raw_paste_global_conf: []
strip_header_spaces: False
[2020-02-27 08:31:34 +0100] [22399] [INFO] Starting gunicorn 20.0.4
[2020-02-27 08:31:34 +0100] [22399] [DEBUG] Arbiter booted
[2020-02-27 08:31:34 +0100] [22399] [INFO] Listening at: http://0.0.0.0:9999 (22399)
[2020-02-27 08:31:34 +0100] [22399] [INFO] Using worker: sync
[2020-02-27 08:31:34 +0100] [22402] [INFO] Booting worker with pid: 22402
[2020-02-27 08:31:34 +0100] [22399] [DEBUG] 1 workers
[2020-02-27 08:31:41 +0100] [22402] [DEBUG] GET /en/associates/
[2020-02-27 08:31:42 +0100] [22402] [DEBUG] GET /en/associates/YOUR_URL/example_vocab.json
[2020-02-27 08:31:44 +0100] [22402] [DEBUG] POST /en/associates/
[2020-02-27 08:32:14 +0100] [22399] [CRITICAL] WORKER TIMEOUT (pid:22402)
[2020-02-27 08:32:14 +0100] [22402] [INFO] Worker exiting (pid: 22402)
[2020-02-27 08:32:15 +0100] [22418] [INFO] Booting worker with pid: 22418
[2020-02-27 08:32:16 +0100] [22418] [DEBUG] POST /
[2020-02-27 08:32:16 +0100] [22418] [DEBUG] Ignoring EPIPE
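As far as the log above shows, the worker was killed exactly when the configured gunicorn timeout (timeout: 30) expired after the POST request. The small sketch below only does the timestamp arithmetic on two lines copied from the log; `seconds_between` is a hypothetical helper.

```python
from datetime import datetime

# Timestamp layout used in the gunicorn log lines above.
LOG_FORMAT = "%Y-%m-%d %H:%M:%S %z"


def seconds_between(start, end):
    """Return the number of seconds between two gunicorn log timestamps."""
    t0 = datetime.strptime(start, LOG_FORMAT)
    t1 = datetime.strptime(end, LOG_FORMAT)
    return (t1 - t0).total_seconds()


# Timestamps of "POST /en/associates/" and "WORKER TIMEOUT" in the log.
post_time = "2020-02-27 08:31:44 +0100"
timeout_time = "2020-02-27 08:32:14 +0100"
elapsed = seconds_between(post_time, timeout_time)

if __name__ == "__main__":
    print(elapsed)
```

A gap equal to the timeout suggests the request handler was blocked for the whole period rather than failing quickly.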
from webvectors.
Andrey, please help( I have been fighting with this for several days already.
from webvectors.
The request works, but it does not return data.
from webvectors.
@akutuzov thanks!!
One more question:
1) This functionality does not work with this model.
2) How can I enter queries without the NOUN, VERB and other tags? Can they be added automatically?
from webvectors.
- All the WebVectors functions (including those in the Visualizations, Calculator and Miscellaneous tabs) work with any word embedding model. If something goes wrong for you, please report it with all the details (what do you expect to see, what you actually see, are there any error messages), preferably in a separate issue.
- If you prefer to use embedding models which feature words without PoS tags, then simply download such a model. For example, all our fastText models are trained on corpora without PoS tags. You can find many more models with or without tags in the NLPL Vector Repository.
from webvectors.
I simply want to make an analogue of https://rusvectores.org/
from webvectors.
Your error message screenshot is not full (the bottom of the screen is seemingly cropped). Please provide the complete error message.
Aside from that, why are your queries accompanied by this strange _NONE tag? None of our models contain words with such a tag.
from webvectors.
I simply want to make an analogue of https://rusvectores.org/
If you really want a full analogue, you will have to install the UDPipe server to perform automatic PoS tagging of user queries.
from webvectors.
Which algo file do I need to download for UDPipe?
from webvectors.
UDPipe is a tagger, see https://ufal.mff.cuni.cz/udpipe. You can download the UDPipe models from there, or use our custom model (the link to it can be found in our tutorial).
In general, I highly recommend going through the tutorial; it answers many of your questions.
from webvectors.
Ok.
2020-02-29 07:11:10,965 : ERROR : Exception on /en/ [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 2446, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1951, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1820, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.7/dist-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1949, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.7/dist-packages/flask/app.py", line 1935, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/var/www/webvectors/webvectors.py", line 232, in home
    query = process_query(list_data)
  File "/var/www/webvectors/webvectors.py", line 153, in process_query
    poses = tagword(userquery) # We tag using Stanford CoreNLP
  File "/var/www/webvectors/lemmatizer.py", line 43, in tagword
    tagged = json.loads(corenlp.decode('utf-8'), strict=False)
  File "/usr/lib/python3.7/json/__init__.py", line 361, in loads
    return cls(**kw).decode(s)
  File "/usr/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
from webvectors.
Use the tag_ud() function, not tagword().
As can be seen in their respective docstrings, tagword() is for the Stanford CoreNLP tagger.
With UDPipe, one should use tag_ud().
from webvectors.
Could I message you on Telegram or somewhere else, please(
from webvectors.