myint / language-check
Python wrapper for LanguageTool grammar checker
Home Page: https://pypi.python.org/pypi/language-check
License: GNU Lesser General Public License v3.0
language-check fails to catch spelling errors in words that contain digits.
E.g. d0g
produces no errors, while dogg
does.
LanguageTool can toggle spell checking of words with numbers on and off (http://wiki.languagetool.org/hunspell-support). It would be nice if language-check had this ability too.
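Until such an option is exposed, one possible workaround is to pre-scan the text on the Python side for tokens that mix letters and digits. The sketch below is illustrative only: `find_mixed_tokens` is a hypothetical helper, not part of language-check, and it flags every letter-digit mix, including legitimate ones such as `mp3`.

```python
import re

# Hypothetical helper, not part of language-check.
# A token is a run of letters and digits (word characters minus "_").
_TOKEN_RE = re.compile(r"[^\W_]+")


def find_mixed_tokens(text):
    """Return tokens that contain both letters and digits, in order."""
    mixed = []
    for token in _TOKEN_RE.findall(text):
        has_digit = any(ch.isdigit() for ch in token)
        has_alpha = any(ch.isalpha() for ch in token)
        if has_digit and has_alpha:
            mixed.append(token)
    return mixed
```

The returned tokens could then be reported alongside the matches from `tool.check(text)`.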
Hello,
After installing language-check and importing it in my Jupyter Notebook,
when I run matches = tool.check(text)
, is the check performed by a local server or by the LanguageTool server?
Here is a simple test case to show that LanguageTool.disable_spellchecking()
is not working correctly for Italian. I'm not sure whether this is a problem here or in LanguageTool, but I thought I'd start by reporting it here:
from language_check import LanguageTool
t = LanguageTool(language='it')
t.disable_spellchecking()
#this contains MORFOLOGIK_RULE_IT:
print(t._spell_checking_rules)
#this unfortunately contains MORFOLOGIK_RULE_IT_IT:
print(t.check('Non le fate piu?'))
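As a stopgap, the results could be filtered by rule ID prefix after checking. This is only a sketch: it assumes each match exposes a `ruleId` attribute, uses a stand-in `FakeMatch` type so it runs without a server, and the prefix list is an assumption to adjust after inspecting real rule IDs.

```python
from collections import namedtuple

# Stand-in for a real match object so the sketch runs without a server;
# it assumes real matches expose a `ruleId` attribute.
FakeMatch = namedtuple("FakeMatch", ["ruleId", "msg"])

# Assumed prefixes of spelling rules (hypothetical list).
SPELLING_PREFIXES = ("MORFOLOGIK_RULE", "HUNSPELL")


def drop_spelling_matches(matches, prefixes=SPELLING_PREFIXES):
    """Keep only matches whose rule ID does not start with a spelling prefix."""
    return [m for m in matches if not m.ruleId.startswith(prefixes)]
```

With this, a match reported under MORFOLOGIK_RULE_IT_IT would be dropped even though the internal rule list only names MORFOLOGIK_RULE_IT.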
from setuptools import setup, find_packages
setup(
name='name',
version='0.0.1',
description=' TA',
long_description=DESCRIPTION,
license='Proprietary License',
author='',
author_email='',
packages=find_packages(),
install_requires=[
"language-check==0.7.1",
"3to2==1.1.1",
],
)
python setup.py install  # Fails
Trying to install language-check via pip is freezing the install process.
Hey, could you test against pypy3? We use language-check in coala and we do test against pypy3 (and Python 3.5 beta 3, actually, which is also nice), and it seems to work, so it would be nice if you noticed it before us if you break pypy ;)
I have been using language-check for a few days without an issue. Today, language-check hangs for 10+ minutes during self.grammarTool = language_check.LanguageTool('en-US')
. I am using language-check-1.0 with Python 2.7.13.
It looks like _server_is_alive()
is not working correctly and multiple servers are created on the same port.
I was in the process of packaging language-check for Arch Linux, when I realized the setup script was actually downloading LanguageTool. Since a languagetool
package is already available in the official Arch packages (v3.0), do you think it would be possible to rely on it instead? You can see a list of the files it contains here, so if the setup script can handle an extra argument to find the jar files at runtime, this would be greatly appreciated :)
If 3to2
is not pre-installed, it is not installed for the user during python setup.py install
. It seems kwargs['setup_requires'] = ['3to2']
in setup.py
is not sufficient.
The result can be seen at https://travis-ci.org/jayvdb/language-check/jobs/213723679
Traceback (most recent call last):
File "setup.py", line 595, in <module>
sys.exit(main())
File "setup.py", line 590, in main
run_setup_hooks(config)
File "setup.py", line 562, in run_setup_hooks
default_hook(config)
File "setup.py", line 573, in default_hook
generate_py2k(config)
File "setup.py", line 542, in generate_py2k
run_3to2(copied_py_files)
File "setup.py", line 375, in run_3to2
raise OSError('3to2 script is unavailable.')
OSError: 3to2 script is unavailable.
The same problem occurs in guess_language
:
https://bitbucket.org/spirit/guess_language/issues/18/pip-doesnt-understand-that-guess-language
There @wichert found that the code was almost python 2 compatible with some minor and routine adjustments.
If there is no easy way forward, maybe a better error message could tell the user how to install 3to2
.
setup.py expects that the source has already been converted before the tests are run.
pip install language-check 1
Collecting language-check
Using cached https://files.pythonhosted.org/packages/97/45/0fd1d3683d6129f30fa09143fa383cdf6dff8bc0d1648f2cf156109cb772/language-check-1.1.tar.gz
Building wheels for collected packages: language-check
Running setup.py bdist_wheel for language-check ... error
Complete output from command /usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-_W9osa/language-check/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/tmpYOwq7gpip-wheel- --python-tag cp27:
Could not parse Java version from """openjdk version "10.0.1" 2018-04-17
OpenJDK Runtime Environment (build 10.0.1+10-Ubuntu-3ubuntu1)
OpenJDK 64-Bit Server VM (build 10.0.1+10-Ubuntu-3ubuntu1, mixed mode)
""".
----------------------------------------
Failed building wheel for language-check
Running setup.py clean for language-check
Failed to build language-check
Installing collected packages: language-check
Running setup.py install for language-check ... error
Complete output from command /usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-_W9osa/language-check/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-xKEHYl-record/install-record.txt --single-version-externally-managed --compile --user --prefix=:
Could not parse Java version from """openjdk version "10.0.1" 2018-04-17
OpenJDK Runtime Environment (build 10.0.1+10-Ubuntu-3ubuntu1)
OpenJDK 64-Bit Server VM (build 10.0.1+10-Ubuntu-3ubuntu1, mixed mode)
""".
----------------------------------------
Command "/usr/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-_W9osa/language-check/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-xKEHYl-record/install-record.txt --single-version-externally-managed --compile --user --prefix=" failed with error code 1 in /tmp/pip-build-_W9osa/language-check/
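The failure appears to come from the version string format: newer JDKs print "10.0.1" rather than the legacy "1.8.0_60" style. A minimal sketch of a parser that tolerates both formats follows; `java_feature_version` is a hypothetical helper, not the project's actual setup.py code.

```python
import re

# Hypothetical parser, not the project's setup.py code.
_VERSION_RE = re.compile(r'(?:java|openjdk) version "(\d+)(?:\.(\d+))?')


def java_feature_version(version_output):
    """Return the Java feature release (8, 9, 10, ...) from `java -version` text."""
    match = _VERSION_RE.search(version_output)
    if match is None:
        raise ValueError("Could not parse Java version from %r" % version_output)
    first = int(match.group(1))
    second = match.group(2)
    # Legacy scheme: "1.8.0_60" denotes Java 8; newer releases lead with
    # the feature number ("9-ea", "10.0.1").
    if first == 1 and second is not None:
        return int(second)
    return first
```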
I notice that the LanguageTool docs (http://wiki.languagetool.org/http-server) state that the reason enabled
exists is that if you want to enable only certain rules, you set enabledOnly=True
and give a list of rules to be enabled
(all other rules are disabled).
It doesn't seem to be working in language-check
.
I checked the code, and enabledOnly
is never sent (https://github.com/myint/language-check/blob/master/language_check/__init__.py#L259)
So the --enabled
arg is pretty much useless, as all rules are enabled.
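To illustrate what is being asked for, here is a rough sketch of how the request parameters might be built so that enabledOnly travels with the list of enabled rules. `build_check_params` and `encode_check_body` are hypothetical helpers, not language-check code, and the exact value the server accepts for enabledOnly is an assumption that should be checked against the LanguageTool server documentation.

```python
from urllib.parse import urlencode


def build_check_params(text, language, enabled=(), enabled_only=False):
    """Build form parameters for a check request (hypothetical helper)."""
    params = {"language": language, "text": text}
    if enabled:
        params["enabled"] = ",".join(sorted(enabled))
        # enabledOnly is only meaningful together with an explicit rule list.
        if enabled_only:
            params["enabledOnly"] = "true"  # assumed value; verify with the server docs
    return params


def encode_check_body(params):
    """URL-encode the parameters as a UTF-8 request body."""
    return urlencode(params).encode("utf-8")
```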
I am testing a few grammatical error conditions and not getting the desired results:
Repeated Commas:
Example:
Input: How does it handle multiple commas,,, , , ,.
Output: How does it handle multiple commas,,, , , ,.
Ideally, the output should delete the extra repeated commas
What is the best way to add this additional rule?
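One option that does not require a new LanguageTool rule is to normalize the text in Python before or after checking. The following is a minimal sketch under that assumption; `collapse_repeated_commas` is a hypothetical helper and only handles commas separated by optional whitespace.

```python
import re

# A comma followed by one or more further commas, each optionally
# preceded by whitespace.
_REPEATED_COMMAS = re.compile(r",(?:\s*,)+")


def collapse_repeated_commas(text):
    """Replace runs of commas (optionally separated by spaces) with one comma."""
    return _REPEATED_COMMAS.sub(",", text)
```

Note that this leaves a single comma in place, so a trailing ",." would still need separate handling.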
I use the vim-syntastic/syntastic
plugin in vim with python3 installed, and I installed language_check via pip. Syntastic suggests the following setting for TeX
files, so that it views the file as text
and uses language_check
as a checker.
let g:syntastic_tex_checkers = ['chktex', 'text/language_check']
It turns out this setting does not work on a tex
file; the error message says:
syntastic: error: checker output:
Traceback (most recent call last):
File "/usr/bin/language-check", line 6, in <module>
sys.exit(__main__.main())
File "/usr/lib/python3.6/site-packages/language_check/__main__.py", line 113, in main
remote_server=remote_server,
File "/usr/lib/python3.6/site-packages/language_check/__init__.py", line 196, in __init__
self._start_server_on_free_port()
File "/usr/lib/python3.6/site-packages/language_check/__init__.py", line 333, in _start_server_on_free_port
cls._start_local_server()
File "/usr/lib/python3.6/site-packages/language_check/__init__.py", line 377, in _start_local_server
raise Error(err_msg)
language_check.Error
syntastic: error: checker text/language_check returned abnormal status 1
Can't figure out what's wrong?
Hi, I ran into a problem when using this tool.
I followed your guide and put LanguageTool 3.8 into language-check, then imported language_check successfully. But when I try to call any method in language-check, it raises the same error:
Here is the output for LanguageTool():
>>> tool = language_check.LanguageTool('en-US')
Traceback (most recent call last):
File "/home/jie-gd/anaconda3/lib/python3.6/site-packages/language_check/__init__.py", line 529, in get_languages
languages = cache['languages']
KeyError: 'languages'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/jie-gd/anaconda3/lib/python3.6/site-packages/language_check/__init__.py", line 310, in _get_root
with urlopen(url, data, cls._TIMEOUT) as f:
File "/home/jie-gd/anaconda3/lib/python3.6/urllib/request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "/home/jie-gd/anaconda3/lib/python3.6/urllib/request.py", line 532, in open
response = meth(req, response)
File "/home/jie-gd/anaconda3/lib/python3.6/urllib/request.py", line 642, in http_response
'http', request, response, code, msg, hdrs)
File "/home/jie-gd/anaconda3/lib/python3.6/urllib/request.py", line 570, in error
return self._call_chain(*args)
File "/home/jie-gd/anaconda3/lib/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/home/jie-gd/anaconda3/lib/python3.6/urllib/request.py", line 650, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/jie-gd/anaconda3/lib/python3.6/site-packages/language_check/__init__.py", line 201, in __init__
self._language = LanguageTag(language)
File "/home/jie-gd/anaconda3/lib/python3.6/site-packages/language_check/__init__.py", line 453, in __new__
return str.__new__(cls, cls._normalize(tag))
File "/home/jie-gd/anaconda3/lib/python3.6/site-packages/language_check/__init__.py", line 474, in _normalize
for language in get_languages()}
File "/home/jie-gd/anaconda3/lib/python3.6/site-packages/language_check/__init__.py", line 531, in get_languages
languages = LanguageTool._get_languages()
File "/home/jie-gd/anaconda3/lib/python3.6/site-packages/language_check/__init__.py", line 291, in _get_languages
for e in cls._get_root(url, num_tries=1):
File "/home/jie-gd/anaconda3/lib/python3.6/site-packages/language_check/__init__.py", line 317, in _get_root
raise Error('{}: {}'.format(cls._url, e))
language_check.Error: http://127.0.0.1:8081: HTTP Error 400: Bad Request
Here is the output for get_languages():
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
/home/jie-gd/anaconda3/lib/python3.6/site-packages/language_check/__init__.py in get_languages()
528 try:
--> 529 languages = cache['languages']
530 except KeyError:
KeyError: 'languages'
During handling of the above exception, another exception occurred:
HTTPError Traceback (most recent call last)
/home/jie-gd/anaconda3/lib/python3.6/site-packages/language_check/__init__.py in _get_root(cls, url, data, num_tries)
309 try:
--> 310 with urlopen(url, data, cls._TIMEOUT) as f:
311 return ElementTree.parse(f).getroot()
/home/jie-gd/anaconda3/lib/python3.6/urllib/request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
222 opener = _opener
--> 223 return opener.open(url, data, timeout)
224
/home/jie-gd/anaconda3/lib/python3.6/urllib/request.py in open(self, fullurl, data, timeout)
531 meth = getattr(processor, meth_name)
--> 532 response = meth(req, response)
533
/home/jie-gd/anaconda3/lib/python3.6/urllib/request.py in http_response(self, request, response)
641 response = self.parent.error(
--> 642 'http', request, response, code, msg, hdrs)
643
/home/jie-gd/anaconda3/lib/python3.6/urllib/request.py in error(self, proto, *args)
569 args = (dict, 'default', 'http_error_default') + orig_args
--> 570 return self._call_chain(*args)
571
/home/jie-gd/anaconda3/lib/python3.6/urllib/request.py in _call_chain(self, chain, kind, meth_name, *args)
503 func = getattr(handler, meth_name)
--> 504 result = func(*args)
505 if result is not None:
/home/jie-gd/anaconda3/lib/python3.6/urllib/request.py in http_error_default(self, req, fp, code, msg, hdrs)
649 def http_error_default(self, req, fp, code, msg, hdrs):
--> 650 raise HTTPError(req.full_url, code, msg, hdrs, fp)
651
HTTPError: HTTP Error 400: Bad Request
During handling of the above exception, another exception occurred:
Error Traceback (most recent call last)
<ipython-input-20-7ce33a1ca083> in <module>()
----> 1 tool = language_check.get_languages()
/home/jie-gd/anaconda3/lib/python3.6/site-packages/language_check/__init__.py in get_languages()
529 languages = cache['languages']
530 except KeyError:
--> 531 languages = LanguageTool._get_languages()
532 cache['languages'] = languages
533 return languages
/home/jie-gd/anaconda3/lib/python3.6/site-packages/language_check/__init__.py in _get_languages(cls)
289 url = urllib.parse.urljoin(cls._url, 'Languages')
290 languages = set()
--> 291 for e in cls._get_root(url, num_tries=1):
292 languages.add(e.get('abbr'))
293 languages.add(e.get('abbrWithVariant'))
/home/jie-gd/anaconda3/lib/python3.6/site-packages/language_check/__init__.py in _get_root(cls, url, data, num_tries)
315 cls._start_server()
316 if n + 1 >= num_tries:
--> 317 raise Error('{}: {}'.format(cls._url, e))
318
319 @classmethod
Config:
I tried to install language-check from pip, and the installation works. But then, when I try using it in some Python code, it throws some error.
So I tried to install language-check directly from the git repo, as suggested in another issue, with:
pip3 install git+https://github.com/myint/language-check.git
But then I get:
Could not parse Java version from """openjdk version "10.0.1" 2018-04-17
OpenJDK Runtime Environment (build 10.0.1+10-Ubuntu-3ubuntu1)
OpenJDK 64-Bit Server VM (build 10.0.1+10-Ubuntu-3ubuntu1, mixed mode
I also tried installing it on Windows, with the following configuration, and in this case it works like a charm:
So I guess that the problem comes from OpenJDK on Ubuntu.
Any advice?
Hello there,
I have used this lib to check for grammar errors in French and it works perfectly fine:
>>> #!/usr/bin/env python
... #-*- coding: utf-8 -*-
...
>>> import language_check
>>> tool = language_check.LanguageTool('fr')
>>> text = u'A sentence with a error in the Hitchhiker’s Guide tot he Galaxy'
>>> matches = tool.check(text)
>>> len(matches)
1
However, when I switch to another language (English), it fails:
#!/usr/bin/env python
#-*- coding: utf-8 -*-
...
import language_check
tool = language_check.LanguageTool('en-GB')
text = u'A sentence with a error in the Hitchhiker’s Guide tot he Galaxy'
matches = tool.check(text)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python2.7/site-packages/language_check/__init__.py", line 246, in check
root = self._get_root(self._url, self._encode(text, srctext))
File "/usr/local/lib/python2.7/site-packages/language_check/__init__.py", line 310, in _get_root
cls._terminate_server()
File "/usr/local/lib/python2.7/site-packages/language_check/__init__.py", line 401, in _terminate_server
cls._server.terminate()
AttributeError: 'NoneType' object has no attribute 'terminate'
I am running it with Python 2.7 on macOS Sierra.
#EDIT
Problem solved by restarting my computer
Hi,
Is it possible to configure language-check not to start a local languagetool server, but connect to a remote, already running instance?
I haven't found it in the docs, and I'm not a programmer, so couldn't figure it out from the code.
The idea would be to run a central languagetool server, and my clients would connect to it using language-check (and various languagetool plugins).
Kind Regards,
Robert Fekete
Cfr. vim-syntastic/syntastic/issues/1918.
When using syntastic with language-check on source files, program code is obviously parsed as language, and obviously has a lot of grammatical mistakes.
I know this is a big one, but would it be interesting to add some experimental support to language-check to parse source files? This would be especially useful for LaTeX, as that contains a lot of text. I also think it is useful to parse eg. C, C++ and Java code, and check comments for grammatical and spelling mistakes.
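As a rough illustration of the idea for one language only, comments can be pulled out of Python source with the standard `tokenize` module and then passed to the checker as plain text. `extract_comments` is a hypothetical helper; LaTeX, C, C++ and Java would each need their own extraction step.

```python
import io
import tokenize


def extract_comments(source):
    """Return (line_number, comment_text) pairs for a Python source string."""
    comments = []
    reader = io.StringIO(source).readline
    for token in tokenize.generate_tokens(reader):
        if token.type == tokenize.COMMENT:
            # Strip the leading '#' and surrounding whitespace.
            comments.append((token.start[0], token.string.lstrip("#").strip()))
    return comments
```

Keeping the line numbers makes it possible to map a reported error back to the original file.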
What do you think?
As seen below, the LT API has changed, but this client only works with 3.2:
https://languagetool.org/http-api/swagger-ui/#/default
How can I make this work with the newer LT 4.4?
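For reference, a minimal sketch of talking to the newer JSON API directly with the standard library might look like this. It assumes the `/v2/check` endpoint from the linked API page with `text` and `language` form fields; the response fields used here reflect my understanding of the v2 format and should be verified against that page. Both function names are hypothetical, and the request is only built, not sent.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import Request


def build_v2_check_request(base_url, text, language):
    """Build a POST request for the `/v2/check` endpoint (not sent here)."""
    url = urljoin(base_url, "/v2/check")
    body = urlencode({"text": text, "language": language}).encode("utf-8")
    return Request(url, data=body)


def parse_v2_matches(payload):
    """Extract (offset, length, rule_id, replacements) from a v2 JSON reply."""
    result = []
    for match in json.loads(payload).get("matches", []):
        replacements = [r["value"] for r in match.get("replacements", [])]
        result.append((match["offset"], match["length"],
                       match["rule"]["id"], replacements))
    return result
```

The request object could be passed to `urllib.request.urlopen` against a running server, and the response body handed to `parse_v2_matches`.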
When one of our users (at coala) was running the tests, he got the following error:
E
E Traceback information is provided below:
E
E Traceback (most recent call last):
E File "/usr/local/lib/python3.4/dist-packages/language_check-0.8-py3.4.egg/language_check/__init__.py", line 522, in get_languages
E languages = cache['languages']
E KeyError: 'languages'
E
E During handling of the above exception, another exception occurred:
E
E Traceback (most recent call last):
E File "/usr/local/lib/python3.4/dist-packages/language_check-0.8-py3.4.egg/language_check/__init__.py", line 304, in _get_root
E with urlopen(url, data, cls._TIMEOUT) as f:
E File "/usr/lib/python3.4/urllib/request.py", line 161, in urlopen
E return opener.open(url, data, timeout)
E File "/usr/lib/python3.4/urllib/request.py", line 469, in open
E response = meth(req, response)
E File "/usr/lib/python3.4/urllib/request.py", line 579, in http_response
E 'http', request, response, code, msg, hdrs)
E File "/usr/lib/python3.4/urllib/request.py", line 507, in error
E return self._call_chain(*args)
E File "/usr/lib/python3.4/urllib/request.py", line 441, in _call_chain
E result = func(*args)
E File "/usr/lib/python3.4/urllib/request.py", line 587, in http_error_default
E raise HTTPError(req.full_url, code, msg, hdrs, fp)
E urllib.error.HTTPError: HTTP Error 403: Forbidden
E
E During handling of the above exception, another exception occurred:
E
E Traceback (most recent call last):
E File "/usr/local/lib/python3.4/dist-packages/coala-0.4.2.dev20160309195000-py3.4.egg/coalib/bears/Bear.py", line 97, in execute
E return list(self.run_bear_from_section(args, kwargs) or [])
E File "/home/vivek/gsoc16/coala-bears/bears/natural_language/LanguageToolBear.py", line 41, in run
E tool = LanguageTool(locale, motherTongue="en_US")
E File "/usr/local/lib/python3.4/dist-packages/language_check-0.8-py3.4.egg/language_check/__init__.py", line 195, in __init__
E self._language = LanguageTag(language)
E File "/usr/local/lib/python3.4/dist-packages/language_check-0.8-py3.4.egg/language_check/__init__.py", line 446, in __new__
E return str.__new__(cls, cls._normalize(tag))
E File "/usr/local/lib/python3.4/dist-packages/language_check-0.8-py3.4.egg/language_check/__init__.py", line 467, in _normalize
E for language in get_languages()}
E File "/usr/local/lib/python3.4/dist-packages/language_check-0.8-py3.4.egg/language_check/__init__.py", line 524, in get_languages
E languages = LanguageTool._get_languages()
E File "/usr/local/lib/python3.4/dist-packages/language_check-0.8-py3.4.egg/language_check/__init__.py", line 285, in _get_languages
E for e in cls._get_root(url, num_tries=1):
E File "/usr/local/lib/python3.4/dist-packages/language_check-0.8-py3.4.egg/language_check/__init__.py", line 310, in _get_root
E raise Error('{}: {}'.format(cls._url, e))
E language_check.Error: http://127.0.0.1:8081: HTTP Error 403: Forbidden
I went through your code and found that you were actually trying multiple ports (https://github.com/myint/language-check/blob/master/language_check/__init__.py#L177) so this couldn't be an issue that he had something else running on that port.
Could you help us figure out why this could happen?
And what could we do to prevent it?
PS: Ignore the E
at the start of each line. It comes from the pytest
utility when running tests.
The LanguageTool wiki describes how to use n-gram data to detect additional error types, and provides n-gram data for this purpose. Is it possible to support this in language-check
?
I don't see anything in language-check
about --languagemodel
or --config file
options (I've mostly been looking in __init__.py
). I don't know enough about how language-check
wraps the Java application to suggest a solution, nor whether supporting this would complicate other things (e.g., if using the n-gram directories, does this no longer check for other errors?), so apologies if I've posted an untenable request.
Hey,
We'd love to use language-check to get LanguageTool into coala. See https://groups.google.com/d/msg/coala-devel/JU7dGspgMv4/l7fIJrLNQgQJ for more info.
You already do travis CI, I would love it if you could validate your software also against windows using a service like AppVeyor. We just did that with the coala project (https://github.com/coala-analyzer/coala/ ) and I'd be happy to help through suggestions although I don't have the time to do it for you.
Hey, apparently you did a release a few days ago. We've got a few issues since then.
On linux:
Building wheels for collected packages: language-check
Running setup.py bdist_wheel for language-check
Complete output from command /usr/bin/python3 -c "import setuptools;__file__='/tmp/pip-build-t4e86xil/language-check/setup.py';exec(compile(open(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" bdist_wheel -d /tmp/tmp006l18b7pip-wheel-:
Could not parse Java version from """openjdk version "1.8.0_60"
OpenJDK Runtime Environment (build 1.8.0_60-b27)
OpenJDK 64-Bit Server VM (build 25.60-b23, mixed mode)
""".
Several of our devs are getting this. I have it on Fedora, and Fabian uses openSUSE, I guess (no guarantees on that).
It also breaks our builds on windows with a cryptic error: https://ci.appveyor.com/project/sils1297/coala/build/1.0.1932/job/ytt1eoa56278gi94#L73
Hi @myint :) Greetings from https://github.com/coala/coala
Have you recently already seen something like https://travis-ci.org/coala/coala/jobs/231106820#L1027 ?
Collecting language-check~=1.0; extra == "alldeps" (from coala-bears[alldeps])
Exception:
Traceback (most recent call last):
File "/home/travis/virtualenv/python3.6.1/lib/python3.6/site-packages/pip/basecommand.py", line 215, in main
status = self.run(options, args)
File "/home/travis/virtualenv/python3.6.1/lib/python3.6/site-packages/pip/commands/install.py", line 335, in run
wb.build(autobuilding=True)
File "/home/travis/virtualenv/python3.6.1/lib/python3.6/site-packages/pip/wheel.py", line 749, in build
self.requirement_set.prepare_files(self.finder)
File "/home/travis/virtualenv/python3.6.1/lib/python3.6/site-packages/pip/req/req_set.py", line 380, in prepare_files
ignore_dependencies=self.ignore_dependencies))
File "/home/travis/virtualenv/python3.6.1/lib/python3.6/site-packages/pip/req/req_set.py", line 620, in _prepare_file
session=self.session, hashes=hashes)
File "/home/travis/virtualenv/python3.6.1/lib/python3.6/site-packages/pip/download.py", line 809, in unpack_url
unpack_file_url(link, location, download_dir, hashes=hashes)
File "/home/travis/virtualenv/python3.6.1/lib/python3.6/site-packages/pip/download.py", line 715, in unpack_file_url
unpack_file(from_path, location, content_type, link)
File "/home/travis/virtualenv/python3.6.1/lib/python3.6/site-packages/pip/utils/__init__.py", line 599, in unpack_file
flatten=not filename.endswith('.whl')
File "/home/travis/virtualenv/python3.6.1/lib/python3.6/site-packages/pip/utils/__init__.py", line 484, in unzip_file
zip = zipfile.ZipFile(zipfp, allowZip64=True)
File "/opt/python/3.6.1/lib/python3.6/zipfile.py", line 1100, in __init__
self._RealGetContents()
File "/opt/python/3.6.1/lib/python3.6/zipfile.py", line 1168, in _RealGetContents
raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
We get this in all of our latest PRs :/
Error message:
error in language-check setup command: package_data must be a dictionary mapping package names to lists of wildcard patterns
LanguageTool has automatic language detection; could you expose that?
Hello, when I try to reproduce the example usage in the documentation (using Python 2.7.10 via the Anaconda distribution) I encounter an encoding error:
>>> import language_check
>>> tool = language_check.LanguageTool('en-US')
>>> text = 'A sentence with a error in the Hitchhiker’s Guide tot he Galaxy'
>>> matches = tool.check(text)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/afs/crc.nd.edu/user/d/dduhaime/anaconda/lib/python2.7/site-packages/language_check-0.7.2-py2.7.egg/language_check/__init__.py", line 243, in check
root = self._get_root(self._url, self._encode(text, srctext))
File "/afs/crc.nd.edu/user/d/dduhaime/anaconda/lib/python2.7/site-packages/language_check-0.7.2-py2.7.egg/language_check/__init__.py", line 253, in _encode
params = {u'language': self.language, u'text': text.encode(u'utf-8')}
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 41: ordinal not in range(128)
Does anyone have thoughts on what might be going on?
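A likely cause: on Python 2, a plain '...' literal is a byte string, and calling .encode('utf-8') on it first decodes it implicitly as ASCII, which fails on the UTF-8 bytes of the curly apostrophe. Passing a u'...' literal should avoid this. The sketch below reproduces the failing offset and shows explicit decoding; both helper names are hypothetical.

```python
def first_undecodable_offset(raw, encoding="ascii"):
    """Return the offset of the first byte `encoding` cannot decode, or None."""
    try:
        raw.decode(encoding)
    except UnicodeDecodeError as exc:
        return exc.start
    return None


def to_text(value, encoding="utf-8"):
    """Decode byte strings explicitly; pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# The example sentence as UTF-8 bytes, which is what a Python 2 str holds.
raw = u"A sentence with a error in the Hitchhiker\u2019s Guide tot he Galaxy".encode("utf-8")
offset = first_undecodable_offset(raw)  # byte 0xe2 of the apostrophe
text = to_text(raw)                     # decoded text, safe to pass on
```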
Is there a way to provide the language tool with a list of words that should NOT be marked as mistakes?
I have a lot of technical terms in my data that are wrongly corrected when automatically applying the suggestions of the language tool.
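One way to approximate this on the caller's side is to drop matches whose flagged text is on an allow list before applying suggestions. This is a sketch only: it assumes each match exposes `offset` and `errorlength` attributes, uses a stand-in `FakeMatch` type so it runs without a server, and `drop_allowed_terms` is a hypothetical helper.

```python
from collections import namedtuple

# Stand-in for a real match object; assumes real matches expose
# `offset` and `errorlength` attributes.
FakeMatch = namedtuple("FakeMatch", ["offset", "errorlength", "replacements"])


def drop_allowed_terms(text, matches, allowed_terms):
    """Return matches whose flagged text is not in the allow list."""
    allowed = {term.lower() for term in allowed_terms}
    kept = []
    for match in matches:
        flagged = text[match.offset:match.offset + match.errorlength]
        if flagged.lower() not in allowed:
            kept.append(match)
    return kept
```

The filtered list could then be used in place of the full list when applying corrections.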
python 3.4
java 9 64-bit
Code I Ran:
import language_check
tool = language_check.LanguageTool( 'en-US' )
Running in console or in sublime text, I get the following error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\conno\OneDrive\Desktop\language-check-master\language_check\__init__.py", line 196, in __init__
self._start_server_on_free_port()
File "C:\Users\conno\OneDrive\Desktop\language-check-master\language_check\__init__.py", line 333, in _start_server_on_free_port
cls._start_local_server()
File "C:\Users\conno\OneDrive\Desktop\language-check-master\language_check\__init__.py", line 377, in _start_local_server
raise Error(err_msg)
language_check.Error: Exception in thread "main" java.lang.NoClassDefFoundError: javax/xml/bind/JAXBException
at net.loomchild.segment.srx.io.Srx2SaxParser.<init>(Srx2SaxParser.java:173)
at org.languagetool.tokenizers.SrxTools.createSrxDocument(SrxTools.java:51)
at org.languagetool.tokenizers.SRXSentenceTokenizer.<init>(SRXSentenceTokenizer.java:53)
at org.languagetool.tokenizers.SimpleSentenceTokenizer.<init>(SimpleSentenceTokenizer.java:37)
at org.languagetool.Language.<clinit>(Language.java:60)
at java.base/java.lang.Class.forName0(Native Method)
at java.base/java.lang.Class.forName(Unknown Source)
at org.languagetool.Languages.createLanguageObjects(Languages.java:111)
at org.languagetool.Languages.getAllLanguages(Languages.java:97)
at org.languagetool.Languages.<clinit>(Languages.java:39)
at org.languagetool.language.LanguageIdentifier.getLanguageCodes(LanguageIdentifier.java:77)
at org.languagetool.language.LanguageIdentifier.<init>(LanguageIdentifier.java:64)
at org.languagetool.server.LanguageToolHttpHandler.<init>(LanguageToolHttpHandler.java:85)
at org.languagetool.server.HTTPServer.<init>(HTTPServer.java:99)
at org.languagetool.server.HTTPServer.main(HTTPServer.java:145)
Caused by: java.lang.ClassNotFoundException: javax.xml.bind.JAXBException
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(Unknown Source)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(Unknown Source)
at java.base/java.lang.ClassLoader.loadClass(Unknown Source)
... 15 more
Hi,
I was trying to install language-check in docker and I got this error:
Collecting language-check
Using cached language-check-0.7.1.tar.gz
Complete output from command python setup.py egg_info:
Traceback (most recent call last):
File "<string>", line 20, in <module>
File "/tmp/pip-build-cqb7myny/language-check/setup.py", line 597, in <module>
sys.exit(main())
File "/tmp/pip-build-cqb7myny/language-check/setup.py", line 593, in main
setup(**cfg_to_args(config))
File "/tmp/pip-build-cqb7myny/language-check/setup.py", line 355, in cfg_to_args
kwargs['version'] = get_version()
File "/tmp/pip-build-cqb7myny/language-check/setup.py", line 131, in get_version
for line in input_file:
File "/usr/lib/python3.4/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 27: ordinal not in range(128)
----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-cqb7myny/language-check
This can be fixed by changing the following in setup.py
def get_version():
.............
with open('language_check/__init__.py') as input_file:
.....
to:
import codecs
def get_version():
.............
with codecs.open('language_check/__init__.py', 'r', 'utf-8') as input_file:
.....
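For illustration, a complete version of such a function might look like the sketch below. The elided parts of the real get_version() are not shown in this report, so the body here is a guess: it assumes the version is stored as a `__version__ = '...'` line, and it uses `io.open`, which accepts an `encoding` argument on both Python 2 and 3.

```python
import io
import re


def get_version(path="language_check/__init__.py"):
    """Read `__version__` from a module file, decoding it as UTF-8."""
    # Explicit encoding avoids falling back to ASCII in minimal locales.
    with io.open(path, encoding="utf-8") as input_file:
        for line in input_file:
            match = re.match(r"__version__\s*=\s*['\"]([^'\"]+)['\"]", line)
            if match:
                return match.group(1)
    raise ValueError("__version__ not found in " + path)
```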
There are lint errors occurring in Travis builds.
Please note, this is a task reserved for Google Code-in students.
Lots of additional rules, several API changes, requires Java 7.
Our build system, Travis, has problems installing language-check
:
----------------------------------------
Failed building wheel for language-check
Running setup.py clean for language-check
Failed to build language-check
Installing collected packages: language-check, munkres3, mypy-lang, jsonschema, decorator, ipython-genutils, enum34, traitlets, jupyter-core, nbformat, nltk, click, future, proselint, pycodestyle, pydocstyle, lazy-object-proxy, wrapt, astroid, mccabe, pylint, mando, colorama, radon, restructuredtext-lint, rstcheck, pyparsing, packaging, safety, scspell3k, pathlib, chardet, ansicolor, vim-vint, vulture, yamllint, yapf
Running setup.py install for language-check ... error
Complete output from command /home/travis/virtualenv/python3.3.6/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-x7h2r3/language-check/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-u9p7qy-record/install-record.txt --single-version-externally-managed --compile --install-headers /home/travis/virtualenv/python3.3.6/include/site/python3.3/language-check:
Could not parse Java version from """java version "9-ea"
Java(TM) SE Runtime Environment (build 9-ea+140)
Java HotSpot(TM) 64-Bit Server VM (build 9-ea+140, mixed mode)
""".
----------------------------------------
Command "/home/travis/virtualenv/python3.3.6/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-x7h2r3/language-check/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-u9p7qy-record/install-record.txt --single-version-externally-managed --compile --install-headers /home/travis/virtualenv/python3.3.6/include/site/python3.3/language-check" failed with error code 1 in /tmp/pip-build-x7h2r3/language-check/
As the installation can take quite a while, it would be useful to emit some progress indicators during the process so that CI doesn't fail because it hasn't seen any output.
coala is seeing lots of timeout failures due to this problem: https://circleci.com/gh/jayvdb/coala-bears
(This is aggravated by a bug in our Circle CI setup (coala/coala-bears#1863), but fixing that doesn't entirely resolve this issue.)
Hey, this lacks a bit of documentation. In particular, the Match class could expose a few more public methods and have at least in-code documentation, so that one can easily do something other than printing matches.
Hi, I'm on RHEL running the Anaconda distribution of Python 2.7.9. I run pip install 3to2, then copy the language_check source, cd into the source, and run python2 setup.py install
. Everything installs correctly, but upon import I get:
>>> import language_check
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "language_check/__init__.py", line 238
def check(self, text: str, srctext=None) -> [Match]:
^
SyntaxError: invalid syntax
Hey, it seems language-check has something against Java version 1.8? You claim your setup takes care of getting the right Java stuff. Here's what we're experiencing:
https://ci.appveyor.com/project/sils1297/coala/build/1.0.1079/job/uwhqxai01o767g6i#L174
New to the code set and initial download indicates it is using 2.7. I am on Ubuntu 15.04. TIA.
I don't understand how I can ignore English words when I use Russian as the default language.
This works by default if I run LanguageTool itself, using:
I write texts in Russian that can contain many English words. Currently language-check flags the English words as errors.
My file SashaExample.txt:
Sasha Belissimo!
Саша Совершенна!
My file eric_languagetool.py:
from eric_config import all_txt_in_eric_room_wihtout_subfolders
from eric_config import log
import language_check
import os

tool_language = language_check.LanguageTool('ru-RU')
failure_tests = False

# Check every text file and record whether LanguageTool reported anything.
for filename in all_txt_in_eric_room_wihtout_subfolders:
    filename_without_path = os.path.basename(filename)
    log.debug(filename_without_path + "\n")
    file_text = open(filename_without_path).read()
    error_list = tool_language.check(file_text)
    print(*error_list, sep='\n\n')
    if not error_list:
        log.debug(
            "Not detect errors and typos in" +
            filename_without_path +
            "\n\n")
    else:
        log.warning(
            "Detect error(s) or/and typo(s) in " + filename_without_path + "\n\n")
        failure_tests = True

# Summarize the result for the whole run.
if not failure_tests:
    log.notice("LanguageTool no detect errors and typos for all files.")
if failure_tests:
    log.warning(
        "LanguageTool detect error(s) or/and typo(s). Please, review it.")
If I run in console:
D:\SashaPythonista>java -jar "D:/Chocolatey/lib/languagetool/tools/LanguageTool-3.6/languagetool.jar" SashaExample.txt
No errors:
Also, I get no errors if I use the Sublime Text LanguageTool package.
I run eric_languagetool.py for SashaExample.txt:
D:\SashaPythonista>language-check --heelp
'language-check' is not recognized as an internal or external command,
operable program or batch file.
D:\SashaPythonista>python "tests/eric_languagetool.py"
Line 1, column 1, Rule ID: MORFOLOGIK_RULE_RU_RU
Message: Найдена орфографическая ошибка
Sasha Belissimo! Саша Совершенна!
^^^^^
Line 1, column 7, Rule ID: MORFOLOGIK_RULE_RU_RU
Message: Найдена орфографическая ошибка
Sasha Belissimo! Саша Совершенна!
^^^^^^^^^
The same in Interpreter:
>>> import language_check
>>> tool_language = language_check.LanguageTool('ru-RU')
>>> file_text = u'Sasha Belissimo! Саша Совершенна!'
>>> error_list = tool_language.check(file_text)
>>> print(*error_list, sep='\n\n')
Line 1, column 1, Rule ID: MORFOLOGIK_RULE_RU_RU
Message: Найдена орфографическая ошибка
Sasha Belissimo! Саша Совершенна!
^^^^^
Line 1, column 7, Rule ID: MORFOLOGIK_RULE_RU_RU
Message: Найдена орфографическая ошибка
Sasha Belissimo! Саша Совершенна!
^^^^^^^^^
>>>
English words in Russian texts are flagged as errors.
I can't find how to solve this problem in:
Thanks.
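A possible workaround on the caller's side is to drop matches whose flagged span consists only of Latin letters. The sketch below is an assumption-laden illustration, not a language-check feature: `drop_latin_matches` is a hypothetical helper, and it assumes each match object exposes `offset` (absolute character offset in the text) and `errorlength` attributes. `FakeMatch` is a stand-in used only so the example runs without a LanguageTool server.

```python
import re
from collections import namedtuple

# Words made only of ASCII letters, apostrophes and hyphens.
LATIN_WORD = re.compile(r"^[A-Za-z'-]+$")


def drop_latin_matches(text, matches):
    """Keep only matches whose flagged text is not a purely Latin word.

    Assumes every match has `offset` and `errorlength` attributes.
    """
    kept = []
    for match in matches:
        flagged = text[match.offset:match.offset + match.errorlength]
        if not LATIN_WORD.match(flagged):
            kept.append(match)
    return kept


# Stand-in for real match objects (hypothetical, for illustration only).
FakeMatch = namedtuple("FakeMatch", ["offset", "errorlength"])

text = u"Sasha Belissimo! Саша Совершенна!"
matches = [FakeMatch(0, 5), FakeMatch(6, 9), FakeMatch(17, 4)]
print(drop_latin_matches(text, matches))
```

This filter also hides genuine typos in English words, so it only fits texts where Latin-script words should never be checked.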
wangx@wangx-PC:~$ sudo pip3 install --upgrade language-check
The directory '/home/wangx/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/wangx/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting language-check
Downloading language-check-0.8.tar.gz
Installing collected packages: language-check
Running setup.py install for language-check ... error
Complete output from command /usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-qkc8rg5y/language-check/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-42ihuhou-record/install-record.txt --single-version-externally-managed --compile:
Traceback (most recent call last):
File "/usr/lib/python3.5/urllib/request.py", line 1243, in do_open
h.request(req.get_method(), req.selector, req.data, headers)
File "/usr/lib/python3.5/http/client.py", line 1106, in request
self._send_request(method, url, body, headers)
File "/usr/lib/python3.5/http/client.py", line 1151, in _send_request
self.endheaders(body)
File "/usr/lib/python3.5/http/client.py", line 1102, in endheaders
self._send_output(message_body)
File "/usr/lib/python3.5/http/client.py", line 934, in _send_output
self.send(msg)
File "/usr/lib/python3.5/http/client.py", line 877, in send
self.connect()
File "/usr/lib/python3.5/http/client.py", line 1252, in connect
super().connect()
File "/usr/lib/python3.5/http/client.py", line 849, in connect
(self.host,self.port), self.timeout, self.source_address)
File "/usr/lib/python3.5/socket.py", line 693, in create_connection
for res in getaddrinfo(host, port, 0, SOCK_STREAM):
File "/usr/lib/python3.5/socket.py", line 732, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -3] Temporary failure in name resolution
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-build-qkc8rg5y/language-check/setup.py", line 597, in <module>
sys.exit(main())
File "/tmp/pip-build-qkc8rg5y/language-check/setup.py", line 592, in main
run_setup_hooks(config)
File "/tmp/pip-build-qkc8rg5y/language-check/setup.py", line 561, in run_setup_hooks
language_tool_hook(config)
File "/tmp/pip-build-qkc8rg5y/language-check/setup.py", line 586, in language_tool_hook
download_lt()
File "/tmp/pip-build-qkc8rg5y/language-check/download_lt.py", line 128, in download_lt
with closing(urlopen(url)) as u:
File "/usr/lib/python3.5/urllib/request.py", line 162, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.5/urllib/request.py", line 465, in open
response = self._open(req, data)
File "/usr/lib/python3.5/urllib/request.py", line 483, in _open
'_open', req)
File "/usr/lib/python3.5/urllib/request.py", line 443, in _call_chain
result = func(*args)
File "/usr/lib/python3.5/urllib/request.py", line 1286, in https_open
context=self._context, check_hostname=self._check_hostname)
File "/usr/lib/python3.5/urllib/request.py", line 1245, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [Errno -3] Temporary failure in name resolution>
----------------------------------------
Command "/usr/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-qkc8rg5y/language-check/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-42ihuhou-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-qkc8rg5y/language-check/
wangx@wangx-PC:~$
The class method _server_is_alive, as written, only checks for the local server. When a LanguageTool instance is created with a remote server, _server_is_alive fails to check the remote server and starts a local one instead.
I've not used language-check before, so I may have missed some document explaining installation, but I'm seeing this on Python 2.7, pulling from HEAD of master, when I try to run setup.py or do import language_check:
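To illustrate what a host-aware liveness check could look like, here is a minimal sketch using only the standard library. `server_is_reachable` is a hypothetical function, not the library's `_server_is_alive`; it only tests whether a TCP connection can be opened, which is a weaker condition than the server answering LanguageTool requests.

```python
import socket


def server_is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


# Example: probe the default local address seen elsewhere on this page.
print(server_is_reachable("127.0.0.1", 8081, timeout=0.5))
```

A check like this would let the caller skip starting a local server when the configured remote host is already accepting connections.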
>>> import language_check
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "language_check\__init__.py", line 241
def check(self, text: str, srctext=None) -> [Match]:
^
SyntaxError: invalid syntax
Hello. I am using language-check. I have a huge number of documents to pass through LanguageTool in order to get their mistakes back.
It's basically a simple loop over all docs, calling your library.
for doc in docs:
    LanguageTool.check(doc)
The start goes smoothly, but after a while performance drops.
Here are the symptoms: checking with the Linux "top" command, LanguageTool is using very little CPU or RAM, so I assume it is frozen. Monitoring with Wireshark, I do not receive answers to my requests. When I stop my program, LanguageTool sends back a load of HTTP responses with the mistakes corresponding to my docs.
I tried to debug a little, and here are my thoughts: it might come from the language-check library not waiting for the responses to its requests, so my loop would flood LanguageTool with requests until it cannot process anything anymore except incoming requests.
I assume this part of the code is doing the call, especially the urlopen.
@classmethod
def _get_root(cls, url, data=None, num_tries=2):
    for n in xrange(num_tries):
        try:
            with urlopen(url, data, cls._TIMEOUT) as f:
                return ElementTree.parse(f).getroot()
        except (IOError, httplib.HTTPException), e:
            if n + 1 < num_tries:
                cls._start_server()
            else:
                raise Error(u'{}: {}'.format(cls._url, e))
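The retry pattern in the quoted method can be isolated and exercised without a server. The sketch below is a Python 3 restatement under assumptions: `call_with_retry`, `request` and `restart` are hypothetical names, with `request` standing in for the blocking `urlopen` call and `restart` for `_start_server`. Each call to `request` returns before the next one starts, so this pattern by itself sends requests one at a time.

```python
import http.client


def call_with_retry(request, restart, num_tries=2):
    """Call request(); on a connection error, call restart() and try again."""
    last_error = None
    for attempt in range(num_tries):
        try:
            return request()
        except (OSError, http.client.HTTPException) as error:
            last_error = error
            if attempt + 1 < num_tries:
                restart()
    raise RuntimeError(
        "request failed after {} tries: {}".format(num_tries, last_error))


# Usage with stand-ins: the first call fails, the second succeeds.
state = {"calls": 0}


def flaky_request():
    state["calls"] += 1
    if state["calls"] == 1:
        raise OSError("connection reset")
    return "<matches/>"


print(call_with_retry(flaky_request, restart=lambda: None))
```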
When installing language-check with "pip install", the default LanguageTool is version 3.2. When I try to replace the old version with LanguageTool-3.7, an error message occurs. How can I use the latest version of LanguageTool? Thanks.
I am testing a few grammatical error conditions and not getting the desired results:
Missing space between a full stop and the first word of the next sentence:
Example:
Input: capitalizes.What does it do for missing spaces
Output: capitalizes.What does it do for missing spaces
Ideally, the output should have a space between the full stop and "What"
What is the best way to add this additional rule?
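One option that needs no new LanguageTool rule is to normalize the text before checking it. The following sketch is an illustration only: `add_missing_sentence_spaces` is a hypothetical helper, and the pattern assumes a sentence boundary is a lowercase letter, then `.`, `!` or `?`, then an uppercase letter. That assumption misfires on strings such as mixed-case URLs or file names.

```python
import re

# Sentence-ending punctuation squeezed between a lowercase and an
# uppercase letter, e.g. "capitalizes.What".
MISSING_SPACE = re.compile(r"(?<=[a-z])([.!?])(?=[A-Z])")


def add_missing_sentence_spaces(text):
    """Insert a space after sentence-ending punctuation where it is missing."""
    return MISSING_SPACE.sub(r"\1 ", text)


print(add_missing_sentence_spaces(
    "capitalizes.What does it do for missing spaces"))
```

The cleaned text can then be passed to the checker as usual.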
The LanguageTool server keeps freezing randomly. The client never receives the HTTP response.
This is the state of the client at the time the server is frozen:
Traceback (most recent call last):
File "langtool.py", line 25, in <module>
langtool.check(txt)
File "/home/seb/.virtualenvs/p3/lib/python3.4/site-packages/language_check/__init__.py", line 240, in check
root = self._get_root(self._url, self._encode(text, srctext))
File "/home/seb/.virtualenvs/p3/lib/python3.4/site-packages/language_check/__init__.py", line 299, in _get_root
with urlopen(url, data, cls._TIMEOUT) as f:
File "/usr/lib/python3.4/urllib/request.py", line 161, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib/python3.4/urllib/request.py", line 463, in open
response = self._open(req, data)
File "/usr/lib/python3.4/urllib/request.py", line 481, in _open
'_open', req)
File "/usr/lib/python3.4/urllib/request.py", line 441, in _call_chain
result = func(*args)
File "/usr/lib/python3.4/urllib/request.py", line 1210, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "/usr/lib/python3.4/urllib/request.py", line 1185, in do_open
r = h.getresponse()
File "/usr/lib/python3.4/http/client.py", line 1171, in getresponse
response.begin()
File "/usr/lib/python3.4/http/client.py", line 351, in begin
version, status, reason = self._read_status()
File "/usr/lib/python3.4/http/client.py", line 313, in _read_status
line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
File "/usr/lib/python3.4/socket.py", line 374, in readinto
return self._sock.recv_into(b)
KeyboardInterrupt
This is the sample script that triggers the error:
import language_check
langtool = language_check.LanguageTool("en-US")
txt = """
Of an intermediate balance, under the circumstances, there is no
possibility. The city has its cunning wiles, no less than the
infinitely smaller and more human tempter. There are large forces
which allure with all the soulfulness of expression possible in the
most cultured human. The gleam of a thousand lights is often as
effective as the persuasive light in a wooing and fascinating eye.
Half the undoing of the unsophisticated and natural mind is
accomplished by forces wholly superhuman. A blare of sound, a roar
of life, a vast array of human hives, appeal to the astonished
senses in equivocal terms. Without a counsellor at hand to whisper
cautious interpretations, what falsehoods may not these things breathe
into the unguarded ear! Unrecognised for what they are, their
beauty, like music, too often relaxes, then weakens, then perverts the
simpler human perceptions.
"""
for i in range(100000):
    print("\r{}".format(i), flush=True, end="")
    langtool.check(txt)
It normally freezes at around the 900th-1000th iteration.
OS: Linux Mint
Kernel: 3.13.0-24-generic
Python: 3.4.3
java -version:
java version "1.8.0_60"
Java(TM) SE Runtime Environment (build 1.8.0_60-b27)
Java HotSpot(TM) 64-Bit Server VM (build 25.60-b23, mixed mode)
Is there a new Dockerfile available? This one runs with Java 8.
I am processing a set of documents. At first everything is okay, but then at a certain point this error occurs, as if the request was blocked.
Traceback (most recent call last):
File "/home/rodriguesfas/.vscode/extensions/ms-python.python-2018.9.2/pythonFiles/experimental/ptvsd_launcher.py", line 118, in <module>
vspd.debug(filename, port_num, '', '', run_as)
File "/home/rodriguesfas/.vscode/extensions/ms-python.python-2018.9.2/pythonFiles/experimental/ptvsd/ptvsd/debugger.py", line 37, in debug
run(address, filename, *args, **kwargs)
File "/home/rodriguesfas/.vscode/extensions/ms-python.python-2018.9.2/pythonFiles/experimental/ptvsd/ptvsd/_local.py", line 79, in run_file
run(argv, addr, **kwargs)
File "/home/rodriguesfas/.vscode/extensions/ms-python.python-2018.9.2/pythonFiles/experimental/ptvsd/ptvsd/_local.py", line 140, in _run
_pydevd.main()
File "/home/rodriguesfas/.vscode/extensions/ms-python.python-2018.9.2/pythonFiles/experimental/ptvsd/ptvsd/_vendored/pydevd/pydevd.py", line 1760, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "/home/rodriguesfas/.vscode/extensions/ms-python.python-2018.9.2/pythonFiles/experimental/ptvsd/ptvsd/_vendored/pydevd/pydevd.py", line 1107, in run
return self._exec(is_module, entry_point_fn, module_name, file, globals, locals)
File "/home/rodriguesfas/.vscode/extensions/ms-python.python-2018.9.2/pythonFiles/experimental/ptvsd/ptvsd/_vendored/pydevd/pydevd.py", line 1114, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/rodriguesfas/Mestrado/workspace/specana.prototype/script/test_language_check.py", line 58, in <module>
check_sentence(sentence)
File "/home/rodriguesfas/Mestrado/workspace/specana.prototype/script/test_language_check.py", line 36, in check_sentence
matches = tool.check(sentence)
File "/home/rodriguesfas/.local/lib/python2.7/site-packages/grammar_check/__init__.py", line 246, in check
root = self._get_root(self._url, self._encode(text, srctext))
File "/home/rodriguesfas/.local/lib/python2.7/site-packages/grammar_check/__init__.py", line 309, in _get_root
cls._start_server()
File "/home/rodriguesfas/.local/lib/python2.7/site-packages/grammar_check/__init__.py", line 378, in _start_server
raise ServerError(u'{}: {}'.format(cls._url, e))
grammar_check.ServerError: http://127.0.0.1:8081: [Errno 104] Connection reset by peer