Language Machines Badge Codacy Badge Project Status: Unsupported – The project has reached a stable, usable state but the author(s) have ceased all work on it. A new maintainer may be desired. DOI

========================================================================
GECCO - Generic Environment for Context-Aware Correction of Orthography
========================================================================

by Maarten van Gompel
Centre for Language and Speech Technology, Radboud University Nijmegen
Sponsored by Revisely (http://revise.ly)
Licensed under the GNU Public License v3

Gecco is a generic, modular and distributed framework for spelling correction, aimed at building a complete context-aware spelling correction system from your own data set. Most modules are language-independent and trainable from a source corpus; training is explicitly included in the framework. The framework aims to be easily extensible: modules can be written in Python 3. Moreover, the framework is scalable and can be distributed over multiple servers.

Given an input text, Gecco will add various suggestions for correction.

The system can be invoked from the command-line, as a Python binding, as a RESTful webservice, or through the web application (two interfaces).

Modules:

  • Generic built-in modules:
    • Confusible Module
      • A confusible module is able to discern which of a set of often-confused words is correct in a given context. For example, the words "then" and "than" are commonly confused in English.
      • Your configuration should specify between which confusibles the module disambiguates.
      • The module is implemented using the IGTree classifier (a k-Nearest Neighbour approximation) in Timbl.
    • Suffix Confusible Module
      • A variant of the confusible module that checks commonly confused morphological suffixes, rather than words.
      • Your configuration should specify between which suffixes the module disambiguates.
      • The module is implemented using the IGTree classifier (a k-Nearest Neighbour approximation) in Timbl.
    • Language Model Module
      • A language model predicts what words are likely to follow others, similar to predictive typing applications commonly found on smartphones.
      • The module is implemented using the IGTree classifier (a k-Nearest Neighbour approximation) in Timbl.
    • Aspell Module
      • Aspell is open-source lexicon-based software for spelling correction. This module enables aspell to be used from gecco. This is not a context-sensitive method.
    • Hunspell Module
      • Hunspell is open-source lexicon-based software for spelling correction. This module enables hunspell to be used from gecco. This is not a context-sensitive method.
    • Lexicon Module
      • The lexicon module enables you to automatically generate a lexicon from corpus data and use it. This is not a context-sensitive method.
      • Typed words are matched against the lexicon, and the module will offer suggestions within a certain Levenshtein distance (see the sketch after this list).
    • Errorlist Module
      • The errorlist module is a very simple module that checks whether a word is in a known error list, and if so, provides the suggestions from that list. This is not a context-sensitive method.
    • Split Module
      • The split module detects words that are split but should be written together.
      • Implemented using Colibri Core
    • Runon Module
      • The runon module detects words that are written as one but should be split.
      • Implemented using Colibri Core
    • Punctuation & Recase Module
      • The punctuation & recase module attempts to detect missing punctuation, superfluous punctuation, and missing capitals.
      • The module is implemented using the IGTree classifier (a k-Nearest Neighbour approximation) in Timbl.
  • Modules suggested but not implemented yet:
    • Language Detection Module
      • (Not written yet, option for later)
    • Sound-alike Module
      • (Not written yet, option for later)
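
The lexicon module's Levenshtein matching can be illustrated with the following minimal sketch. This is not Gecco's actual implementation; it merely shows the idea of building a frequency lexicon from corpus text and offering suggestions within a maximum edit distance, using the python-Levenshtein package that Gecco depends on (file names are hypothetical):

# Illustrative sketch only, not Gecco's LexiconModule
from collections import Counter
import Levenshtein   # provided by the python-Levenshtein dependency

def build_lexicon(corpuspath, freqthreshold=2):
    """Count word frequencies in a plain-text corpus, keep sufficiently frequent words."""
    counts = Counter()
    with open(corpuspath, encoding="utf-8") as f:
        for line in f:
            counts.update(line.split())
    return {word: freq for word, freq in counts.items() if freq >= freqthreshold}

def suggestions(word, lexicon, maxdistance=2):
    """Return in-lexicon words within maxdistance edits, closest (and most frequent) first."""
    if word in lexicon:
        return []   # known word, nothing to suggest
    candidates = [(Levenshtein.distance(word, candidate), -freq, candidate)
                  for candidate, freq in lexicon.items()]
    return [candidate for distance, _, candidate in sorted(candidates) if distance <= maxdistance]

if __name__ == "__main__":
    lexicon = build_lexicon("train.txt")       # hypothetical corpus file
    print(suggestions("spellling", lexicon))   # e.g. ['spelling']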

Features

  • Easily extensible by adding modules using the gecco module API
  • Language independent
  • Built-in training pipeline (given corpus input): create models from sources (see the sketch after this list)
  • Built-in testing pipeline (given an error-annotated test corpus): returns a report of evaluation metrics per module
  • Distributed, Multithreaded & Scalable:
    • Load balancing: backend servers can run on multiple hosts, and the master process distributes the load amongst them
    • Multithreaded: modules can be invoked in parallel, and module servers themselves may be multithreaded too
  • Input and output is FoLiA XML (http://proycon.github.io/folia)
    • Automatic input conversion from plain text using ucto
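
As a rough illustration of the training pipeline for a context-aware module such as the confusible module, the sketch below extracts local-context training instances from a plain-text corpus in the whitespace-separated column format that Timbl accepts (context features first, the observed confusible as the class label in the last column). This is a simplified sketch under those assumptions, not the code Gecco itself uses; an IGTree model can then be trained on the resulting file with timbl.

# Simplified sketch, not Gecco's actual training code
def confusible_instances(corpuspath, confusibles=("then", "than"), window=2):
    """Yield Timbl-style instances: left/right context features plus the observed
    confusible as the class label."""
    confusibleset = set(confusibles)
    with open(corpuspath, encoding="utf-8") as f:
        for line in f:
            words = line.split()
            for i, word in enumerate(words):
                if word.lower() in confusibleset:
                    left = words[max(0, i - window):i]
                    right = words[i + 1:i + 1 + window]
                    # pad so every instance has the same number of features
                    left = ["_"] * (window - len(left)) + left
                    right = right + ["_"] * (window - len(right))
                    yield left + right + [word.lower()]

def write_trainfile(corpuspath, outpath, **kwargs):
    """Write one space-separated instance per line, class label last."""
    with open(outpath, "w", encoding="utf-8") as out:
        for instance in confusible_instances(corpuspath, **kwargs):
            out.write(" ".join(instance) + "\n")

# write_trainfile("train.txt", "confusible.train")   # hypothetical file names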

Gecco is the successor of Valkuil.net and Fowlt.net.


Installation

Gecco relies on a large number of dependencies, including but not limited to Timbl, ucto, Colibri Core, Aspell and Hunspell, along with their Python bindings.

To install Gecco, we strongly recommend using our LaMachine distribution, which can be obtained from https://github.com/proycon/lamachine .

LaMachine includes Gecco and can be run in multiple ways: as a virtual machine, as a docker app, or as a compilation script setting up a Python virtual environment.

Gecco uses memory-based technologies and, depending on the models you train, may take up considerable memory. We therefore recommend at least 16GB RAM; training may require even more. For various modules, model size may be reduced by increasing frequency thresholds, but this comes at the cost of reduced accuracy.

Gecco will only run on POSIX-compliant operating systems (i.e. Linux, BSD, Mac OS X), not on Windows.


Configuration

To build an actual spelling correction system, you need corpus sources and a gecco configuration that enables the modules you desire with the parameters you want.

A Gecco system consists of a configuration, either in the form of a simple Python script or an external YAML configuration file.

Example YAML configuration:

name: fowlt
path: /path/to/fowlt
language: en
modules:
    - module: gecco.modules.confusibles.TIMBLWordConfusibleModule
      id: confusibles
      source:
        - train.txt
      model:
        - confusible.model
      confusibles: [then,than]

To list all available modules and the parameters they may take, run gecco --helpmodules.

Alternatively, the configuration can be done in Python directly, in which case the script will be the tool that exposes all functionality:

from gecco import Corrector
from gecco.modules.confusibles import TIMBLWordConfusibleModule

corrector = Corrector(id="fowlt", root="/path/to/fowlt/")
corrector.append( TIMBLWordConfusibleModule("thenthan", source="train.txt",test_crossvalidate=True,test=0.1,tune=0.1,model="confusibles.model", confusible=('then','than')))
corrector.append( TIMBLWordConfusibleModule("its", source="train.txt",test_crossvalidate=True,test=0.1,tune=0.1,model="confusibles.model", confusible=('its',"it's")))
corrector.append( TIMBLWordConfusibleModule("errorlist", source="errorlist.txt",model="errorlist.model", servers=[("blah",1234),("blah2",1234)]  )
corrector.append( TIMBLWordConfusibleModule("lexicon", source=["lexicon.txt","lexicon2.txt"],model=["lexicon.model","lexicon2.model"], servers=[("blah",1235)]  )
corrector.main()
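
Once saved as e.g. myspellingcorrector.py, running this script exposes the same subcommands as the gecco command-line tool, as described under Command line usage below.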

It is recommended to adopt a file/directory structure as described below. If you plan on using multiple hosts, you should store it on a shared network drive so all hosts can access the models:

  • yourconfiguration.yml
  • sources/
  • models/

An example spelling correction system for English is provided with Gecco and resides in the example/ directory.


Server setup

Running gecco <yourconfig.yml> run <input.folia.xml> processes a given FoLiA or plain-text document; it starts a master process that invokes all the modules, which may be distributed over multiple servers. If multiple server instances of the same module are available, the load will be distributed over them. Output is delivered in the FoLiA XML format and will contain suggestions for correction.

To start module servers on a host, issue gecco <yourconfig.yml> startservers. You can optionally specify which servers to start if you do not want to start them all. You can start servers multiple times, either on the same host or on multiple hosts. The master process will distribute the load amongst all servers.

To stop the servers, run gecco <yourconfig.yml> stopservers on each host that has servers running. A list of all running servers can be obtained with gecco <yourconfig.yml> listservers.

Modules can also run locally within the master process rather than as servers; this is done either by adding local: true in the configuration or by adding the --local option when starting a run. However, this has a significant negative impact on performance and should therefore be avoided.
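
Conceptually, distributing requests over multiple servers of the same module amounts to something like the following sketch, which simply cycles through the configured (host, port) pairs. This only illustrates the idea and is not Gecco's actual master-process code; the server names are the placeholder ones from the configuration example above.

# Conceptual sketch of load distribution, not Gecco's actual code
from itertools import cycle

class RoundRobinDispatcher:
    def __init__(self, servers):
        # servers: list of (host, port) pairs, as in the 'servers' option above
        self._servers = cycle(servers)

    def pick(self):
        """Return the (host, port) that should handle the next request."""
        return next(self._servers)

dispatcher = RoundRobinDispatcher([("blah", 1234), ("blah2", 1234)])
for _ in range(4):
    print(dispatcher.pick())   # alternates between the two configured servers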


Architecture

[Gecco architecture diagram]


Command line usage

All gecco functionality is invoked through a single command-line tool:

$ gecco myconfig.yml [subcommand]

or

$ myspellingcorrector.py [subcommand]

Syntax:

usage: gecco [-h]
            {run,startservers,stopservers,startserver,train,evaluate,reset}
            ...

Gecco is a generic, scalable and modular spelling correction framework

Commands:
{run,startservers,stopservers,startserver,train,evaluate,reset}
    run                 Run the spelling corrector on the specified input file
    startservers        Starts all the module servers that are configured to
                        run on the current host. Issue once for each host.
    stopservers         Stops all the module servers that are configured to
                        run on the current host. Issue once for each host.
    listservers         Lists all the module servers on all hosts
    startserver         Start one module's server on the specified port, use
                        'startservers' instead
    train               Train modules
    evaluate            Runs the spelling corrector on input data and compares
                        it to reference data, produces an evaluation report
    reset               Reset modules, deletes all trained models that have
                        sources. Issue prior to train if you want to start
                        anew.

Vital documentation regarding all modules and the settings they take can be obtained through:

$ gecco --helpmodules

Gecco as a webservice

RESTful webservice access will be available through CLAM. We are still working on better integration of this in Gecco. For now, an example implementation can be seen here: https://github.com/proycon/valkuil-gecco/tree/master/valkuilwebservice


Gecco as a web-application

A web-application will eventually be available, modelled after Valkuil.net/Fowlt.net.


Known Issues

Unpredictable stability issue in python-timbl, lm server segfaults sometimes

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x2aab14602700 (LWP 36848)]
0x00002aaab3b839e4 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
(gdb) bt
#0 0x00002aaab3b839e4 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1 0x00002aaab3b83cc1 in std::_Rb_tree_insert_and_rebalance(bool, std::Rb_tree_node_base, std::Rb_tree_node_base, std::_Rb_tree_node_base&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2 0x00002aaab7273f8f in M_insert (__v=..., __p=0x2aaaf2a39d40, __x=, this=0xec0960) at /usr/include/c++/4.8/bits/stl_tree.h:1025
#3 M_insert_unique (__v=..., __position=..., this=0xec0960) at /usr/include/c++/4.8/bits/stl_tree.h:1482
#4 insert (__x=..., __position=..., this=0xec0960) at /usr/include/c++/4.8/bits/stl_map.h:648
#5 operator[](__k=, this=0xec0960) at /usr/include/c++/4.8/bits/stl_map.h:469
#6 TimblApiWrapper::classify3safe (this=0xec0950, line=..., normalize=, requireddepth=1 '\001') at src/timblapi.cc:122
#7 0x00002aaab72782e9 in invoke<boost::python::to_python_value<boost::python::tuple const&>, boost::python::tuple (TimblApiWrapper::*)(std::basic_string const&, bool, unsigned char), boost::python::arg_from_python<TimblApiWrapper&>, boost::python::arg_from_pythonstd::basic_string<char const&>, boost::python::arg_from_python, boost::python::arg_from_python > (ac2=..., ac1=..., ac0=..., tc=, f=

@0xe5d128: (boost::python::tuple (TimblApiWrapper::*)(TimblApiWrapper * const, const std::basic_string<char, std::char_traits<char>, std::allocator<char> > &, bool, unsigned char)) 0x2aaab7273ac0 <TimblApiWrapper::classify3safe(std::string const&, bool, unsigned char)>, rc=...) at /usr/include/boost/python/detail/invoke.hpp:88

#8 operator() (args_=, this=0xe5d128) at /usr/include/boost/python/detail/caller.hpp:223
#9 boost::python::objects::caller_py_function_impl<boost::python::detail::caller<boost::python::tuple (TimblApiWrapper::)(std::string const&, bool, unsigned char), boost::python::default_call_policies, boost::mpl::vector5<boost::python::tuple, TimblApiWrapper&, std::string const&, bool, unsigned char> > >::operator() (this=0xe5d120, args=, kw=) at /usr/include/boost/python/object/py_function.hpp:38
#10 0x00002aaab776813a in boost::python::objects::function::call(object, object) const () from /usr/lib/x86_64-linux-gnu/libboost_python-py34.so.1.54.0
#11 0x00002aaab77684a8 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_python-py34.so.1.54.0
#12 0x00002aaab7772743 in boost::python::handle_exception_impl(boost::function0) () from /usr/lib/x86_64-linux-gnu/libboost_python-py34.so.1.54.0
#13 0x00002aaab7766db3 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_python-py34.so.1.54.0
#14 0x000000000043810a in PyObject_Call ()
#15 0x0000000000579f45 in PyEval_EvalFrameEx ()
#16 0x000000000057d3d3 in PyEval_EvalCodeEx ()
#17 0x000000000057bfaa in PyEval_EvalFrameEx ()
#18 0x000000000057d3d3 in PyEval_EvalCodeEx ()
#19 0x000000000057bfaa in PyEval_EvalFrameEx ()
#20 0x000000000057c0db in PyEval_EvalFrameEx ()
#21 0x000000000057d3d3 in PyEval_EvalCodeEx ()
#22 0x000000000057df80 in ?? ()
#23 0x000000000043810a in PyObject_Call ()
#24 0x00000000004d3745 in ?? ()
#25 0x000000000043810a in PyObject_Call ()
#26 0x00000000004ab81b in ?? ()
#27 0x000000000047a7aa in ?? ()
#28 0x000000000043810a in PyObject_Call ()
#29 0x0000000000579f45 in PyEval_EvalFrameEx ()
#30 0x000000000057c0db in PyEval_EvalFrameEx ()
#31 0x000000000057d3d3 in PyEval_EvalCodeEx ()
#32 0x000000000057e0eb in ?? ()
#33 0x000000000043810a in PyObject_Call ()
#34 0x0000000000579616 in PyEval_EvalFrameEx ()
#35 0x000000000057c0db in PyEval_EvalFrameEx ()
#36 0x000000000057c0db in PyEval_EvalFrameEx ()
#37 0x000000000057d3d3 in PyEval_EvalCodeEx ()
#38 0x000000000057df80 in ?? ()
#39 0x000000000043810a in PyObject_Call ()
#40 0x00000000004d3745 in ?? ()
#41 0x000000000043810a in PyObject_Call ()
#42 0x00000000004d3360 in PyEval_CallObjectWithKeywords ()
#43 0x00000000005e7ef3 in ?? ()
#44 0x00002aaaaacd8182 in start_thread (arg=0x2aab14602700) at pthread_create.c:312
#45 0x00002aaaaafe847d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
#0 0x00002aaab3ce09e4 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1 0x00002aaab3ce0cc1 in std::_Rb_tree_insert_and_rebalance(bool, std::Rb_tree_node_base, std::Rb_tree_node_base, std::_Rb_tree_node_base&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2 0x00002aaab73d0fa0 in M_insert (__v=..., __p=0x2aab7aa39fe0, __x=, this=0xb88ea0) at /usr/include/c++/4.8/bits/stl_tree.h:1025
#3 M_insert_unique (__v=..., __position=..., this=0xb88ea0) at /usr/include/c++/4.8/bits/stl_tree.h:1482
#4 insert (__x=..., __position=..., this=0xb88ea0) at /usr/include/c++/4.8/bits/stl_map.h:648
#5 operator[](__k=, this=0xb88ea0) at /usr/include/c++/4.8/bits/stl_map.h:469
#6 TimblApiWrapper::classify3safe (this=0xb88e90, line=..., normalize=, requireddepth=1 '\001') at src/timblapi.cc:121
#7 0x00002aaab73d4d29 in invoke<boost::python::to_python_value<boost::python::tuple const&>, boost::python::tuple (TimblApiWrapper::
)(std::basic_string const&, bool, unsigned char), boost::python::arg_from_python<TimblApiWrapper&>, boost::python::arg_from_pythonstd::basic_string<char const&>, boost::python::arg_from_python, boost::python::arg_from_python > (ac2=..., ac1=..., ac0=..., tc=, f=

@0xba58c8: (boost::python::tuple (TimblApiWrapper::*)(TimblApiWrapper * const, const std::basic_string<char, std::char_traits<char>, std::allocator<char> > &, bool, unsigned char)) 0x2aaab73d0ac0 <TimblApiWrapper::classify3safe(std::string const&, bool, unsigned char)>, rc=...) at /usr/include/boost/python/detail/invoke.hpp:88

#8 operator() (args_=, this=0xba58c8) at /usr/include/boost/python/detail/caller.hpp:223
#9 boost::python::objects::caller_py_function_impl<boost::python::detail::caller<boost::python::tuple (TimblApiWrapper::*)(std::string const&, bool, unsigned char), boost::python::default_call_policies, boost::mpl::vector5<boost::python::tuple, TimblApiWrapper&, std::string const&, bool, unsigned char> > >::operator() (this=0xba58c0, args=, kw=) at /usr/include/boost/python/object/py_function.hpp:38
#10 0x00002aaab78c513a in boost::python::objects::function::call(object, object) const () from /usr/lib/x86_64-linux-gnu/libboost_python-py34.so.1.54.0
#11 0x00002aaab78c54a8 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_python-py34.so.1.54.0
#12 0x00002aaab78cf743 in boost::python::handle_exception_impl(boost::function0) () from /usr/lib/x86_64-linux-gnu/libboost_python-py34.so.1.54.0
#13 0x00002aaab78c3db3 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_python-py34.so.1.54.0
#14 0x000000000043810a in PyObject_Call ()
#15 0x0000000000579f45 in PyEval_EvalFrameEx ()
#16 0x000000000057d3d3 in PyEval_EvalCodeEx ()
#17 0x000000000057bfaa in PyEval_EvalFrameEx ()
#18 0x000000000057d3d3 in PyEval_EvalCodeEx ()
#19 0x000000000057bfaa in PyEval_EvalFrameEx ()
#20 0x000000000057c0db in PyEval_EvalFrameEx ()
#21 0x000000000057d3d3 in PyEval_EvalCodeEx ()
#22 0x000000000057df80 in ?? ()
#23 0x000000000043810a in PyObject_Call ()

---Type to continue, or q to quit---
#24 0x00000000004d3745 in ?? ()
#25 0x000000000043810a in PyObject_Call ()
#26 0x00000000004ab81b in ?? ()
#27 0x000000000047a7aa in ?? ()
#28 0x000000000043810a in PyObject_Call ()
#29 0x0000000000579f45 in PyEval_EvalFrameEx ()
#30 0x000000000057c0db in PyEval_EvalFrameEx ()
#31 0x000000000057d3d3 in PyEval_EvalCodeEx ()
#32 0x000000000057e0eb in ?? ()
#33 0x000000000043810a in PyObject_Call ()
#34 0x0000000000579616 in PyEval_EvalFrameEx ()
#35 0x000000000057c0db in PyEval_EvalFrameEx ()
#36 0x000000000057c0db in PyEval_EvalFrameEx ()
#37 0x000000000057d3d3 in PyEval_EvalCodeEx ()
#38 0x000000000057df80 in ?? ()
#39 0x000000000043810a in PyObject_Call ()
#40 0x00000000004d3745 in ?? ()
#41 0x000000000043810a in PyObject_Call ()
#42 0x00000000004d3360 in PyEval_CallObjectWithKeywords ()
#43 0x00000000005e7ef3 in ?? ()
#44 0x00002aaaaacd8182 in start_thread (arg=0x2aab1da0c700) at pthread_create.c:312
#45 0x00002aaaaafe847d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
#0 0x00002aaab3ce09e4 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#1 0x00002aaab3ce0cc1 in std::_Rb_tree_insert_and_rebalance(bool, std::Rb_tree_node_base, std::Rb_tree_node_base, std::_Rb_tree_node_base&) () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#2 0x00002aaab73d11f1 in M_insert (__v=..., __p=, __x=, this=0xec23c0) at /usr/include/c++/4.8/bits/stl_tree.h:1025
#3 M_insert_unique (__v=..., __position=..., this=0xec23c0) at /usr/include/c++/4.8/bits/stl_tree.h:1482
#4 insert (__x=..., __position=..., this=0xec23c0) at /usr/include/c++/4.8/bits/stl_map.h:648
#5 operator[](__k=@0x2aaabb1c98e8: 46912772028160, this=0xec23c0) at /usr/include/c++/4.8/bits/stl_map.h:469
#6 TimblApiWrapper::classify3safe (this=0xec23b0, line=..., normalize=, requireddepth=1 '\001') at src/timblapi.cc:124
#7 0x00002aaab73d5009 in invoke<boost::python::to_python_value<boost::python::tuple const&>, boost::python::tuple (TimblApiWrapper::*)(std::basic_string const&, bool, unsigned char), boost::python::arg_from_python<TimblApiWrapper&>, boost::python::arg_from_pythonstd::basic_string<char const&>, boost::python::arg_from_python, boost::python::arg_from_python > (ac2=..., ac1=..., ac0=..., tc=, f=

@0xeab058: (boost::python::tuple (TimblApiWrapper::*)(TimblApiWrapper * const, const std::basic_string<char, std::char_traits<char>, std::allocator<char> > &, bool, unsigned char)) 0x2aaab73d0d60 <TimblApiWrapper::classify3safe(std::string const&, bool, unsigned char)>, rc=...) at /usr/include/boost/python/detail/invoke.hpp:88

#8 operator() (args_=, this=0xeab058) at /usr/include/boost/python/detail/caller.hpp:223
#9 boost::python::objects::caller_py_function_impl<boost::python::detail::caller<boost::python::tuple (TimblApiWrapper::*)(std::string const&, bool, unsigned char), boost::python::default_call_policies, boost::mpl::vector5<boost::python::tuple, TimblApiWrapper&, std::string const&, bool, unsigned char> > >::operator() (this=0xeab050, args=, kw=) at /usr/include/boost/python/object/py_function.hpp:38
#10 0x00002aaab78c513a in boost::python::objects::function::call(object, object) const () from /usr/lib/x86_64-linux-gnu/libboost_python-py34.so.1.54.0
#11 0x00002aaab78c54a8 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_python-py34.so.1.54.0
#12 0x00002aaab78cf743 in boost::python::handle_exception_impl(boost::function0) () from /usr/lib/x86_64-linux-gnu/libboost_python-py34.so.1.54.0
#13 0x00002aaab78c3db3 in ?? () from /usr/lib/x86_64-linux-gnu/libboost_python-py34.so.1.54.0
#14 0x000000000043810a in PyObject_Call ()
#15 0x0000000000579f45 in PyEval_EvalFrameEx ()
#16 0x000000000057d3d3 in PyEval_EvalCodeEx ()
#17 0x000000000057bfaa in PyEval_EvalFrameEx ()
#18 0x000000000057d3d3 in PyEval_EvalCodeEx ()
#19 0x000000000057bfaa in PyEval_EvalFrameEx ()
#20 0x000000000057c0db in PyEval_EvalFrameEx ()
#21 0x000000000057d3d3 in PyEval_EvalCodeEx ()
#22 0x000000000057df80 in ?? ()
#23 0x000000000043810a in PyObject_Call ()

---Type to continue, or q to quit---
#24 0x00000000004d3745 in ?? ()
#25 0x000000000043810a in PyObject_Call ()
#26 0x00000000004ab81b in ?? ()
#27 0x000000000047a7aa in ?? ()
#28 0x000000000043810a in PyObject_Call ()
#29 0x0000000000579f45 in PyEval_EvalFrameEx ()
#30 0x000000000057c0db in PyEval_EvalFrameEx ()
#31 0x000000000057d3d3 in PyEval_EvalCodeEx ()
#32 0x000000000057e0eb in ?? ()
#33 0x000000000043810a in PyObject_Call ()
#34 0x0000000000579616 in PyEval_EvalFrameEx ()
#35 0x000000000057c0db in PyEval_EvalFrameEx ()
#36 0x000000000057c0db in PyEval_EvalFrameEx ()
#37 0x000000000057d3d3 in PyEval_EvalCodeEx ()
#38 0x000000000057df80 in ?? ()
#39 0x000000000043810a in PyObject_Call ()
#40 0x00000000004d3745 in ?? ()
#41 0x000000000043810a in PyObject_Call ()
#42 0x00000000004d3360 in PyEval_CallObjectWithKeywords ()
#43 0x00000000005e7ef3 in ?? ()
#44 0x00002aaaaacd8182 in start_thread (arg=0x2aaabb1cb700) at pthread_create.c:312
#45 0x00002aaaaafe847d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

15:23:05.141484 [lm](Processing word argument, features: %28'met',))
(Experiment in pool for thread 46980286211840)
15:23:05.141703 [lm](Classification took 0.0002s, unfiltered distribution size=13)
15:23:05.141788 [lm](Levenshtein filtering took 0.0s, final distribution size=0)
15:23:05.144579 [lm](Processing word ineens, features: %28'zelfde',))
(Experiment in pool for thread 46980275705600)
15:23:05.145595 [lm](Classification took 0.001s, unfiltered distribution size=3)
15:23:05.145663 [lm](Levenshtein filtering took 0.0s, final distribution size=0)
15:23:05.146149 [lm](Processing word schrok, features: %28'argument',))
(Experiment in pool for thread 46980275705600)
15:23:05.146389 [lm](Classification took 0.0002s, unfiltered distribution size=7)
15:23:05.146463 [lm](Levenshtein filtering took 0.0s, final distribution size=0)
15:23:05.149381 [lm](Processing word wakker, features: %28'ineens',))
(Experiment in pool for thread 46980271503104)
15:23:05.149768 [lm](Classification took 0.0004s, unfiltered distribution size=167)
15:23:05.149833 [lm](Levenshtein filtering took 0.0s, final distribution size=0)
15:23:05.155126 [lm](Processing word spiegel, features: %28'keek',))
(Experiment in pool for thread 46980275705600)
15:23:05.157286 [lm](Classification took 0.0022s, unfiltered distribution size=1906)
15:23:05.157616 [lm](Levenshtein filtering took 0.0003s, final distribution size=0) [297/23533]
15:23:05.161208 [lm](Processing word mezelf, features: %28'Ik',))
(Experiment in pool for thread 46980292515584)
15:23:05.161394 [lm](Classification took 0.0002s, unfiltered distribution size=2)
15:23:05.161441 [lm](Levenshtein filtering took 0.0s, final distribution size=0)
15:23:05.165963 [lm](Processing word allemaal, features: %28'het',))
(Experiment in pool for thread 46980292515584)
15:23:05.166122 [lm](Classification took 0.0002s, unfiltered distribution size=2)
15:23:05.166162 [lm](Levenshtein filtering took 0.0s, final distribution size=0)
15:23:05.661295 [lm](Processing word directeur, features: %28'Ik',))
(Creating new experiment in pool for thread 46980296718080)
15:23:05.668885 [lm](Processing word vakantie, features: %28'een',))
(Creating new experiment in pool for thread 46981240063744)
15:23:05.669847 [lm](Processing word hebben, features: %28'paar',))
(Creating new experiment in pool for thread 46981242164992)
15:23:05.678724 [lm](Processing word iedereen, features: %28'ik',))
(Creating new experiment in pool for thread 46981244266240)
15:23:05.694917 [lm](Processing word roosters, features: %28'zou',))
(Creating new experiment in pool for thread 46981246367488)
15:23:05.696392 [lm](Processing word veranderen, features: %28'ook',))
(Creating new experiment in pool for thread 46981248468736)
15:23:05.708978 [lm](Processing word leerlingen, features: %28'dat',))
(Creating new experiment in pool for thread 46981250569984)
15:23:05.720483 [lm](Processing word hebben, features: %28'zijn',))
(Creating new experiment in pool for thread 46981252671232)
15:23:05.729281 [lm](Processing word slechte, features: %28'we',))
(Creating new experiment in pool for thread 46981254772480)
15:23:05.731178 [lm](Processing word roosters, features: %28'hebben',))
(Creating new experiment in pool for thread 46981256873728)

Python-Ucto Issue In Gecco installation

Collecting python-ucto>=0.2.2 (from gecco)
  Using cached https://files.pythonhosted.org/packages/a2/fc/ef282871c5ba74aab6d6ba7edd1fb280a9dacdc3c1069467a2c0fe9c354c/python-ucto-0.4.7.tar.gz
Requirement already satisfied: colibricore>=2.4 in ./Ib_LaMachine/lib/python3.6/site-packages (from gecco) (2.5.0)
Requirement already satisfied: pynlpl>=0.7.9 in ./Ib_LaMachine/lib/python3.6/site-packages (from gecco) (1.2.9)
Requirement already satisfied: python3-timbl in ./Ib_LaMachine/lib/python3.6/site-packages (from gecco) (2018.4.23)
Requirement already satisfied: lxml>=2.2 in ./Ib_LaMachine/lib/python3.6/site-packages (from gecco) (4.3.3)
Requirement already satisfied: pyyaml in ./Ib_LaMachine/lib/python3.6/site-packages (from gecco) (5.1)
Requirement already satisfied: python-Levenshtein in ./Ib_LaMachine/lib/python3.6/site-packages (from gecco) (0.12.0)
Requirement already satisfied: psutil in ./Ib_LaMachine/lib/python3.6/site-packages (from gecco) (5.6.1)
Requirement already satisfied: Cython in ./Ib_LaMachine/lib/python3.6/site-packages (from python-ucto>=0.2.2->gecco) (0.29.7)
Requirement already satisfied: httplib2>=0.6 in ./Ib_LaMachine/lib/python3.6/site-packages (from pynlpl>=0.7.9->gecco) (0.12.1)
Requirement already satisfied: rdflib in ./Ib_LaMachine/lib/python3.6/site-packages (from pynlpl>=0.7.9->gecco) (4.2.2)
Requirement already satisfied: setuptools in ./Ib_LaMachine/lib/python3.6/site-packages (from python-Levenshtein->gecco) (41.0.0)
Requirement already satisfied: isodate in ./Ib_LaMachine/lib/python3.6/site-packages (from rdflib->pynlpl>=0.7.9->gecco) (0.6.0)
Requirement already satisfied: pyparsing in ./Ib_LaMachine/lib/python3.6/site-packages (from rdflib->pynlpl>=0.7.9->gecco) (2.4.0)
Requirement already satisfied: six in ./Ib_LaMachine/lib/python3.6/site-packages (from isodate->rdflib->pynlpl>=0.7.9->gecco) (1.12.0)
Building wheels for collected packages: python-ucto
  Building wheel for python-ucto (setup.py) ... error
  Complete output from command /root/ibrahim/Ib_LaMachine/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-w14xvfbg/python-ucto/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" bdist_wheel -d /tmp/pip-wheel-kh08xlv4 --python-tag cp36:
  running bdist_wheel
  running build
  running build_ext
  cythoning ucto_wrapper.pyx to ucto_wrapper.cpp
  /root/ibrahim/Ib_LaMachine/lib64/python3.6/site-packages/Cython/Compiler/Main.py:367: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-install-w14xvfbg/python-ucto/ucto_wrapper.pyx
    tree = Parsing.p_module(s, pxd, full_module_name)
  building 'ucto' extension
  creating build
  creating build/temp.linux-x86_64-3.6
  gcc -pthread -Wno-unused-result -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/root/ibrahim/Ib_LaMachine/include -I/usr/include/ -I/usr/include/libxml2 -I/usr/local/include/ -I/root/ibrahim/Ib_LaMachine/include -I/usr/include/python3.6m -c ucto_wrapper.cpp -o build/temp.linux-x86_64-3.6/ucto_wrapper.o --std=c++0x -D U_USING_ICU_NAMESPACE=1
  ucto_wrapper.cpp: In function 'PyObject* __pyx_pf_4ucto_9Tokenizer_4process(__pyx_obj_4ucto_Tokenizer*, PyObject*)':
  ucto_wrapper.cpp:3407:77: error: invalid use of void expression
     __pyx_t_1 = __Pyx_PyInt_From_int(__pyx_v_self->tok.tokenizeLine(__pyx_t_2)); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 109, __pyx_L1_error)
                                                                               ^
  In file included from ucto_wrapper.cpp:616:0:
  /usr/local/include/ucto/tokenize.h: In function 'PyObject* __pyx_gb_4ucto_9Tokenizer_8generator(__pyx_CoroutineObject*, PyThreadState*, PyObject*)':
  /usr/local/include/ucto/tokenize.h:284:9: error: 'int Tokenizer::TokenizerClass::flushSentences(int, const string&)' is private
       int flushSentences( int, const std::string& = "default" );
           ^
  ucto_wrapper.cpp:3545:98: error: within this context
     (void)(__pyx_cur_scope->__pyx_v_self->tok.flushSentences(__pyx_cur_scope->__pyx_v_sentencecount));
                                                                                                    ^
  In file included from ucto_wrapper.cpp:616:0:
  /usr/local/include/ucto/tokenize.h: In function 'PyObject* __pyx_gb_4ucto_9Tokenizer_11generator1(__pyx_CoroutineObject*, PyThreadState*, PyObject*)':
  /usr/local/include/ucto/tokenize.h:282:9: error: 'int Tokenizer::TokenizerClass::countSentences(bool)' is private
       int countSentences(bool forceentirebuffer = false);
           ^
  ucto_wrapper.cpp:3720:83: error: within this context
     __pyx_cur_scope->__pyx_v_c = __pyx_cur_scope->__pyx_v_self->tok.countSentences(1);
                                                                                     ^
  ucto_wrapper.cpp:3741:69: error: 'class Tokenizer::TokenizerClass' has no member named 'getSentence'
       __pyx_cur_scope->__pyx_v_v = __pyx_cur_scope->__pyx_v_self->tok.getSentence(__pyx_cur_scope->__pyx_v_i);
                                                                       ^
  In file included from ucto_wrapper.cpp:616:0:
  /usr/local/include/ucto/tokenize.h:284:9: error: 'int Tokenizer::TokenizerClass::flushSentences(int, const string&)' is private
       int flushSentences( int, const std::string& = "default" );
           ^
  ucto_wrapper.cpp:3919:86: error: within this context
     (void)(__pyx_cur_scope->__pyx_v_self->tok.flushSentences(__pyx_cur_scope->__pyx_v_c));
                                                                                        ^
  error: command 'gcc' failed with exit status 1
  
  ----------------------------------------
  Failed building wheel for python-ucto
  Running setup.py clean for python-ucto
Failed to build python-ucto
Installing collected packages: python-ucto, gecco
  Running setup.py install for python-ucto ... error
    Complete output from command /root/ibrahim/Ib_LaMachine/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-w14xvfbg/python-ucto/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-yxhr80dk/install-record.txt --single-version-externally-managed --compile --install-headers /root/ibrahim/Ib_LaMachine/include/site/python3.6/python-ucto:
    running install
    running build
    running build_ext
    cythoning ucto_wrapper.pyx to ucto_wrapper.cpp
    /root/ibrahim/Ib_LaMachine/lib64/python3.6/site-packages/Cython/Compiler/Main.py:367: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-install-w14xvfbg/python-ucto/ucto_wrapper.pyx
      tree = Parsing.p_module(s, pxd, full_module_name)
    building 'ucto' extension
    creating build
    creating build/temp.linux-x86_64-3.6
    gcc -pthread -Wno-unused-result -Wsign-compare -DDYNAMIC_ANNOTATIONS_ENABLED=1 -DNDEBUG -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic -D_GNU_SOURCE -fPIC -fwrapv -fPIC -I/root/ibrahim/Ib_LaMachine/include -I/usr/include/ -I/usr/include/libxml2 -I/usr/local/include/ -I/root/ibrahim/Ib_LaMachine/include -I/usr/include/python3.6m -c ucto_wrapper.cpp -o build/temp.linux-x86_64-3.6/ucto_wrapper.o --std=c++0x -D U_USING_ICU_NAMESPACE=1
    ucto_wrapper.cpp: In function 'PyObject* __pyx_pf_4ucto_9Tokenizer_4process(__pyx_obj_4ucto_Tokenizer*, PyObject*)':
    ucto_wrapper.cpp:3407:77: error: invalid use of void expression
       __pyx_t_1 = __Pyx_PyInt_From_int(__pyx_v_self->tok.tokenizeLine(__pyx_t_2)); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 109, __pyx_L1_error)
                                                                                 ^
    In file included from ucto_wrapper.cpp:616:0:
    /usr/local/include/ucto/tokenize.h: In function 'PyObject* __pyx_gb_4ucto_9Tokenizer_8generator(__pyx_CoroutineObject*, PyThreadState*, PyObject*)':
    /usr/local/include/ucto/tokenize.h:284:9: error: 'int Tokenizer::TokenizerClass::flushSentences(int, const string&)' is private
         int flushSentences( int, const std::string& = "default" );
             ^
    ucto_wrapper.cpp:3545:98: error: within this context
       (void)(__pyx_cur_scope->__pyx_v_self->tok.flushSentences(__pyx_cur_scope->__pyx_v_sentencecount));
                                                                                                      ^
    In file included from ucto_wrapper.cpp:616:0:
    /usr/local/include/ucto/tokenize.h: In function 'PyObject* __pyx_gb_4ucto_9Tokenizer_11generator1(__pyx_CoroutineObject*, PyThreadState*, PyObject*)':
    /usr/local/include/ucto/tokenize.h:282:9: error: 'int Tokenizer::TokenizerClass::countSentences(bool)' is private
         int countSentences(bool forceentirebuffer = false);
             ^
    ucto_wrapper.cpp:3720:83: error: within this context
       __pyx_cur_scope->__pyx_v_c = __pyx_cur_scope->__pyx_v_self->tok.countSentences(1);
                                                                                       ^
    ucto_wrapper.cpp:3741:69: error: 'class Tokenizer::TokenizerClass' has no member named 'getSentence'
         __pyx_cur_scope->__pyx_v_v = __pyx_cur_scope->__pyx_v_self->tok.getSentence(__pyx_cur_scope->__pyx_v_i);
                                                                         ^
    In file included from ucto_wrapper.cpp:616:0:
    /usr/local/include/ucto/tokenize.h:284:9: error: 'int Tokenizer::TokenizerClass::flushSentences(int, const string&)' is private
         int flushSentences( int, const std::string& = "default" );
             ^
    ucto_wrapper.cpp:3919:86: error: within this context
       (void)(__pyx_cur_scope->__pyx_v_self->tok.flushSentences(__pyx_cur_scope->__pyx_v_c));
                                                                                          ^
    error: command 'gcc' failed with exit status 1
    
    ----------------------------------------
Command "/root/ibrahim/Ib_LaMachine/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-install-w14xvfbg/python-ucto/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-yxhr80dk/install-record.txt --single-version-externally-managed --compile --install-headers /root/ibrahim/Ib_LaMachine/include/site/python3.6/python-ucto" failed with error code 1 in /tmp/pip-install-w14xvfbg/python-ucto/

I tried to solve the wheel issue by updating packages as suggested in multiple threads, and I also updated gcc, but the problem persists. FYI, I am trying to get Gecco working on a server, so I cannot use the LaMachine distribution as it won't install as root; I am using pip install gecco.

Implement CLAM webservice support

Add 'buildwebservice' and 'runwebservice' commands to gecco. The former should generate a CLAM wrapper that invokes gecco run; the latter should start the gecco module servers and CLAM (development server or uwsgi).

Implement module arbitrator

When two or more modules come up with conflicting suggestions, an arbitrator can clean up the results according to a set of rules.
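
A minimal sketch of what such an arbitrator could look like, assuming each suggestion is tagged with the id of the module that produced it and modules are ranked by a configurable priority (all names and data below are hypothetical):

# Hypothetical sketch of a rule-based arbitrator, not part of Gecco
def arbitrate(suggestions, priority):
    """suggestions: list of (module_id, word_id, replacement) tuples.
    priority: list of module ids, most trusted first.
    Keeps only the highest-priority suggestion per word."""
    rank = {module_id: i for i, module_id in enumerate(priority)}
    best = {}
    for module_id, word_id, replacement in suggestions:
        current = best.get(word_id)
        if current is None or rank.get(module_id, len(rank)) < rank.get(current[0], len(rank)):
            best[word_id] = (module_id, replacement)
    return best

conflicting = [
    ("hunspell", "untitled.p.1.s.1.w.3", "than"),
    ("confusible_then_than", "untitled.p.1.s.1.w.3", "then"),
]
print(arbitrate(conflicting, priority=["confusible_then_than", "hunspell"]))
# {'untitled.p.1.s.1.w.3': ('confusible_then_than', 'then')}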

Threading too slow

The current threading model is too slow and not concurrent enough, in part due to Python's global interpreter lock. Migrate to a full multiprocessing solution: hold the FoLiA document in one thread, collect FQL queries, and execute them sequentially in that thread. Have other threads communicate concurrently with the various servers.
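
Conceptually, the proposed model could look like the sketch below: worker threads talk to the module servers and push the resulting edit queries onto a queue, while a single thread owns the document and applies them sequentially. The module calls are stubbed out and hypothetical, and the real fix may use multiprocessing rather than threads, but the queue pattern is the same.

# Conceptual sketch of the proposed model; module calls are stubbed and hypothetical
import queue
import threading

edit_queue = queue.Queue()
SENTINEL = object()

def module_worker(module_id, words):
    """Stand-in for a thread that queries a remote module server."""
    for word in words:
        # in reality: send the word (plus context) to the module's server
        edit_queue.put((module_id, word, word.upper()))   # dummy "correction"
    edit_queue.put(SENTINEL)

def document_thread(nworkers):
    """The only thread that touches the document: applies queries sequentially."""
    finished = 0
    while finished < nworkers:
        item = edit_queue.get()
        if item is SENTINEL:
            finished += 1
            continue
        module_id, word, correction = item
        print(f"[{module_id}] would correct {word!r} -> {correction!r}")

workers = [threading.Thread(target=module_worker, args=(mid, ["then", "than"]))
           for mid in ("confusible", "hunspell")]
consumer = threading.Thread(target=document_thread, args=(len(workers),))
consumer.start()
for w in workers:
    w.start()
for w in workers:
    w.join()
consumer.join()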

Setup.py cannot install python-timbl

(I'm just putting this here so I don't forget)

Not sure if it was supposed to be working already, but when running setup.py in a virtualenv, I got this error:

Installed /home/wstoop/gecco/lib/python3.2/site-packages/Gecco-0.1-py3.2.egg
Processing dependencies for Gecco==0.1
Searching for python3-timbl
Reading http://pypi.python.org/simple/python3-timbl/
Best match: python3-timbl 2013.03.29-1
Downloading https://pypi.python.org/packages/source/p/python3-timbl/python3-timbl-2013.03.29-1.tar.gz#md5=e8b1d10bf8e4893445bf8ce4fc6071f7
Processing python3-timbl-2013.03.29-1.tar.gz
error: Couldn't find a setup script in /tmp/easy_install-8edeov/python3-timbl-2013.03.29-1.tar.gz

It's related to the fact that python-timbl uses setup3.py.

Processes don't exit when threads execute in wrong order (finalising before threads ready)

Gecco processes don't exit properly in certain cases. Output shows "finalising" before "threads ready" in all these cases. Race condition somewhere?

Example log output:

14:29:52.544599 Loading local modules
14:29:52.544660 Modules loaded (0.031049013137817383s)
14:29:52.553045 Starting Tokeniser
14:29:52.732148 Tokeniser finished
14:29:52.732229 Reading FoLiA document
14:29:53.450088 Initialising modules on document
14:29:53.450973         Initialising module errorlist
14:29:53.451012         Initialising module hunspell
14:29:53.451036         Initialising module runon
14:29:53.451056         Initialising module splits
14:29:53.451075         Initialising module confusible_zei_zij
14:29:53.451095         Initialising module confusible_nog_noch
14:29:53.451113         Initialising module confusible_hard_hart
14:29:53.451130         Initialising module confusible_licht_ligt
14:29:53.451147         Initialising module confusible_grootte_grote
14:29:53.451164         Initialising module confusible_deze_dit
14:29:53.451181         Initialising module confusible_de_het
14:29:53.451198         Initialising module confusible_u_uw
14:29:53.451215         Initialising module confusible_kan_ken
14:29:53.451232         Initialising module confusible_me_mijn
14:29:53.451249         Initialising module confusible_word_wordt
14:29:53.451267         Initialising module confusible_hun_zij
14:29:53.451284         Initialising module confusiblesuffix_d_dt
14:29:53.451306         Preparing input of Word
14:29:54.305952 Input ready (0.8558549880981445s)
14:29:54.319317 Processing modules
14:29:54.320611 Processing output...
14:29:54.631710 [hunspell] Adding correction for untitled.p.3.s.1.w.1
14:29:54.836819 [hunspell] Adding correction for untitled.p.6.s.2.w.16
14:29:57.991656 [hunspell] Adding correction for untitled.p.6.s.16.w.15
14:29:58.867755 [hunspell] Adding correction for untitled.p.9.s.2.w.31
14:29:58.869521 [hunspell] Adding correction for untitled.p.9.s.5.w.4
14:29:58.870426 [hunspell] Adding correction for untitled.p.9.s.2.w.15
14:29:59.709621 [hunspell] Adding correction for untitled.p.9.s.10.w.18
14:29:59.740576 [hunspell] Adding correction for untitled.p.10.s.1.w.22
14:30:00.726238 [hunspell] Adding correction for untitled.p.10.s.1.w.36
14:30:01.106829 [hunspell] Adding correction for untitled.p.10.s.1.w.7
14:30:01.110097 [hunspell] Adding correction for untitled.p.10.s.1.w.10
14:30:01.111579 [hunspell] Adding correction for untitled.p.10.s.1.w.34
14:30:01.112581 [hunspell] Adding correction for untitled.p.10.s.1.w.24
14:30:02.708256 [hunspell] Adding correction for untitled.p.13.s.12.w.20
14:30:02.710925 [hunspell] Adding correction for untitled.p.13.s.12.w.17
14:30:02.883386 [hunspell] Adding correction for untitled.p.13.s.21.w.36
14:30:05.040993 [hunspell] Adding correction for untitled.p.13.s.21.w.30
14:30:05.278094 [hunspell] Adding correction for untitled.p.13.s.28.w.11
14:30:05.337748 [confusible_de_het] Adding correction for untitled.p.13.s.35.w.20
14:30:05.666915 [hunspell] Adding correction for untitled.p.15.s.5.w.20
14:30:05.734413 [confusible_de_het] Adding correction for untitled.p.15.s.15.w.3
14:30:07.277669 [confusible_hun_zij] Adding correction for untitled.p.18.s.8.w.2
14:30:08.570857 [hunspell] Adding correction for untitled.p.18.s.19.w.2
14:30:09.060058 [hunspell] Adding correction for untitled.p.20.s.23.w.19
14:30:09.799702 [hunspell] Adding correction for untitled.p.20.s.32.w.27
14:30:09.898175 [confusible_deze_dit] Adding correction for untitled.p.22.s.4.w.13
14:30:10.724452 [hunspell] Adding correction for untitled.p.23.s.3.w.19
14:30:10.769361 [hunspell] Adding correction for untitled.p.23.s.12.w.16
14:30:10.870364 [hunspell] Adding correction for untitled.p.23.s.12.w.18
14:30:10.927537 [confusible_hun_zij] Adding correction for untitled.p.23.s.19.w.3
14:30:10.965381 [hunspell] Adding correction for untitled.p.23.s.27.w.3
14:30:11.128947 [hunspell] Adding correction for untitled.p.24.s.4.w.8
14:30:11.130733 Finalising modules on document
14:30:11.131760 Saving document /scratch2/www/webservices-lst/live/writable/valkuil/projects/internal/Dd7dfd93a0da756482b98c7ceecfd1bad/output//Dd7dfd93a0da756482b98c7ceecfd1bad.xml....
14:29:54.336508 12 threads ready.
14:30:11.129039 Input queue processed (16.809709310531616s)
        hunspell        42.5152s        261 calls       27 corrections
        runon   0.1869s 537 calls       0 corrections
        splits  0.1708s 572 calls       2 corrections
        errorlist       0.1447s 538 calls       0 corrections
        confusible_de_het       0.038s  47 calls        2 corrections
        confusiblesuffix_d_dt   0.0118s 9 calls 0 corrections
        confusible_nog_noch     0.004s  2 calls 0 corrections
        confusible_licht_ligt   0.002s  1 calls 0 corrections
        confusible_kan_ken      0.0017s 1 calls 0 corrections
        confusible_deze_dit     0.0013s 1 calls 1 corrections
14:30:12.137501 Processing done (real total 17.75s , virtual output 43.07635712623596s, real input 16.809709310531616s)
