spacy_kenlm: KenLM extension for spaCy 2.0

This package adds KenLM language model scoring to spaCy 2.0 as a pipeline extension.

Usage

Train a KenLM language model first (or use the test model from test.arpa).
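As a rough sketch of the training step, the snippet below wraps KenLM's `lmplz` estimation tool, which reads a tokenized corpus on stdin and writes an ARPA file to stdout. It assumes `lmplz` has been built and is on your PATH; the helper names `lmplz_command` and `train_arpa` and the file paths are hypothetical and not part of this package.

```python
import subprocess


def lmplz_command(order=3, lmplz="lmplz"):
    """Build the argument list for KenLM's lmplz tool (assumed to be on PATH)."""
    return [lmplz, "-o", str(order)]


def train_arpa(corpus_path, arpa_path, order=3, lmplz="lmplz"):
    """Estimate an n-gram model: corpus text goes to stdin, ARPA comes from stdout."""
    with open(corpus_path, "rb") as corpus, open(arpa_path, "wb") as arpa:
        subprocess.run(
            lmplz_command(order, lmplz),
            stdin=corpus,
            stdout=arpa,
            check=True,  # raise CalledProcessError if lmplz exits non-zero
        )


# Hypothetical usage (paths are placeholders):
# train_arpa("corpus.txt", "model.arpa", order=3)
```

The corpus is expected to have one sentence per line, already tokenized the way you intend to score text later.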

Add the spaCyKenLM component to the spaCy pipeline. It exposes a kenlm_score attribute on docs, spans, and tokens.

import spacy
from spacy_kenlm import spaCyKenLM

nlp = spacy.load('en_core_web_sm')

spacy_kenlm = spaCyKenLM()  # default model from test.arpa

nlp.add_pipe(spacy_kenlm)

doc = nlp('How are you?')

# doc score
doc._.kenlm_score

# span score
doc[:2]._.kenlm_score

# token score
doc[2]._.kenlm_score
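Raw scores are hard to compare across texts of different lengths. Assuming kenlm_score is a total log10 probability (which is what the `kenlm` Python module's `Model.score` returns; this README does not confirm it for the extension), a per-token perplexity can be sketched as follows. The helper name `perplexity` and the example numbers are hypothetical.

```python
def perplexity(log10_score, n_tokens):
    """Convert a total log10 probability into per-token perplexity.

    Assumes log10_score is a sum of log10 token probabilities.
    """
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    return 10.0 ** (-log10_score / n_tokens)


# Hypothetical values standing in for doc._.kenlm_score and a token count.
example_score = -6.0
example_tokens = 3
print(perplexity(example_score, example_tokens))  # 100.0
```

Whether the token count should include an end-of-sentence marker depends on how the score was computed, so treat the normalization as an assumption to check against your model.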

Installation

Install with pip.

pip install spacy_kenlm

spacy_kenlm's People

Contributors

tokestermw

spacy_kenlm's Issues

install failure

I'm trying to install via pip on OSX 10.14 with Python 3.6 and I am getting the following error:

Installing collected packages: kenlm, spacy-kenlm
Running setup.py install for kenlm ... error
Complete output from command ~/venv/bin/python -u -c "import setuptools, tokenize;__file__='/private/tmp/pip-install-zq5mz13x/kenlm/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /private/tmp/pip-record-v10jfj37/install-record.txt --single-version-externally-managed --compile --install-headers ~/venv/include/site/python3.6/kenlm:
running install
running build
running build_ext
building 'kenlm' extension
creating build/temp.macosx-10.6-intel-3.6
creating build/temp.macosx-10.6-intel-3.6/util
creating build/temp.macosx-10.6-intel-3.6/lm
creating build/temp.macosx-10.6-intel-3.6/util/double-conversion
creating build/temp.macosx-10.6-intel-3.6/python
/usr/bin/clang -fno-strict-aliasing -Wsign-compare -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -arch i386 -arch x86_64 -g -I. -I ~/venv/include -I/Library/Frameworks/Python.framework/Versions/3.6/include/python3.6m -c util/pool.cc -o build/temp.macosx-10.6-intel-3.6/util/pool.o -O3 -DNDEBUG -DKENLM_MAX_ORDER=6 -std=c++11 -DHAVE_ZLIB -DHAVE_BZLIB -DHAVE_XZLIB
warning: include path for stdlibc++ headers not found; pass '-std=libc++' on the command line to use the libc++ standard library instead [-Wstdlibcxx-not-found]
In file included from util/pool.cc:1:
./util/pool.hh:4:10: fatal error: 'cassert' file not found
#include <cassert>
^~~~~~~~~
1 warning and 1 error generated.
error: command '/usr/bin/clang' failed with exit status 1

----------------------------------------

Command "~/venv/bin/python -u -c "import setuptools, tokenize;__file__='/private/tmp/pip-install-zq5mz13x/kenlm/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /private/tmp/pip-record-v10jfj37/install-record.txt --single-version-externally-managed --compile --install-headers ~/venv/include/site/python3.6/kenlm" failed with error code 1 in /private/tmp/pip-install-zq5mz13x/kenlm/
