swig-srilm's Introduction

SWIG-SRILM: A SWIG-based wrapper for an SRILM language model
==========

Description

This package contains files to generate Perl and Python wrappers for SRILM language models.

Requirements

  • GNU make
  • Simplified Wrapper & Interface Generator (SWIG)
  • A local Python and/or Perl installation
  • The SRILM toolkit (v1.7.1). If you have an older version of SRILM, e.g., the 1.5.x series, use the old_srilm branch. SRILM must be compiled as position-independent code, which you can do by running MAKE_PIC=yes make when compiling SRILM.

Installation:

Linux

  • Modify the following environment variables at the top of the included Makefile:

  • SRILM_LIBS : The directory containing the SRILM libraries

  • SRILM_INC : The directory containing the SRILM header files

  • PYTHON_INC : The directory containing the Python header files

  • PERL_INC : The directory containing the Perl header files

  • To create a Python module, run 'make python' in this directory. Copy _srilm.so and srilm.py to the directory where you want to use the Python module. You can run the included test.py script to check whether the compiled module works correctly. The output of test.py should be the following:

1. Number of n-grams:
   There are 11868 unigrams in this LM
   There are 59481 bigrams in this LM
   There are 16744 trigrams in this LM
   There are 13787 4-grams in this LM
   There are 12082 5-grams in this LM

2. N-gram log probabilities:
   p('good') = -3.49373698235
   p('of the') = -0.558740794659
   p('nitin madnani') = -99.0
   p('there are some') = -0.985605716705
   p('do more about your') = -0.469523012638
   p('or whatever has yet to') = -0.53226429224

3. Sentence log probabilities and perplexities:
   p('there are some good') = -9.85836982727
   ppl('there are some good') = 93.6858444214

4. OOvs:
   nOOVs('there are some foobar') = 1

5. Corpus log probabilties and perplexities:
   Logprob for the file test.txt = -33.6016654968
   Perplexity for the file test.txt = 94.7476806641
  • To create a Perl module, run make perl in this directory. Copy srilm.so and srilm.pm to the directory of your choice. Run the included Perl script 'test.pl' to test whether the compiled module works correctly. The output should be the same as above.
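
The perplexity in section 3 of the expected output follows from the log probability on the line before it. Below is a minimal sketch, assuming SRILM's usual definition: the base-10 log probability is normalised by the number of scored tokens, i.e. words minus OOVs plus one end-of-sentence token per sentence. The function name perplexity and its parameters are illustrative and not part of the wrapper.

```python
def perplexity(logprob, n_words, n_oovs=0, n_sentences=1):
    """Convert a base-10 log probability into a perplexity.

    Assumes ppl = 10 ** (-logprob / (words - OOVs + sentences)).
    """
    n_tokens = n_words - n_oovs + n_sentences
    if n_tokens <= 0:
        raise ValueError("no scored tokens")
    return 10 ** (-logprob / n_tokens)


# 'there are some good': 4 words, no OOVs, 1 sentence.
# The result should be close to the 93.6858444214 reported by test.py.
print(perplexity(-9.85836982727, n_words=4))
```

Under the same assumption, the corpus figures for test.txt imply 17 scored tokens, but the split between words and sentences cannot be recovered from the output alone.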

Mac OS X

Note: This has only been tested on OS X El Capitan and only with the built-in versions of Python (2.7.10) and Perl (5.18).

  • Check out the macosx branch.

  • Make sure you have compiled the SRILM libraries (MAKE_PIC=yes make).

  • Go to the directory containing the SRILM header files ($SRILM/include), open File.h, and comment out the line that says #include "zio.h". SRILM is supposed to rename the zopen() function to my_zopen() on OS X, since zlib is installed by default, but this does not seem to work, so this hacky workaround is necessary.

  • Modify the following environment variables at the top of Makefile.osx:

  • SRILM_LIBS : The directory containing the SRILM libraries

  • SRILM_INC : The directory containing the SRILM header files

    IMPORTANT: DO NOT change the PYTHON_INC and PERL_INC variables as they are set to be the default values for OS X El Capitan.

  • To compile the Python module, run make -f Makefile.osx python. To compile the Perl module, run make -f Makefile.osx perl. Note that the compiled modules will only work with the default OS X Python and Perl interpreters, i.e., /usr/bin/python and /usr/bin/perl.

  • You should be able to run /usr/bin/python test.py and /usr/bin/perl test.pl to test that the modules work and obtain the same output as in the Linux case.

Usage:

Usage is clearly illustrated in files test.pl and test.py.

swig-srilm's People

Contributors

bitdeli-chef, desilinguist

swig-srilm's Issues

make perl error

Some information:

  • GLIBC version:2.21
  • Kernel version: 3.19.0-18-generic
  • SRILM version: 1.7.1

I'm trying to build the Perl module in swig-srilm.
I followed the README: I completed all the previous steps and changed the variable paths in the Makefile, but when I run "make perl" I keep getting this error:

g++ -shared srilm.o srilm_perl_wrap.o -loolm -ldstruct -lmisc -lz -lgomp -L/home/farid/Documents/6to/NLP/SRLIM/lib/i686-m64/ -o srilm.so
/usr/bin/ld: /home/farid/Documents/6to/NLP/SRLIM/lib/i686-m64//liboolm.a(Prob.o): relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC
/home/farid/Documents/6to/NLP/SRLIM/lib/i686-m64//liboolm.a: error adding symbols: Bad value
collect2: error: ld returned 1 exit status
Makefile:20: recipe for target 'srilm.so' failed
make: *** [srilm.so] Error 1

I already tried adding -fPIC, but the error is still there. I also tried removing the -shared flag from the command, but then I get a different error:

g++ srilm.o srilm_perl_wrap.o -loolm -ldstruct -lmisc -lz -lgomp -L/home/farid/Documents/6to/NLP/SRLIM/lib/i686-m64/ -o srilm.so
/usr/bin/ld: /home/farid/Documents/6to/NLP/SRLIM/lib/i686-m64//liboolm.a(Vocab.o): undefined reference to symbol 'pthread_key_delete@@GLIBC_2.2.5'
/lib/x86_64-linux-gnu/libpthread.so.0: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
Makefile:20: recipe for target 'srilm.so' failed
make: *** [srilm.so] Error 1

My Makefile:

SRILM_LIBS=/home/farid/Documents/6to/NLP/SRLIM/lib/i686-m64/
SRILM_INC=/home/farid/Documents/6to/NLP/SRLIM/include/
PYTHON_INC=/opt/python/2.7/include/python2.7
PERL_INC=/usr/lib/x86_64-linux-gnu/perl/5.20.2/CORE/

python: clean _srilm.so

_srilm.so: srilm.o srilm_python_wrap.o
    g++ -shared $^ -loolm -ldstruct -lmisc -lz -lgomp -L$(SRILM_LIBS) -o $@

srilm_python_wrap.o: srilm_python_wrap.c
    g++ -c -fpic $< -I/usr/local/include/ -I$(SRILM_INC) -I$(PYTHON_INC)

srilm_python_wrap.c: srilm_python.i
    swig -python $<

perl: clean srilm.so

srilm.so: srilm.o srilm_perl_wrap.o
    g++ -shared $^ -loolm -ldstruct -lmisc -lz -lgomp -L$(SRILM_LIBS) -o $@

srilm_perl_wrap.o: srilm_perl_wrap.c
    g++ -c -fpic $< -I/usr/local/include/ -I$(SRILM_INC) -I$(PERL_INC)

srilm_perl_wrap.c: srilm_perl.i
    swig -perl $<

srilm.o: srilm.c
    g++ -c -fpic $< -I/usr/local/include/ -I$(SRILM_INC) -I$(PYTHON_INC)

clean:
    \rm -fr srilm.o srilm_*_wrap.* *.so srilm.py* srilm.pm
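
Before running make, it can help to confirm that the directory assigned to SRILM_LIBS actually contains the static libraries named on the link line (-loolm -ldstruct -lmisc). The sketch below is a hypothetical helper, not part of swig-srilm. It only checks that the files exist and cannot tell whether they were built as position-independent code. The -lz and -lgomp libraries are system libraries and are not checked.

```python
import os

# Static SRILM libraries named on the link line (-loolm -ldstruct -lmisc).
REQUIRED_LIBS = ("liboolm.a", "libdstruct.a", "libmisc.a")


def missing_srilm_libs(lib_dir):
    """Return the required SRILM static libraries not found in lib_dir."""
    return [name for name in REQUIRED_LIBS
            if not os.path.isfile(os.path.join(lib_dir, name))]
```

Calling missing_srilm_libs() with the value of SRILM_LIBS returns an empty list when all three archives are present, and the names of the missing ones otherwise.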

Installing on OSX El Capitan

I'm trying to install the wrapper but I keep getting multiple errors.

I've tried with the GCC shipped with OS X and with GCC installed using Homebrew (versions 4.9 and 5).

Any other suggestions?

/Users/zeerakw/Masters/Projects/Haiku_Project/System/lib/srilm/include/zio.h:105:49: error: conflicting declaration of C function 'FILE* zopen(const char*, const char*)'
FILE * zopen (const char *name, const char *mode);
^
In file included from /usr/include/wchar.h:90:0,
                 from /usr/local/Cellar/gcc49/4.9.3/include/c++/4.9.3/cwchar:44,
                 from /usr/local/Cellar/gcc49/4.9.3/include/c++/4.9.3/bits/postypes.h:40,
                 from /usr/local/Cellar/gcc49/4.9.3/include/c++/4.9.3/iosfwd:40,
                 from /usr/local/Cellar/gcc49/4.9.3/include/c++/4.9.3/ios:38,
                 from /usr/local/Cellar/gcc49/4.9.3/include/c++/4.9.3/ostream:38,
                 from /usr/local/Cellar/gcc49/4.9.3/include/c++/4.9.3/iostream:39,
                 from /Users/zeerakw/Masters/Projects/Haiku_Project/System/lib/srilm/include/Counts.h:17,
                 from /Users/zeerakw/Masters/Projects/Haiku_Project/System/lib/srilm/include/Prob.h:20,
                 from srilm.c:1:
/usr/include/stdio.h:463:7: note: previous declaration 'FILE* zopen(const char*, const char*, int)'
FILE *zopen(const char *, const char *, int);
^
srilm.c: In function 'float getBigramProb(Ngram*, const char*)':
srilm.c:88:27: error: 'strdupa' was not declared in this scope
scp = strdupa(ngramstr);
^
srilm.c: In function 'float getTrigramProb(Ngram*, const char*)':
srilm.c:122:27: error: 'strdupa' was not declared in this scope
scp = strdupa(ngramstr);
^
srilm.c: In function 'float getNgramProb(Ngram*, const char*, unsigned int)':
srilm.c:151:27: error: 'strdupa' was not declared in this scope
scp = strdupa(ngramstr);
^
srilm.c: In function 'unsigned int sentenceStats(Ngram*, const char*, unsigned int, TextStats&)':
srilm.c:191:27: error: 'strdupa' was not declared in this scope
scp = strdupa(sentence);
^
make: *** [srilm.o] Error 1

Mac OS X Mavericks Installation Error

⚡ make
...
clang: warning: treating 'c' input as 'c++' when in C++ mode, this behavior is deprecated
srilm.c:142:11: error: use of undeclared identifier 'strdupa'; did you mean 'strdup'?
scp = strdupa(ngramstr);
^~~~~~~
strdup
/usr/include/string.h:117:7: note: 'strdup' declared here
char *strdup(const char *);
^
srilm.c:176:11: error: use of undeclared identifier 'strdupa'; did you mean 'strdup'?
scp = strdupa(ngramstr);
^~~~~~~
strdup
/usr/include/string.h:117:7: note: 'strdup' declared here
char *strdup(const char *);
^
srilm.c:206:11: error: use of undeclared identifier 'strdupa'; did you mean 'strdup'?
scp = strdupa(sentence);
^~~~~~~
strdup
/usr/include/string.h:117:7: note: 'strdup' declared here
char *strdup(const char *);
^
3 errors generated.
make: *** [srilm.o] Error 1

I guess strdup should be used instead of strdupa so that the modules compile.

Docs improvement: building srilm libraries as PIC

I ran into a problem getting swig-srilm to play nicely with the srilm libraries on my system, with the error:

/usr/bin/ld: /home/jmoore/src/loglanguage/srilm/srilm/lib/i686-m64/liboolm.a(Prob.o): relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC
/home/jmoore/src/loglanguage/srilm/srilm/lib/i686-m64/liboolm.a: error adding symbols: Bad value
collect2: error: ld returned 1 exit status
make: *** [_srilm.so] Error 1

As the error states, I needed to recompile Prob.o (and all the other srilm libraries) with the -fPIC flag to G++.

I couldn't find how to do that the "right" way, so I ended up editing the $SRILM/common/Makefile.machine.i686-m64 to add -fPIC to the GCC_FLAGS line.

My suggestion is to mention this in the documentation (readme or somewhere) in case somebody else runs into this same problem.
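
The Requirements section above lists MAKE_PIC=yes make as the way to build SRILM as position-independent code. As a rough sketch, a small Python helper could flag this case in captured make output by looking for ld's "recompile with -fPIC" hint. The function name needs_pic_rebuild is hypothetical, and the sample log is shortened from the error quoted in this issue.

```python
PIC_HINT = "recompile with -fPIC"


def needs_pic_rebuild(linker_output):
    """Return True if any line of the linker output carries ld's -fPIC hint."""
    return any(PIC_HINT in line for line in linker_output.splitlines())


# Shortened sample of the error quoted above (library path omitted).
log = (
    "/usr/bin/ld: liboolm.a(Prob.o): relocation R_X86_64_32 against "
    "`.rodata.str1.1' can not be used when making a shared object; "
    "recompile with -fPIC\n"
    "collect2: error: ld returned 1 exit status\n"
)
if needs_pic_rebuild(log):
    print("Rebuild SRILM with: MAKE_PIC=yes make")
```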
