sfst's Introduction

SFST - Stuttgart Finite State Transducer

Installation

SFST can be compiled on Unix/Linux, Windows and Mac operating systems.

The SFST command line utilities have a few external requirements.

  1. The "Flex" scanner generator, which can be downloaded from https://github.com/westes/flex. On Linux systems it can be installed with the package manager. For example, on Ubuntu, apt install flex installs it.
  2. The "Bison" parser generator, available from http://www.gnu.org/software/bison. On Linux systems it can be installed with the package manager. For example, on Ubuntu, apt install bison installs it.

After unpacking the software package, run the following in the top directory of the source code

mkdir build
cd build
cmake ..
make

to compile the tools. Then call

make install

to install the tools in /usr/local/bin. Change the variable DESTDIR in the Makefile if you want to install to a different directory. Finally call

make maninstall

to install the manpages in /usr/local/man/man1 and you are done.
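
If you have no write access to /usr/local, one option is to set the install prefix at configure time. The following is a minimal sketch, not a tested recipe for this project: CMAKE_INSTALL_PREFIX is the standard CMake variable for the install location, whereas DESTDIR is prepended to the configured prefix (so DESTDIR=~/tmp installs under ~/tmp/usr/local). The names PREFIX, SRC_DIR, BUILD_DIR, RUN, run and sfst_local_install are illustrative and not part of SFST. The -S and -B options assume CMake 3.13 or newer. By default the script only prints the commands; set RUN=1 to execute them.

```shell
#!/bin/sh
# Sketch: configure, build and install SFST under a user-writable prefix.
# PREFIX, SRC_DIR and BUILD_DIR are illustrative names, not SFST settings.
PREFIX="${PREFIX:-$HOME/.local}"
SRC_DIR="${SRC_DIR:-.}"
BUILD_DIR="${BUILD_DIR:-$SRC_DIR/build}"

# Print each command instead of running it unless RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

sfst_local_install() {
    # Configure with the install prefix pointing at the user's own directory.
    run cmake -S "$SRC_DIR" -B "$BUILD_DIR" -DCMAKE_INSTALL_PREFIX="$PREFIX" &&
    # Compile the tools.
    run cmake --build "$BUILD_DIR" &&
    # Install into $PREFIX (binaries would typically land in $PREFIX/bin).
    run cmake --build "$BUILD_DIR" --target install
}

sfst_local_install
```

For example, RUN=1 PREFIX="$HOME/sfst" sh install-local.sh (with the script saved under the hypothetical name install-local.sh in the top directory of the source code) would attempt the real build and install. Whether the install step succeeds still depends on the state of the build tree.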

The subdirectory data contains a simple example of an English morphological analyser, the source code of the German SMOR morphology (with just a few sample lexicon entries), and the general morphology XMOR which may be used as a starting point for the development of a computational morphology.

Usage

See the manual SFST-Manual.pdf in the doc subdirectory and the man pages for more information on the SFST tools. doc/SFST-Tutorial.pdf explains how computational morphologies are implemented in SFST.

If you want to implement your own tools based on the SFST code, have a look at fst-infl.C and fst-infl2.C. They show how analysers are implemented with the standard (fst-infl) and the compact (fst-infl2) transducer format.

Author

The SFST tools were implemented by Helmut Schmid, Institute for Computational Linguistics, University of Stuttgart, Germany. They are available under the GNU General Public License, version 2 or later.

Please cite the following publication if you want to refer to the SFST tools:

A Programming Language for Finite State Transducers, Proceedings of the 5th International Workshop on Finite State Methods in Natural Language Processing (FSMNLP 2005), Helsinki, Finland.

Bug reports

Please send bug reports and other feedback to [email protected].

sfst's Issues

Phrasal verbs not recognised

The sample lexicon in data/SMOR/lexicon contains several entries for phrasal verbs, e.g.

<Base_Stems>auswendig<PREF>:<><ge>lern<V><base><nativ><VVReg>

Such verbs, however, are not recognised:

> fst-mor smor.a
reading transducer...
finished.
analyze> auswendiglernen
no result for auswendiglernen
analyze> q

The problem appears to be the symbol <ge> in the lexical entry intervening between the particle and the base verb. It is taken into account by $GE in data/SMOR/deko.fst (cf. the comments there), but not in $BDKStems$ in smor.fst.

install locally rather than in /usr/local/bin (example for DESTDIR needed)

I am trying to build sfst in the current user's local workspace on Ubuntu 22.04 LTS.[1] Could you provide an example of setting DESTDIR, as described by the line "Change the variable DESTDIR in Makefile if you want to install to a different directory." in the Readme?

My naive interpretation was that this means to run the following sequence of commands:

$> git clone https://github.com/santhoshtr/sfst.git;
$> cd sfst;
$> mkdir build;
$> cd build;
$> cmake ..;
$> make DESTDIR=~/tmp;
$> make install DESTDIR=~/tmp;
$> cd ../..;

Which, however, yields

Install the project...
-- Install configuration: ""
...
-- Installing: /home/christian/tmp/usr/local/include/utf8.h
CMake Error at src/cmake_install.cmake:65 (file):
  file INSTALL cannot find
  "/home/christian/Desktop/github/nmk-corpus/scripts/sfst/build/src/fst-compiler.h":
  No such file or directory.
Call Stack (most recent call first):
  cmake_install.cmake:47 (include)

make[1]: *** [Makefile:120: install] Error 1

I guess this has a rather obvious solution, apologies for the stupid question.

[1] I need to install somewhere in the user space, because without setting DESTDIR, I get the expected permission denied error, and on the machine I want to run it on, I don't have sudo rights.

...
Install the project...
-- Install configuration: ""
-- Installing: /usr/local/lib/libsfst.so
CMake Error at src/cmake_install.cmake:52 (file):
  file INSTALL cannot copy file
  "/home/christian/Desktop/github/nmk-corpus/scripts/sfst/build/src/libsfst.so"
  to "/usr/local/lib/libsfst.so": Permission denied.
Call Stack (most recent call first):
  cmake_install.cmake:47 (include)

PS: I tried different kinds of DESTDIR paths, both absolute and relative. I also tried using the DESTDIR flag on make install only rather than on both that and the earlier call to make. No difference.

Installation fails due to 'fst-compiler.h'

Hi, I get this message when I'm installing sfst.

CMake Error at src/cmake_install.cmake:365 (file):
  file INSTALL cannot find
  "<path_to_sfst>/src/fst-compiler.h":
  No such file or directory.
Call Stack (most recent call first):
  cmake_install.cmake:47 (include)

Could I get some help?
