mts's Introduction

Marian Translation Service

The code in this repository implements a Marian-based translation service. Currently, it offers a REST server over plain HTTP. HTTPS is deliberately not supported: in the vast majority of the use cases that I envision, the service will run locally and not behind a registered domain name, so only self-signed certificates would work. If you need HTTPS, put the Marian REST server behind a firewall and reach it through a proxy that handles HTTPS.

Currently, the server can serve only a single configuration (model or ensemble of models).

The REST API is described here.

Compilation

Local

Note that you need all the dependencies and libraries required for a full compilation of Marian.

git clone https://github.com/ugermann/mts /path/to/your/local/clone/of/mts
mkdir -p /path/to/where/you/want/to/build
cd /path/to/where/you/want/to/build
cmake /path/to/your/local/clone/of/mts
make -j
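
With GNU make, `make -j` without a number places no limit on the number of parallel jobs, which can exhaust memory on a large C++ build. As a minimal sketch, a small helper can pick a bounded job count instead. The function name `pick_jobs` and the `JOBS` override are assumptions introduced for this example; they are not part of mts or Marian.

```shell
# pick_jobs: print the number of parallel make jobs to use.
# JOBS is a hypothetical override for this sketch; mts itself does not read it.
pick_jobs() {
  if [ -n "${JOBS:-}" ]; then
    echo "$JOBS"
  else
    # getconf reports the number of online CPUs; fall back to 1 if it fails.
    getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
  fi
}

# Usage, from the build directory:
#   cmake /path/to/your/local/clone/of/mts
#   make -j"$(pick_jobs)"
```

Setting `JOBS=4` before the `make` call caps the build at four jobs, which may help on machines with many cores but little RAM.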

Dockerized Version

(coming soon, based on http://github.com/ugermann/marian-docker ...)

Configuring a Model to serve

These instructions assume that you have the following files from a Marian training run:

  • at least one model (more for ensemble decoding; do not try ensemble decoding unless your REST server has access to a GPU)
  • the respective vocabulary file(s)
  • a shortlist file for faster hypothesis generation (optional)

Preparation

  • Convert the model file(s) to binary format with

    marian-conv -f model.npz -t model.bin
    

    This is optional, but makes the model load faster.

  • Create a decoder.yml file, e.g. like this one:

    relative-paths: true
    models: [ model.bin ]
    vocabs: [ joint-vocab.spm, joint-vocab.spm ]
    alignment: true
    beam-size: 4
    normalize: 1
    word-penalty: 0
    mini-batch: 128
    maxi-batch: 100
    maxi-batch-sort: src
    
    # The following are specific to the marian REST server
    # source-language and target-language are used for the Demo
    # interface; the ssplit-prefix-file is from the Moses sentence splitter
    # and comes with the marian REST server image. Pick the right one
    # for your source language. SSPLIT_ROOT_DIR is set to the appropriate
    # value in the `mariannmt/marian-rest-server` image.
    source-language: German
    target-language: English
    ssplit-prefix-file: ${SSPLIT_ROOT_DIR}/nonbreaking_prefixes/nonbreaking_prefix.de
    
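  • Copy the appropriate nonbreaking_prefix.* file for sentence splitting to a location of your choice and adjust ssplit-prefix-file accordingly, or point SSPLIT_ROOT_DIR at the copy in your clone of the repository: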

    export SSPLIT_ROOT_DIR=/path/to/your/local/clone/of/mts/3rd_party/ssplit-cpp
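
For ensemble decoding there are several model files to convert. The conversion step above can be sketched as a loop over all of them, using only the `-f` and `-t` options shown in the `marian-conv` command. The function name `convert_models` and the `MARIAN_CONV` override are assumptions made for this example.

```shell
# convert_models: convert each given .npz model to a .bin file next to it.
# MARIAN_CONV is a hypothetical override for the converter command
# (default: marian-conv, looked up on PATH).
convert_models() {
  for npz in "$@"; do
    # Strip the .npz suffix and append .bin, e.g. model.npz -> model.bin
    "${MARIAN_CONV:-marian-conv}" -f "$npz" -t "${npz%.npz}.bin" || return 1
  done
}

# Usage (file names are placeholders):
#   convert_models model1.npz model2.npz
# The resulting .bin files would then be listed under "models:" in decoder.yml.
```

The loop stops at the first failed conversion, so a broken model file is noticed before the server is started.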
    

Running the Server

/path/to/your/build/directory/rest-server -c /path/to/decoder.yml -p <port of your choice>
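
Before launching, it can help to confirm that every file named in decoder.yml is actually readable. The following is a sketch of such a pre-flight check; `check_files` is a hypothetical helper, not something shipped with mts. Because the example configuration sets `relative-paths: true`, relative file names are assumed to be resolved against the directory that holds decoder.yml, so the check should be run from there.

```shell
# check_files: report every argument that is not a readable regular file.
# Returns 0 if all files are readable, 1 otherwise.
check_files() {
  rc=0
  for f in "$@"; do
    if [ ! -f "$f" ] || [ ! -r "$f" ]; then
      echo "missing or unreadable: $f" >&2
      rc=1
    fi
  done
  return "$rc"
}

# Usage, with the file names from the example decoder.yml:
#   cd /path/to/directory/containing/decoder.yml
#   check_files model.bin joint-vocab.spm \
#     "${SSPLIT_ROOT_DIR}/nonbreaking_prefixes/nonbreaking_prefix.de"
```

If the check fails, the names printed to standard error show which entries in decoder.yml (or which SSPLIT_ROOT_DIR setting) need fixing.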

Known bugs

  • The rest-server binary currently inherits its version info from the Marian submodule, which is incorrect.

mts's People

Contributors

ugermann, kpu, fredblain

mts's Issues

intgemm_reintegrated_computestats

For use with the NMT students in 8-bit et al, the submodule should point at the intgemm_reintegrated_computestats branch of Marian. And it should set config options consistent with that. Maybe all that needs to happen is a change in submodule pointer.
