
spaCy API Docker

Ready-to-use Docker images for the spaCy NLP library.


spaCy API Docker is sponsored by the following tool. Please help support the project by taking a look and signing up for a free trial:

GitAds

Features

  • Use the awesome spaCy NLP framework from other programming languages.
  • Better scaling: one NLP server, multiple services.
  • Built on the official spaCy REST services.
  • Dependency parsing visualisation with displaCy.
  • Docker images for English, German, Spanish, Portuguese, Italian, Dutch and French.
  • Automated builds to stay up to date with spaCy.
  • Current spaCy version: 2.0.16

Please note that this is a completely new API that is incompatible with the previous one. If you still need the old API, use jgontrum/spacyapi:en-legacy or jgontrum/spacyapi:de-legacy.

The documentation, API, and frontend code are based upon the spaCy REST services by Explosion AI.


Images

| Image | Description |
| --- | --- |
| jgontrum/spacyapi:base_v2 | Base image for spaCy 2.0, containing no language model |
| jgontrum/spacyapi:en_v2 | English language model, spaCy 2.0 |
| jgontrum/spacyapi:de_v2 | German language model, spaCy 2.0 |
| jgontrum/spacyapi:es_v2 | Spanish language model, spaCy 2.0 |
| jgontrum/spacyapi:fr_v2 | French language model, spaCy 2.0 |
| jgontrum/spacyapi:pt_v2 | Portuguese language model, spaCy 2.0 |
| jgontrum/spacyapi:it_v2 | Italian language model, spaCy 2.0 |
| jgontrum/spacyapi:nl_v2 | Dutch language model, spaCy 2.0 |
| jgontrum/spacyapi:all_v2 | Contains EN, DE, ES, PT, NL, IT and FR language models, spaCy 2.0 |

Old releases:

| Image | Description |
| --- | --- |
| jgontrum/spacyapi:base | Base image, containing no language model |
| jgontrum/spacyapi:latest | English language model |
| jgontrum/spacyapi:en | English language model |
| jgontrum/spacyapi:de | German language model |
| jgontrum/spacyapi:es | Spanish language model |
| jgontrum/spacyapi:fr | French language model |
| jgontrum/spacyapi:all | Contains EN, DE, ES and FR language models |
| jgontrum/spacyapi:en-legacy | Old API with English model |
| jgontrum/spacyapi:de-legacy | Old API with German model |

Usage

docker run -p "127.0.0.1:8080:80" jgontrum/spacyapi:en_v2

All models are loaded at startup. Depending on the model size and server performance, this can take a few minutes.

The displaCy frontend is available at /ui.
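Because a request sent before the models finish loading will fail, a client can poll GET /models until the server answers with a non-empty list. A minimal readiness check in Python (the base URL and the retry/delay defaults are assumptions matching the docker run example above, not part of the API):

```python
import json
import time
import urllib.request

def wait_until_ready(fetch_models, retries=30, delay=2.0):
    """Poll fetch_models() until it returns a non-empty list of models.

    fetch_models: a callable that returns the decoded /models response,
    or raises OSError while the server is still starting up.
    Returns the model list, or None if the server never became ready.
    """
    for _ in range(retries):
        try:
            models = fetch_models()
            if models:
                return models
        except OSError:
            pass  # connection refused: the container is still loading
        time.sleep(delay)
    return None

def fetch_models_http(base_url="http://localhost:8080"):
    # GET /models returns a JSON list such as ["en"]
    with urllib.request.urlopen(base_url + "/models") as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `wait_until_ready(fetch_models_http)` then blocks until the model list is available.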

Docker Compose

version: '2'

services:
  spacyapi:
    image: jgontrum/spacyapi:en_v2
    ports:
      - "127.0.0.1:8080:80"
    restart: always

Running Tests

pytest is included so that the unit tests can be run locally:

docker run -it jgontrum/spacyapi:en_v2 app/env/bin/pytest app/displacy_service_tests

Special Cases

The API includes rudimentary support for specifying tokenizer special cases for your deployment. Currently only basic special cases are supported; in spaCy terms:

from spacy.symbols import ORTH
tokenizer.add_special_case("isn't", [{ORTH: "isn't"}])

They can be supplied in an environment variable corresponding to the desired language model. For example, en_special_cases or en_core_web_lg_special_cases. They are configured as a single comma-delimited string, such as "isn't,doesn't,won't".

Use the following syntax to specify basic special case rules, such as for preserving contractions:

docker run -p "127.0.0.1:8080:80" -e en_special_cases="isn't,doesn't" jgontrum/spacyapi:en_v2

You can also configure this in a .env file if using docker-compose as above.
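With docker-compose, the same setting can be kept out of the command line. A minimal sketch (the `environment` entry mirrors the `docker run -e` flag above; adjust the variable name to your model):

```yaml
# docker-compose.yml (sketch) -- passes the special-case list via the
# environment instead of the docker run -e flag shown above
version: '2'

services:
  spacyapi:
    image: jgontrum/spacyapi:en_v2
    ports:
      - "127.0.0.1:8080:80"
    restart: always
    environment:
      - en_special_cases=isn't,doesn't
```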


REST API Documentation

GET /ui/

The displaCy frontend is available here.


POST /dep

Example request:

{
  "text": "They ate the pizza with anchovies",
  "model": "en",
  "collapse_punctuation": 0,
  "collapse_phrases": 1
}
| Name | Type | Description |
| --- | --- | --- |
| text | string | text to be parsed |
| model | string | identifier string for a model installed on the server |
| collapse_punctuation | boolean | Merge punctuation onto the preceding token? |
| collapse_phrases | boolean | Merge noun chunks and named entities into single tokens? |

Example request using the Python Requests library:

import json
import requests

url = "http://localhost:8000/dep"
message_text = "They ate the pizza with anchovies"
headers = {'content-type': 'application/json'}
d = {'text': message_text, 'model': 'en'}

response = requests.post(url, data=json.dumps(d), headers=headers)
r = response.json()

Example response:

{
  "arcs": [
    { "dir": "left", "start": 0, "end": 1, "label": "nsubj" },
    { "dir": "right", "start": 1, "end": 2, "label": "dobj" },
    { "dir": "right", "start": 1, "end": 3, "label": "prep" },
    { "dir": "right", "start": 3, "end": 4, "label": "pobj" },
    { "dir": "left", "start": 2, "end": 3, "label": "prep" }
  ],
  "words": [
    { "tag": "PRP", "text": "They" },
    { "tag": "VBD", "text": "ate" },
    { "tag": "NN", "text": "the pizza" },
    { "tag": "IN", "text": "with" },
    { "tag": "NNS", "text": "anchovies" }
  ]
}
| Name | Type | Description |
| --- | --- | --- |
| arcs | array | data to generate the arrows |
| arcs[].dir | string | direction of arrow ("left" or "right") |
| arcs[].start | integer | offset of word the arrow starts on |
| arcs[].end | integer | offset of word the arrow ends on |
| arcs[].label | string | dependency label |
| words | array | data to generate the words |
| words[].tag | string | part-of-speech tag |
| words[].text | string | token |
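The response can be flattened into head/dependent pairs client-side. A small helper, assuming that `start`/`end` index into `words` and that a "left" arc is drawn from the head at `end` back to the dependent at `start` (and the reverse for "right"), as displaCy renders the arrows:

```python
def dep_triples(dep_response):
    """Flatten a /dep response into (head, label, dependent) triples.

    Assumes start/end are token offsets into `words`, with a "left"
    arc pointing from the head at `end` to the dependent at `start`,
    and a "right" arc from the head at `start` to the dependent at `end`.
    """
    words = [w["text"] for w in dep_response["words"]]
    triples = []
    for arc in dep_response["arcs"]:
        if arc["dir"] == "left":
            head, dep = arc["end"], arc["start"]
        else:
            head, dep = arc["start"], arc["end"]
        triples.append((words[head], arc["label"], words[dep]))
    return triples
```

Applied to the example response above, this yields triples such as ("ate", "nsubj", "They") and ("with", "pobj", "anchovies").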

Curl command:

curl -s localhost:8000/dep -d '{"text":"Pastafarians are smarter than people with Coca Cola bottles.", "model":"en"}'
{
  "arcs": [
    {
      "dir": "left",
      "end": 1,
      "label": "nsubj",
      "start": 0
    },
    {
      "dir": "right",
      "end": 2,
      "label": "acomp",
      "start": 1
    },
    {
      "dir": "right",
      "end": 3,
      "label": "prep",
      "start": 2
    },
    {
      "dir": "right",
      "end": 4,
      "label": "pobj",
      "start": 3
    },
    {
      "dir": "right",
      "end": 5,
      "label": "prep",
      "start": 4
    },
    {
      "dir": "right",
      "end": 6,
      "label": "pobj",
      "start": 5
    }
  ],
  "words": [
    {
      "tag": "NNPS",
      "text": "Pastafarians"
    },
    {
      "tag": "VBP",
      "text": "are"
    },
    {
      "tag": "JJR",
      "text": "smarter"
    },
    {
      "tag": "IN",
      "text": "than"
    },
    {
      "tag": "NNS",
      "text": "people"
    },
    {
      "tag": "IN",
      "text": "with"
    },
    {
      "tag": "NNS",
      "text": "Coca Cola bottles."
    }
  ]
}

POST /ent

Example request:

{
  "text": "When Sebastian Thrun started working on self-driving cars at Google in 2007, few people outside of the company took him seriously.",
  "model": "en"
}
| Name | Type | Description |
| --- | --- | --- |
| text | string | text to be parsed |
| model | string | identifier string for a model installed on the server |

Example request using the Python Requests library:

import json
import requests

url = "http://localhost:8000/ent"
message_text = "When Sebastian Thrun started working on self-driving cars at Google in 2007, few people outside of the company took him seriously."
headers = {'content-type': 'application/json'}
d = {'text': message_text, 'model': 'en'}

response = requests.post(url, data=json.dumps(d), headers=headers)
r = response.json()

Example response:

[
  { "end": 20, "start": 5, "type": "PERSON" },
  { "end": 67, "start": 61, "type": "ORG" },
  { "end": 75, "start": 71, "type": "DATE" }
]
| Name | Type | Description |
| --- | --- | --- |
| end | integer | character offset the entity ends after |
| start | integer | character offset the entity starts on |
| type | string | entity type |

Curl command:

curl -s localhost:8000/ent -d '{"text":"Pastafarians are smarter than people with Coca Cola bottles.", "model":"en"}'
[
  {
    "end": 12,
    "start": 0,
    "text": "Pastafarians",
    "type": "NORP"
  },
  {
    "end": 51,
    "start": 42,
    "text": "Coca Cola",
    "type": "ORG"
  }
]
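Since `start` and `end` are character offsets into the submitted text (per the field table above), the entity surface strings can be recovered client-side by slicing. A minimal helper:

```python
def entity_strings(text, ents):
    """Recover entity surface strings from a /ent response by slicing
    the original text with the start/end character offsets."""
    return [(text[e["start"]:e["end"]], e["type"]) for e in ents]
```

For the Sebastian Thrun example above, this returns [("Sebastian Thrun", "PERSON"), ("Google", "ORG"), ("2007", "DATE")].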

POST /sents

Example request:

{
  "text": "In 2012 I was a mediocre developer. But today I am at least a bit better.",
  "model": "en"
}
| Name | Type | Description |
| --- | --- | --- |
| text | string | text to be parsed |
| model | string | identifier string for a model installed on the server |

Example request using the Python Requests library:

import json
import requests

url = "http://localhost:8000/sents"
message_text = "In 2012 I was a mediocre developer. But today I am at least a bit better."
headers = {'content-type': 'application/json'}
d = {'text': message_text, 'model': 'en'}

response = requests.post(url, data=json.dumps(d), headers=headers)
r = response.json()

Example response:

["In 2012 I was a mediocre developer.", "But today I am at least a bit better."]

POST /sents_dep

A combination of /sents and /dep: returns the sentences together with their dependency parses.

Example request:

{
  "text": "In 2012 I was a mediocre developer. But today I am at least a bit better.",
  "model": "en"
}
| Name | Type | Description |
| --- | --- | --- |
| text | string | text to be parsed |
| model | string | identifier string for a model installed on the server |

Example request using the Python Requests library:

import json
import requests

url = "http://localhost:8000/sents_dep"
message_text = "In 2012 I was a mediocre developer. But today I am at least a bit better."
headers = {'content-type': 'application/json'}
d = {'text': message_text, 'model': 'en'}

response = requests.post(url, data=json.dumps(d), headers=headers)
r = response.json()

Example response:

[
  {
    "sentence": "In 2012 I was a mediocre developer.",
    "dep_parse": {
      "arcs": [
        {
          "dir": "left",
          "end": 3,
          "label": "prep",
          "start": 0,
          "text": "In"
        },
        {
          "dir": "right",
          "end": 1,
          "label": "pobj",
          "start": 0,
          "text": "2012"
        },
        {
          "dir": "left",
          "end": 3,
          "label": "nsubj",
          "start": 2,
          "text": "I"
        },
        {
          "dir": "left",
          "end": 6,
          "label": "det",
          "start": 4,
          "text": "a"
        },
        {
          "dir": "left",
          "end": 6,
          "label": "amod",
          "start": 5,
          "text": "mediocre"
        },
        {
          "dir": "right",
          "end": 6,
          "label": "attr",
          "start": 3,
          "text": "developer"
        },
        {
          "dir": "right",
          "end": 7,
          "label": "punct",
          "start": 3,
          "text": "."
        }
      ],
      "words": [
        {
          "tag": "IN",
          "text": "In"
        },
        {
          "tag": "CD",
          "text": "2012"
        },
        {
          "tag": "PRP",
          "text": "I"
        },
        {
          "tag": "VBD",
          "text": "was"
        },
        {
          "tag": "DT",
          "text": "a"
        },
        {
          "tag": "JJ",
          "text": "mediocre"
        },
        {
          "tag": "NN",
          "text": "developer"
        },
        {
          "tag": ".",
          "text": "."
        }
      ]
    }
  },
  {
    "sentence": "But today I am at least a bit better.",
    "dep_parse": {
      "arcs": [
        {
          "dir": "left",
          "end": 11,
          "label": "cc",
          "start": 8,
          "text": "But"
        },
        {
          "dir": "left",
          "end": 11,
          "label": "npadvmod",
          "start": 9,
          "text": "today"
        },
        {
          "dir": "left",
          "end": 11,
          "label": "nsubj",
          "start": 10,
          "text": "I"
        },
        {
          "dir": "left",
          "end": 13,
          "label": "advmod",
          "start": 12,
          "text": "at"
        },
        {
          "dir": "left",
          "end": 15,
          "label": "advmod",
          "start": 13,
          "text": "least"
        },
        {
          "dir": "left",
          "end": 15,
          "label": "det",
          "start": 14,
          "text": "a"
        },
        {
          "dir": "left",
          "end": 16,
          "label": "npadvmod",
          "start": 15,
          "text": "bit"
        },
        {
          "dir": "right",
          "end": 16,
          "label": "acomp",
          "start": 11,
          "text": "better"
        },
        {
          "dir": "right",
          "end": 17,
          "label": "punct",
          "start": 11,
          "text": "."
        }
      ],
      "words": [
        {
          "tag": "CC",
          "text": "But"
        },
        {
          "tag": "NN",
          "text": "today"
        },
        {
          "tag": "PRP",
          "text": "I"
        },
        {
          "tag": "VBP",
          "text": "am"
        },
        {
          "tag": "IN",
          "text": "at"
        },
        {
          "tag": "JJS",
          "text": "least"
        },
        {
          "tag": "DT",
          "text": "a"
        },
        {
          "tag": "NN",
          "text": "bit"
        },
        {
          "tag": "RBR",
          "text": "better"
        },
        {
          "tag": ".",
          "text": "."
        }
      ]
    }
  }
]

GET /models

List the names of models installed on the server.

Example request:

GET /models

Example response:

["en", "de"]

GET /{model}/schema

Example request:

GET /en/schema
| Name | Type | Description |
| --- | --- | --- |
| model | string | identifier string for a model installed on the server |

Example response:

{
  "dep_types": ["ROOT", "nsubj"],
  "ent_types": ["PERSON", "LOC", "ORG"],
  "pos_types": ["NN", "VBZ", "SP"]
}

GET /version

Shows the spaCy version in use.

Example request:

GET /version

Example response:

{
  "spacy": "2.2.4"
}

Contributors

afshinm, avinashrubird, bastienbot, dbkaplun, dparlevliet, epugh, ines, jgontrum, matityahul, matthewarmand, mjfox3


spacy-api-docker's Issues

Errors while running jgontrum/spacyapi:en_v2

Thank you for this project. Have been using v1 with much success.

Tried running jgontrum/spacyapi:en_v2 on both a MacBook Pro 10.14 & a fresh Ubuntu setup & got the same error output:

2018-11-03 17:59:52,943 INFO Included extra file "/etc/supervisor/conf.d/supervisor.conf" during parsing
2018-11-03 17:59:52,955 INFO RPC interface 'supervisor' initialized
2018-11-03 17:59:52,955 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2018-11-03 17:59:52,955 INFO supervisord started with pid 7
2018-11-03 17:59:53,958 INFO spawned: 'nginx' with pid 10
2018-11-03 17:59:53,960 INFO spawned: 'api' with pid 11
2018-11-03 17:59:53,962 INFO spawned: 'frontend' with pid 12
2018-11-03 17:59:54,573 INFO exited: frontend (exit status 2; not expected)
2018-11-03 17:59:55,574 INFO success: nginx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-11-03 17:59:55,575 INFO success: api entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-11-03 17:59:55,577 INFO spawned: 'frontend' with pid 27
2018-11-03 17:59:56,145 INFO exited: frontend (exit status 2; not expected)
2018-11-03 17:59:58,150 INFO spawned: 'frontend' with pid 35
2018-11-03 17:59:58,782 INFO exited: frontend (exit status 2; not expected)
2018-11-03 18:00:01,788 INFO spawned: 'frontend' with pid 43
2018-11-03 18:00:02,495 INFO exited: frontend (exit status 2; not expected)
2018-11-03 18:00:02,496 INFO gave up: frontend entered FATAL state, too many start retries too quickly

Please let me know what other information would be helpful.

Greek language support

Hi,

Is it planned to add greek language support for jgontrum/spacyapi:all_v2 ?

Best
Alexander

French model not working

docker run -p "127.0.0.1:8080:80" jgontrum/spacyapi:fr

http://localhost:8080/ui

UI loads correctly, however any requests from the sentence entry simply do not return. The loading icon spins and nothing else happens.

I have tried en and en_v2. Both work as they should.

When testing this no other containers were running.

edit: curl commands fail as well.

curl -s localhost:80/dep -d '{"text":"Bonjour Justin Trudeau.", "model":"fr"}'

{"title":"Dependency parsing failed","description":"'NoneType' object is not callable"}

Use different ports behind nginx

Hi, I'm trying to run this on Google Cloud Run, which requires listening on port 8080. Right now this port is used internally, so it doesn't run when trying to use PORT=8080. Would it be possible to move the internal ports into a less frequently used port range (e.g. 32543)?

jgontrum/spacyapi:en_v2 | Dependency parsing failed: Can't find model 'en'

For some reason, the following:

{
  'model': 'en', 
  'collapse_phrases': True, 
  'text': 'I paid him $100,- for nothing!', 
  'collapse_punctuation': False
}

Gives me

Dependency parsing failed: Can't find model 'en'

from the server.

I am running

docker run -p "127.0.0.1:8082:80" jgontrum/spacyapi:en_v2

Do I have to set another model name? I already tried en_v2.

Example curl doesn't work

 $ curl http://localhost:5000/api --header 'content-type: application/json' --data '{text: "This is a text that I want to be analyzed."}' -X POST
{"message": "The browser (or proxy) sent a request that this server could not understand."}

I assume you meant something like this instead, which does work:

 $ curl http://localhost:5000/api --header 'Content-Type: application/json' --data '{"text": "This is a text that I want to be analyzed."}' -X POST
{"numOfSentences": 1, "lang": "en", "performance": [0.18333840370178223], "sentences": [[{"pos": "DET", "email": false, "ner": "", "stop": true, "token": ...

The REST API returns JSON payload with Content-Type "text/string"

This makes some clients need to bend over backwards to process the results. The fix for this seems to be fairly easy, just change the HUG annotations, for example:
@hug.post("/dep", output=hug.output_format.json)
This would be, of course, a change in the spacy-services repository, but I can't open an issue there.

POS Tree

Would it be possible to return the POS TREE next to the token?

Similarity route?

Maybe I missed it but it looks like there is currently no implementation for spaCy's semantic similarity method.

Is this related to the fact that most of the models used are the "small" version, which generally do not perform similarity calculations well?

Upgrade to Falcon 3 for CORS

Hello,
I needed to authorize CORS on the Docker API, and generate an updated Docker image.
To solve it, I upgraded the falcon dependency to 3.0.1, in order to use simple CORS configuration here on line 1 :

APP = falcon.API(cors_enable=True)
APP.add_route('/dep', DepResource())
APP.add_route('/ent', EntResource())
APP.add_route('/sents', SentsResources())
APP.add_route('/sents_dep', SentsDepResources())
APP.add_route('/{model_name}/schema', SchemaResource())
APP.add_route('/models', ModelsResource())
APP.add_route('/version', VersionResource())

If you want me to integrate these changes in your repo, keep me posted ;)

No output for curl command

Hi All,

I cloned the repo and ran the cli:

Prabuddhs-MacBook-Air:spacy-api-docker pg$ docker run -p "127.0.0.1:8080:80" jgontrum/spacyapi:en
/usr/lib/python2.7/dist-packages/supervisor/options.py:296: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
'Supervisord is running as root and it is searching '
2017-10-20 15:46:48,043 CRIT Supervisor running as root (no user in config file)
2017-10-20 15:46:48,043 WARN Included extra file "/etc/supervisor/conf.d/supervisor.conf" during parsing
2017-10-20 15:46:48,070 INFO RPC interface 'supervisor' initialized
2017-10-20 15:46:48,070 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2017-10-20 15:46:48,072 INFO supervisord started with pid 1
2017-10-20 15:46:49,081 INFO spawned: 'nginx' with pid 7
2017-10-20 15:46:49,087 INFO spawned: 'api' with pid 8
2017-10-20 15:46:49,093 INFO spawned: 'frontend' with pid 9
2017-10-20 15:46:50,149 INFO success: nginx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-10-20 15:46:50,150 INFO success: api entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-10-20 15:46:50,150 INFO success: frontend entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)

when I run the curl command:
curl -s localhost:8000/ent -d '{"text":"Pastafarians are smarter than people with Coca Cola bottles.", "model":"en"}'

I don't get any output.

How can I download en_core_web_lg?

Hey Johannes!

Thanks for this awesome work here!

I was testing the docker image for English out and noticed that it by default downloads en_core_web_sm.

I tried to get it to download en_core_web_lg through various methods, including by changing the code as follows and passing the ENV variable languages to be en_core_web_lg-2.1.0 -
https://github.com/nitinthewiz/spacy-api-docker/blob/master/displacy_service/scripts/download.py#L10

But this process fails, because the API server doesn't start, as it complains about spacy.load("en_core_web_lg-2.1.0") failing.

spacy.load seems to expect "en" only, and doesn't work with anything else.

Could you tell me if there's a way to convince the docker image to run with en_core_web_lg?

Thanks a lot!

/dep

Hello: first, thank you for making this available as a docker image!

I pulled en_v2, and /ui works but not /dep - I pulled en (v1) and /ui and /dep works.
I don't know if that is documented anywhere?

-David

Adding a new entry point for POS tagger only output

Hi,
I'm thinking in adding a new entry point /tag to retrieve a result of the POS tagging of a document with detailed output for each token.
My basic idea is to accept a json request with the following body:

{
text : "text",
model: "model",
include_sentences : true|false, #include a sentence level or not in the output
attr_filter : [ ] #list of token attributes to include in the output, like ["lemma", "pos", ... "is_stop", ...]
}

The output could be a list of tokens like:
[ { text : "text", start : 111, end : 222, lemma : "lemma", ... } , {}, .. , ]

optionally with an additional sentence level like:

[
 { text : "sentence text", start : 0, end : 100000, tokens : [ {}, ... {}] },
...
]

What do you think? I need something like that to use spaCy from a Java program.

Best regards

Olivier

Custom model?

New to spaCy. Can you describe how one might add their own pre-trained model to the Docker package?

text classification rest api

Hi, it looks like this spaCy API is close to what I need, but I want to create a Docker spaCy API, pre-trained, to analyse a text string and categorize it. Is this something I can do with this project? If not, any advice? Thanks

problem running legacy version

So I am running your legacy one (since according to the docs has more features, like LEMMA etc), but I am getting stuck:
docker run --env PORT=5050 jgontrum/spacyapi

But I can't hit it with curl like the docs here say

I tried without the port number, and with (as above) and I just get failed to connect to port

What am I missing?

Sentence Boundary Detection?

First, thank you for creating this REST API capabile Docker container for Spacy. I have set up a Spacy server using it and I am successfully able to get parse trees for sentences I submit over the REST API.

I would like to be able to do sentence boundary detection too. Is there a way to use the Docker container to do that? If not, how hard would it be for me to enhance the REST API to be able to do that too? I'm an experienced C/C++ and JavaScript programmer of many years, with about a year of Python experience too.

Which license?

Hi Johannes,

could you please add a license to this repo?

Best
Arne

Frontend and API exit with status code 2

Hi! First of all, thank you for containerizing all the spacy models. It's been very helpful for us.

I am trying to run the container on Google Cloud Run. Basically Cloud Run prefers listening on the port 8080 and as the container is using nginx i have replaced the default.conf file to make nginx expose port 8080 as given on the link - https://stackoverflow.com/questions/47364019/how-to-change-the-port-of-nginx-when-using-with-docker#:~:text=If%20you%20want%20to%20change,conf%20file%20inside%20the%20container.&text=navigating%20to%20localhost%3A3333%20in,to%20include%20the%20default%20nginx.

** Dockerfile **

FROM jgontrum/spacyapi:base_v2
RUN pip install wheel
ENV languages "en_core_web_lg"
RUN cd /app && env/bin/download_models
COPY default.conf /etc/nginx/conf.d/
COPY nginx.conf /etc/nginx/
EXPOSE 8080

The problem is that it gives a 502 error on hitting any api. Here are the logs:

/usr/lib/python2.7/dist-packages/supervisor/options.py:461: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
'Supervisord is running as root and it is searching '
2020-06-25 22:20:43,107 CRIT Supervisor is running as root. Privileges were not dropped because no user is specified in the config file. If you intend to run as root, you can set user=root in the config file to avoid this message.
2020-06-25 22:20:43,107 INFO Included extra file "/etc/supervisor/conf.d/supervisor.conf" during parsing
2020-06-25 22:20:43,132 INFO RPC interface 'supervisor' initialized
2020-06-25 22:20:43,132 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2020-06-25 22:20:43,133 INFO supervisord started with pid 7
2020-06-25 22:20:44,136 INFO spawned: 'nginx' with pid 10
2020-06-25 22:20:44,142 INFO spawned: 'api' with pid 11
2020-06-25 22:20:44,145 INFO spawned: 'frontend' with pid 12
2020-06-25 22:20:45,333 INFO success: nginx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2020-06-25 22:20:45,334 INFO success: api entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2020-06-25 22:20:45,334 INFO success: frontend entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2020-06-25 22:20:49,475 INFO exited: frontend (exit status 2; expected)
2020-06-25 22:21:34,723 INFO exited: api (exit status 2; expected)

I can't figure out what is the problem here. Any help is appreciated.

The docker documentation is incorrect and confusing

First, I love how you wrapped up spacy in a docker container.

But the docker hub page is out of date
https://hub.docker.com/r/jgontrum/spacyapi/

  1. "Updated to spaCy 1.2.0" - on GitHub the version is 2.0.16. There is no mention of v2.
  2. Usage of port 5000 -
    "docker run --name spacyapi -d -p 127.0.0.1:5000:5000 jgontrum/spacyapi:en"
    &
    "curl http://localhost:5000/api --header 'content-type: application/json' --data '{text: "This is a text that I want to be analyzed."}' -X POST"

But in the new docker image it works on port 80 (internally):
"docker run -p "127.0.0.1:8080:80" jgontrum/spacyapi:en_v2"
&
"curl -s localhost:8000/dep -d '{"text":"Pastafarians are smarter than people with Coca Cola bottles.", "model":"en"}'"

(btw: even on the GitHub page, the usage of port 8000 is confusing to the unsuspecting user, since the run command maps external port 8080)

  3. The API has completely changed

Please update the docker hub page to reflect the same info as the github one

Spacy 2.3.0/3.x support

Hi,

my company is currently doing some lightweight text processing using spacy, it is a really great library for NLP tasks :-) .
We would like to integrate it as a standalone service in our application, using your dockerized application seems to be the right solution for us.
Do you have a time estimate when spacy 2.3.0 (or spacy 3.x) will be available as a docker image?

Best
Alexander

Is it possible to train data?

Is it possible to use the spacy-api-docker image also to train data? For example I want to send training data for custom entities like:

TRAIN_DATA = [
    (
        "Horses are too tall and they pretend to care about your feelings",
        {"entities": [(0, 6, MY_ANIMAL)]},
    ),
    ("Do they bite?", {"entities": []}),
    (
        "horses are too tall and they pretend to care about your feelings",
        {"entities": [(0, 6, MY_ANIMAL)]},
    ),
]

If you get an SSL error on data download, here's a fixed command

I've had to set up the model hosting stuff afresh, and there are a couple of teething problems, seemingly around SSL. If you have issues with the python -m spacy.en.download command, this should work:

sputnik --name spacy --repository-url http://index.spacy.io install en==1.1.0

Thanks for publishing this! I've wanted a Docker container for a while, but haven't used Docker, so I never got around to setting it up.

"Schema construction failed" when getting schema on model "en_v2"

  1. Create file docker-compose.yml:

     version: "2"

     services:
       spacyapi:
         image: jgontrum/spacyapi:en_v2
         ports:
           - "127.0.0.1:8080:80"
         restart: always

  2. Execute docker-compose up
  3. Visit GET /en/schema

Result:

{
    "title": "Schema construction failed",
    "description": "'NoneType' object is not subscriptable"
}

ISSUE: Docker image starts and then stops after a few seconds

I followed the instructions and pulled the docker image then ran the container

The container starts to run then stops after a few seconds
The only message visible on the container logs is: "Killed"

Other containers are working just fine.

Running on Windows 7 pro, 64 bit, Oracle VM Virtual Box 5.1.22

Docker version:
Client:
Version: 17.05.0-ce
API version: 1.29
Go version: go1.7.5
Git commit: 89658be
Built: Fri May 5 15:36:11 2017
OS/Arch: windows/amd64

Server:
Version: 17.05.0-ce
API version: 1.29 (minimum version 1.12)
Go version: go1.7.5
Git commit: 89658be
Built: Thu May 4 21:43:09 2017
OS/Arch: linux/amd64
Experimental: false

Some requests hang forever

I have deployed jgontrum/spacyapi:en_v2 in Kubernetes; however, I am failing to execute POST requests even from inside the container:
e.g. curl -s localhost:80/models returns ['en']
the same works for curl -s localhost:8000/models
but when I try curl -s localhost:80/dep -d '{"text":"Pastafarians are smarter than people with Coca Colabottles."}' I get a timeout from nginx:

504 Gateway Time-out (nginx/1.10.3)

The same request sent to port 8000 hangs forever.
