
spaCy API Docker

Ready-to-use Docker images for the spaCy NLP library.


spaCy API Docker is sponsored by the following tool. Please help support the project by taking a look and signing up for a free trial:

GitAds

Features

  • Use the awesome spaCy NLP framework from other programming languages.
  • Better scaling: one NLP server, multiple services.
  • Built on the official spaCy REST services.
  • Dependency parsing visualisation with displaCy.
  • Docker images for English, German, Spanish, Portuguese, Italian, Dutch and French.
  • Automated builds to stay up to date with spaCy.
  • Current spaCy version: 2.0.16

Please note that this is a completely new API that is incompatible with the previous one. If you still need the old API, use jgontrum/spacyapi:en-legacy or jgontrum/spacyapi:de-legacy.

The documentation, API, and frontend code are based upon the spaCy REST services by Explosion AI.


Images

| Image | Description |
| --- | --- |
| jgontrum/spacyapi:base_v2 | Base image for spaCy 2.0, containing no language model |
| jgontrum/spacyapi:en_v2 | English language model, spaCy 2.0 |
| jgontrum/spacyapi:de_v2 | German language model, spaCy 2.0 |
| jgontrum/spacyapi:es_v2 | Spanish language model, spaCy 2.0 |
| jgontrum/spacyapi:fr_v2 | French language model, spaCy 2.0 |
| jgontrum/spacyapi:pt_v2 | Portuguese language model, spaCy 2.0 |
| jgontrum/spacyapi:it_v2 | Italian language model, spaCy 2.0 |
| jgontrum/spacyapi:nl_v2 | Dutch language model, spaCy 2.0 |
| jgontrum/spacyapi:all_v2 | Contains EN, DE, ES, PT, NL, IT and FR language models, spaCy 2.0 |

Old releases:

| Image | Description |
| --- | --- |
| jgontrum/spacyapi:base | Base image, containing no language model |
| jgontrum/spacyapi:latest | English language model |
| jgontrum/spacyapi:en | English language model |
| jgontrum/spacyapi:de | German language model |
| jgontrum/spacyapi:es | Spanish language model |
| jgontrum/spacyapi:fr | French language model |
| jgontrum/spacyapi:all | Contains EN, DE, ES and FR language models |
| jgontrum/spacyapi:en-legacy | Old API with English model |
| jgontrum/spacyapi:de-legacy | Old API with German model |

Usage

docker run -p "127.0.0.1:8080:80" jgontrum/spacyapi:en_v2

All models are loaded at startup. Depending on the model size and server performance, this can take a few minutes.

The displaCy frontend is available at /ui.
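Because a request sent before the models finish loading will fail, a client can poll GET /models until the server answers with a non-empty list. A minimal readiness check in Python (the base URL and the retry/delay defaults are assumptions matching the docker run example above, not part of the API):

```python
import json
import time
import urllib.request

def wait_until_ready(fetch_models, retries=30, delay=2.0):
    """Poll fetch_models() until it returns a non-empty list of models.

    fetch_models: a callable that returns the decoded /models response,
    or raises OSError while the server is still starting up.
    Returns the model list, or None if the server never became ready.
    """
    for _ in range(retries):
        try:
            models = fetch_models()
            if models:
                return models
        except OSError:
            pass  # connection refused: the container is still loading
        time.sleep(delay)
    return None

def fetch_models_http(base_url="http://localhost:8080"):
    # GET /models returns a JSON list such as ["en"]
    with urllib.request.urlopen(base_url + "/models") as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `wait_until_ready(fetch_models_http)` then blocks until the model list is available.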

Docker Compose

version: '2'

services:
  spacyapi:
    image: jgontrum/spacyapi:en_v2
    ports:
      - "127.0.0.1:8080:80"
    restart: always

Running Tests

pytest is included so that the unit tests can be run locally:

docker run -it jgontrum/spacyapi:en_v2 app/env/bin/pytest app/displacy_service_tests

Special Cases

The API includes rudimentary support for specifying tokenizer special cases for your deployment. Currently only basic special cases are supported; in spaCy terms:

from spacy.symbols import ORTH
tokenizer.add_special_case("isn't", [{ORTH: "isn't"}])

They can be supplied in an environment variable corresponding to the desired language model. For example, en_special_cases or en_core_web_lg_special_cases. They are configured as a single comma-delimited string, such as "isn't,doesn't,won't".

Use the following syntax to specify basic special case rules, such as for preserving contractions:

docker run -p "127.0.0.1:8080:80" -e en_special_cases="isn't,doesn't" jgontrum/spacyapi:en_v2

You can also configure this in a .env file if using docker-compose as above.
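With docker-compose, the same setting can be kept out of the command line. A minimal sketch (the `environment` entry mirrors the `docker run -e` flag above; adjust the variable name to your model):

```yaml
# docker-compose.yml (sketch) -- passes the special-case list via the
# environment instead of the docker run -e flag shown above
version: '2'

services:
  spacyapi:
    image: jgontrum/spacyapi:en_v2
    ports:
      - "127.0.0.1:8080:80"
    restart: always
    environment:
      - en_special_cases=isn't,doesn't
```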


REST API Documentation

GET /ui/

The displaCy frontend is available here.


POST /dep

Example request:

{
  "text": "They ate the pizza with anchovies",
  "model": "en",
  "collapse_punctuation": 0,
  "collapse_phrases": 1
}
| Name | Type | Description |
| --- | --- | --- |
| text | string | text to be parsed |
| model | string | identifier string for a model installed on the server |
| collapse_punctuation | boolean | Merge punctuation onto the preceding token? |
| collapse_phrases | boolean | Merge noun chunks and named entities into single tokens? |

Example request using the Python Requests library:

import json
import requests

url = "http://localhost:8000/dep"
message_text = "They ate the pizza with anchovies"
headers = {'content-type': 'application/json'}
d = {'text': message_text, 'model': 'en'}

response = requests.post(url, data=json.dumps(d), headers=headers)
r = response.json()

Example response:

{
  "arcs": [
    { "dir": "left", "start": 0, "end": 1, "label": "nsubj" },
    { "dir": "right", "start": 1, "end": 2, "label": "dobj" },
    { "dir": "right", "start": 1, "end": 3, "label": "prep" },
    { "dir": "right", "start": 3, "end": 4, "label": "pobj" },
    { "dir": "left", "start": 2, "end": 3, "label": "prep" }
  ],
  "words": [
    { "tag": "PRP", "text": "They" },
    { "tag": "VBD", "text": "ate" },
    { "tag": "NN", "text": "the pizza" },
    { "tag": "IN", "text": "with" },
    { "tag": "NNS", "text": "anchovies" }
  ]
}
| Name | Type | Description |
| --- | --- | --- |
| arcs | array | data to generate the arrows |
| arcs[].dir | string | direction of arrow ("left" or "right") |
| arcs[].start | integer | offset of word the arrow starts on |
| arcs[].end | integer | offset of word the arrow ends on |
| arcs[].label | string | dependency label |
| words | array | data to generate the words |
| words[].tag | string | part-of-speech tag |
| words[].text | string | token |
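The response can be flattened into head/dependent pairs client-side. A small helper, assuming that `start`/`end` index into `words` and that a "left" arc is drawn from the head at `end` back to the dependent at `start` (and the reverse for "right"), as displaCy renders the arrows:

```python
def dep_triples(dep_response):
    """Flatten a /dep response into (head, label, dependent) triples.

    Assumes start/end are token offsets into `words`, with a "left"
    arc pointing from the head at `end` to the dependent at `start`,
    and a "right" arc from the head at `start` to the dependent at `end`.
    """
    words = [w["text"] for w in dep_response["words"]]
    triples = []
    for arc in dep_response["arcs"]:
        if arc["dir"] == "left":
            head, dep = arc["end"], arc["start"]
        else:
            head, dep = arc["start"], arc["end"]
        triples.append((words[head], arc["label"], words[dep]))
    return triples
```

Applied to the example response above, this yields triples such as ("ate", "nsubj", "They") and ("with", "pobj", "anchovies").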

Curl command:

curl -s localhost:8000/dep -d '{"text":"Pastafarians are smarter than people with Coca Cola bottles.", "model":"en"}'
{
  "arcs": [
    {
      "dir": "left",
      "end": 1,
      "label": "nsubj",
      "start": 0
    },
    {
      "dir": "right",
      "end": 2,
      "label": "acomp",
      "start": 1
    },
    {
      "dir": "right",
      "end": 3,
      "label": "prep",
      "start": 2
    },
    {
      "dir": "right",
      "end": 4,
      "label": "pobj",
      "start": 3
    },
    {
      "dir": "right",
      "end": 5,
      "label": "prep",
      "start": 4
    },
    {
      "dir": "right",
      "end": 6,
      "label": "pobj",
      "start": 5
    }
  ],
  "words": [
    {
      "tag": "NNPS",
      "text": "Pastafarians"
    },
    {
      "tag": "VBP",
      "text": "are"
    },
    {
      "tag": "JJR",
      "text": "smarter"
    },
    {
      "tag": "IN",
      "text": "than"
    },
    {
      "tag": "NNS",
      "text": "people"
    },
    {
      "tag": "IN",
      "text": "with"
    },
    {
      "tag": "NNS",
      "text": "Coca Cola bottles."
    }
  ]
}

POST /ent

Example request:

{
  "text": "When Sebastian Thrun started working on self-driving cars at Google in 2007, few people outside of the company took him seriously.",
  "model": "en"
}
| Name | Type | Description |
| --- | --- | --- |
| text | string | text to be parsed |
| model | string | identifier string for a model installed on the server |

Example request using the Python Requests library:

import json
import requests

url = "http://localhost:8000/ent"
message_text = "When Sebastian Thrun started working on self-driving cars at Google in 2007, few people outside of the company took him seriously."
headers = {'content-type': 'application/json'}
d = {'text': message_text, 'model': 'en'}

response = requests.post(url, data=json.dumps(d), headers=headers)
r = response.json()

Example response:

[
  { "end": 20, "start": 5, "type": "PERSON" },
  { "end": 67, "start": 61, "type": "ORG" },
  { "end": 75, "start": 71, "type": "DATE" }
]
| Name | Type | Description |
| --- | --- | --- |
| end | integer | character offset the entity ends after |
| start | integer | character offset the entity starts on |
| type | string | entity type |

Curl command:

curl -s localhost:8000/ent -d '{"text":"Pastafarians are smarter than people with Coca Cola bottles.", "model":"en"}'
[
  {
    "end": 12,
    "start": 0,
    "text": "Pastafarians",
    "type": "NORP"
  },
  {
    "end": 51,
    "start": 42,
    "text": "Coca Cola",
    "type": "ORG"
  }
]
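Since `start` and `end` are character offsets into the submitted text (per the field table above), the entity surface strings can be recovered client-side by slicing. A minimal helper:

```python
def entity_strings(text, ents):
    """Recover entity surface strings from a /ent response by slicing
    the original text with the start/end character offsets."""
    return [(text[e["start"]:e["end"]], e["type"]) for e in ents]
```

For the Sebastian Thrun example above, this returns [("Sebastian Thrun", "PERSON"), ("Google", "ORG"), ("2007", "DATE")].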

POST /sents

Example request:

{
  "text": "In 2012 I was a mediocre developer. But today I am at least a bit better.",
  "model": "en"
}
| Name | Type | Description |
| --- | --- | --- |
| text | string | text to be parsed |
| model | string | identifier string for a model installed on the server |

Example request using the Python Requests library:

import json
import requests

url = "http://localhost:8000/sents"
message_text = "In 2012 I was a mediocre developer. But today I am at least a bit better."
headers = {'content-type': 'application/json'}
d = {'text': message_text, 'model': 'en'}

response = requests.post(url, data=json.dumps(d), headers=headers)
r = response.json()

Example response:

["In 2012 I was a mediocre developer.", "But today I am at least a bit better."]

POST /sents_dep

A combination of /sents and /dep: returns the sentences together with their dependency parses.

Example request:

{
  "text": "In 2012 I was a mediocre developer. But today I am at least a bit better.",
  "model": "en"
}
| Name | Type | Description |
| --- | --- | --- |
| text | string | text to be parsed |
| model | string | identifier string for a model installed on the server |

Example request using the Python Requests library:

import json
import requests

url = "http://localhost:8000/sents_dep"
message_text = "In 2012 I was a mediocre developer. But today I am at least a bit better."
headers = {'content-type': 'application/json'}
d = {'text': message_text, 'model': 'en'}

response = requests.post(url, data=json.dumps(d), headers=headers)
r = response.json()

Example response:

[
  {
    "sentence": "In 2012 I was a mediocre developer.",
    "dep_parse": {
      "arcs": [
        {
          "dir": "left",
          "end": 3,
          "label": "prep",
          "start": 0,
          "text": "In"
        },
        {
          "dir": "right",
          "end": 1,
          "label": "pobj",
          "start": 0,
          "text": "2012"
        },
        {
          "dir": "left",
          "end": 3,
          "label": "nsubj",
          "start": 2,
          "text": "I"
        },
        {
          "dir": "left",
          "end": 6,
          "label": "det",
          "start": 4,
          "text": "a"
        },
        {
          "dir": "left",
          "end": 6,
          "label": "amod",
          "start": 5,
          "text": "mediocre"
        },
        {
          "dir": "right",
          "end": 6,
          "label": "attr",
          "start": 3,
          "text": "developer"
        },
        {
          "dir": "right",
          "end": 7,
          "label": "punct",
          "start": 3,
          "text": "."
        }
      ],
      "words": [
        {
          "tag": "IN",
          "text": "In"
        },
        {
          "tag": "CD",
          "text": "2012"
        },
        {
          "tag": "PRP",
          "text": "I"
        },
        {
          "tag": "VBD",
          "text": "was"
        },
        {
          "tag": "DT",
          "text": "a"
        },
        {
          "tag": "JJ",
          "text": "mediocre"
        },
        {
          "tag": "NN",
          "text": "developer"
        },
        {
          "tag": ".",
          "text": "."
        }
      ]
    }
  },
  {
    "sentence": "But today I am at least a bit better.",
    "dep_parse": {
      "arcs": [
        {
          "dir": "left",
          "end": 11,
          "label": "cc",
          "start": 8,
          "text": "But"
        },
        {
          "dir": "left",
          "end": 11,
          "label": "npadvmod",
          "start": 9,
          "text": "today"
        },
        {
          "dir": "left",
          "end": 11,
          "label": "nsubj",
          "start": 10,
          "text": "I"
        },
        {
          "dir": "left",
          "end": 13,
          "label": "advmod",
          "start": 12,
          "text": "at"
        },
        {
          "dir": "left",
          "end": 15,
          "label": "advmod",
          "start": 13,
          "text": "least"
        },
        {
          "dir": "left",
          "end": 15,
          "label": "det",
          "start": 14,
          "text": "a"
        },
        {
          "dir": "left",
          "end": 16,
          "label": "npadvmod",
          "start": 15,
          "text": "bit"
        },
        {
          "dir": "right",
          "end": 16,
          "label": "acomp",
          "start": 11,
          "text": "better"
        },
        {
          "dir": "right",
          "end": 17,
          "label": "punct",
          "start": 11,
          "text": "."
        }
      ],
      "words": [
        {
          "tag": "CC",
          "text": "But"
        },
        {
          "tag": "NN",
          "text": "today"
        },
        {
          "tag": "PRP",
          "text": "I"
        },
        {
          "tag": "VBP",
          "text": "am"
        },
        {
          "tag": "IN",
          "text": "at"
        },
        {
          "tag": "JJS",
          "text": "least"
        },
        {
          "tag": "DT",
          "text": "a"
        },
        {
          "tag": "NN",
          "text": "bit"
        },
        {
          "tag": "RBR",
          "text": "better"
        },
        {
          "tag": ".",
          "text": "."
        }
      ]
    }
  }
]

GET /models

List the names of models installed on the server.

Example request:

GET /models

Example response:

["en", "de"]

GET /{model}/schema

Example request:

GET /en/schema
| Name | Type | Description |
| --- | --- | --- |
| model | string | identifier string for a model installed on the server |

Example response:

{
  "dep_types": ["ROOT", "nsubj"],
  "ent_types": ["PERSON", "LOC", "ORG"],
  "pos_types": ["NN", "VBZ", "SP"]
}

GET /version

Shows the spaCy version in use.

Example request:

GET /version

Example response:

{
  "spacy": "2.2.4"
}

Contributors

afshinm, avinashrubird, bastienbot, dbkaplun, dparlevliet, epugh, ines, jgontrum, matityahul, matthewarmand, mjfox3


spacy-api-docker's Issues

Errors while running jgontrum/spacyapi:en_v2

Thank you for this project. Have been using v1 with much success.

Tried running jgontrum/spacyapi:en_v2 on both a MacBook Pro 10.14 & a fresh Ubuntu setup & got the same error output:

2018-11-03 17:59:52,943 INFO Included extra file "/etc/supervisor/conf.d/supervisor.conf" during parsing
2018-11-03 17:59:52,955 INFO RPC interface 'supervisor' initialized
2018-11-03 17:59:52,955 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2018-11-03 17:59:52,955 INFO supervisord started with pid 7
2018-11-03 17:59:53,958 INFO spawned: 'nginx' with pid 10
2018-11-03 17:59:53,960 INFO spawned: 'api' with pid 11
2018-11-03 17:59:53,962 INFO spawned: 'frontend' with pid 12
2018-11-03 17:59:54,573 INFO exited: frontend (exit status 2; not expected)
2018-11-03 17:59:55,574 INFO success: nginx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-11-03 17:59:55,575 INFO success: api entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2018-11-03 17:59:55,577 INFO spawned: 'frontend' with pid 27
2018-11-03 17:59:56,145 INFO exited: frontend (exit status 2; not expected)
2018-11-03 17:59:58,150 INFO spawned: 'frontend' with pid 35
2018-11-03 17:59:58,782 INFO exited: frontend (exit status 2; not expected)
2018-11-03 18:00:01,788 INFO spawned: 'frontend' with pid 43
2018-11-03 18:00:02,495 INFO exited: frontend (exit status 2; not expected)
2018-11-03 18:00:02,496 INFO gave up: frontend entered FATAL state, too many start retries too quickly

Please let me know what other information would be helpful.

Greek language support

Hi,

Is it planned to add greek language support for jgontrum/spacyapi:all_v2 ?

Best
Alexander

French model not working

docker run -p "127.0.0.1:8080:80" jgontrum/spacyapi:fr

http://localhost:8080/ui

UI loads correctly, however any requests from the sentence entry simply do not return. The loading icon spins and nothing else happens.

I have tried en and en_v2. Both work as they should.

When testing this no other containers were running.

edit: curl commands fail as well.

curl -s localhost:80/dep -d '{"text":"Bonjour Justin Trudeau.", "model":"fr"}'

{"title":"Dependency parsing failed","description":"'NoneType' object is not callable"}

Use different ports behind nginx

Hi, I'm trying to run this on Google Cloud Run, which requires listening on port 8080. Right now this port is used internally, so it doesn't run when trying to use PORT=8080. Would it be possible to move the internal ports into a less frequently used port range (e.g. 32543)?

jgontrum/spacyapi:en_v2 | Dependency parsing failed: Can't find model 'en'

For some reason, the following:

{
  'model': 'en', 
  'collapse_phrases': True, 
  'text': 'I paid him $100,- for nothing!', 
  'collapse_punctuation': False
}

Gives me

Dependency parsing failed: Can't find model 'en'

from the server.

I am running

docker run -p "127.0.0.1:8082:80" jgontrum/spacyapi:en_v2

Do I have to set another model name? I already tried en_v2.

Example curl doesn't work

 $ curl http://localhost:5000/api --header 'content-type: application/json' --data '{text: "This is a text that I want to be analyzed."}' -X POST
{"message": "The browser (or proxy) sent a request that this server could not understand."}

I assume you meant something like this instead, which does work:

 $ curl http://localhost:5000/api --header 'Content-Type: application/json' --data '{"text": "This is a text that I want to be analyzed."}' -X POST
{"numOfSentences": 1, "lang": "en", "performance": [0.18333840370178223], "sentences": [[{"pos": "DET", "email": false, "ner": "", "stop": true, "token": ...

The REST API returns JSON payload with Content-Type "text/string"

This makes some clients need to bend over backwards to process the results. The fix for this seems to be fairly easy, just change the HUG annotations, for example:
@hug.post("/dep", output=hug.output_format.json)
This would be, of course, a change in the spacy-services repository, but I can't open an issue there.

POS Tree

Would it be possible to return the POS TREE next to the token?

Similarity route?

Maybe I missed it but it looks like there is currently no implementation for spaCy's semantic similarity method.

Is this related to the fact that most of the models used are the "small" version, which generally do not perform similarity calculations well?

Upgrade to Falcon 3 for CORS

Hello,
I needed to authorize CORS on the Docker API, and generate an updated Docker image.
To solve it, I upgraded the falcon dependency to 3.0.1, in order to use simple CORS configuration here on line 1 :

APP = falcon.API(cors_enable=True)
APP.add_route('/dep', DepResource())
APP.add_route('/ent', EntResource())
APP.add_route('/sents', SentsResources())
APP.add_route('/sents_dep', SentsDepResources())
APP.add_route('/{model_name}/schema', SchemaResource())
APP.add_route('/models', ModelsResource())
APP.add_route('/version', VersionResource())

If you want me to integrate these changes in your repo, keep me posted ;)

No output for curl command

Hi All,

I cloned the repo and ran the cli:

Prabuddhs-MacBook-Air:spacy-api-docker pg$ docker run -p "127.0.0.1:8080:80" jgontrum/spacyapi:en
/usr/lib/python2.7/dist-packages/supervisor/options.py:296: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
'Supervisord is running as root and it is searching '
2017-10-20 15:46:48,043 CRIT Supervisor running as root (no user in config file)
2017-10-20 15:46:48,043 WARN Included extra file "/etc/supervisor/conf.d/supervisor.conf" during parsing
2017-10-20 15:46:48,070 INFO RPC interface 'supervisor' initialized
2017-10-20 15:46:48,070 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2017-10-20 15:46:48,072 INFO supervisord started with pid 1
2017-10-20 15:46:49,081 INFO spawned: 'nginx' with pid 7
2017-10-20 15:46:49,087 INFO spawned: 'api' with pid 8
2017-10-20 15:46:49,093 INFO spawned: 'frontend' with pid 9
2017-10-20 15:46:50,149 INFO success: nginx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-10-20 15:46:50,150 INFO success: api entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2017-10-20 15:46:50,150 INFO success: frontend entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)

when I run the curl command:
curl -s localhost:8000/ent -d '{"text":"Pastafarians are smarter than people with Coca Cola bottles.", "model":"en"}'

I don't get any output.

How can I download en_core_web_lg?

Hey Johannes!

Thanks for this awesome work here!

I was testing the docker image for English out and noticed that it by default downloads en_core_web_sm.

I tried to get it to download en_core_web_lg through various methods, including by changing the code as follows and passing the ENV variable languages to be en_core_web_lg-2.1.0 -
https://github.com/nitinthewiz/spacy-api-docker/blob/master/displacy_service/scripts/download.py#L10

But this process fails, because the API server doesn't start, as it complains about spacy.load("en_core_web_lg-2.1.0") failing.

spacy.load seems to expect "en" only, and doesn't work with anything else.

Could you tell me if there's a way to convince the docker image to run with en_core_web_lg?

Thanks a lot!

/dep

Hello: first, thank you for making this available as a docker image!

I pulled en_v2, and /ui works but not /dep - I pulled en (v1) and /ui and /dep works.
I don't know if that is documented anywhere?

-David

Adding a new entry point for POS tagger only output

Hi,
I'm thinking in adding a new entry point /tag to retrieve a result of the POS tagging of a document with detailed output for each token.
My basic idea is to accept a json request with the following body:

{
text : "text",
model: "model",
include_sentences : true|false, #include a sentence level or not in the output
attr_filter : [ ] #list of token attributes to include in the output, like ["lemma", "pos", ... "is_stop", ...]
}

The output could be a list of tokens like:
[ { text : "text", start : 111, end : 222, lemma : "lemma", ... } , {}, .. , ]

optionally with an additional sentence level like:

[
 { text : "sentence text", start : 0, end : 100000, tokens : [ {}, ... {}] },
...
]

What do you think? I need something like that to use spaCy from a Java program.

Best regards

Olivier

Custom model?

New to spaCy. Can you describe how one might add their own pre-trained model to the Docker package?

text classification rest api

Hi, it looks like this spaCy API is close to what I need, but I want to create a Docker spaCy API, pre-trained, to analyse a text string and categorize it. Is this something I can do with this project? If not, any advice? Thanks

problem running legacy version

So I am running your legacy one (since according to the docs has more features, like LEMMA etc), but I am getting stuck:
docker run --env PORT=5050 jgontrum/spacyapi

But I can't hit it with curl like the docs here say

I tried without the port number, and with (as above) and I just get failed to connect to port

What am I missing?

Sentence Boundary Detection?

First, thank you for creating this REST API capabile Docker container for Spacy. I have set up a Spacy server using it and I am successfully able to get parse trees for sentences I submit over the REST API.

I would like to be able to do sentence boundary detection too. Is there a way to use the Docker container to do that? If not, how hard would it be for me to enhance the REST API to be able to do that too? I'm an experienced C/C++ and JavaScript programmer of many years, with about a year of Python experience too.

Which license?

Hi Johannes,

could you please add a license to this repo?

Best
Arne

Frontend and API exit with status code 2

Hi! First of all, thank you for containerizing all the spacy models. It's been very helpful for us.

I am trying to run the container on Google Cloud Run. Basically Cloud Run prefers listening on the port 8080 and as the container is using nginx i have replaced the default.conf file to make nginx expose port 8080 as given on the link - https://stackoverflow.com/questions/47364019/how-to-change-the-port-of-nginx-when-using-with-docker#:~:text=If%20you%20want%20to%20change,conf%20file%20inside%20the%20container.&text=navigating%20to%20localhost%3A3333%20in,to%20include%20the%20default%20nginx.

** Dockerfile **

FROM jgontrum/spacyapi:base_v2
RUN pip install wheel
ENV languages "en_core_web_lg"
RUN cd /app && env/bin/download_models
COPY default.conf /etc/nginx/conf.d/
COPY nginx.conf /etc/nginx/
EXPOSE 8080

The problem is that it gives a 502 error on hitting any api. Here are the logs:

/usr/lib/python2.7/dist-packages/supervisor/options.py:461: UserWarning: Supervisord is running as root and it is searching for its configuration file in default locations (including its current working directory); you probably want to specify a "-c" argument specifying an absolute path to a configuration file for improved security.
'Supervisord is running as root and it is searching '
2020-06-25 22:20:43,107 CRIT Supervisor is running as root. Privileges were not dropped because no user is specified in the config file. If you intend to run as root, you can set user=root in the config file to avoid this message.
2020-06-25 22:20:43,107 INFO Included extra file "/etc/supervisor/conf.d/supervisor.conf" during parsing
2020-06-25 22:20:43,132 INFO RPC interface 'supervisor' initialized
2020-06-25 22:20:43,132 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2020-06-25 22:20:43,133 INFO supervisord started with pid 7
2020-06-25 22:20:44,136 INFO spawned: 'nginx' with pid 10
2020-06-25 22:20:44,142 INFO spawned: 'api' with pid 11
2020-06-25 22:20:44,145 INFO spawned: 'frontend' with pid 12
2020-06-25 22:20:45,333 INFO success: nginx entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2020-06-25 22:20:45,334 INFO success: api entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2020-06-25 22:20:45,334 INFO success: frontend entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2020-06-25 22:20:49,475 INFO exited: frontend (exit status 2; expected)
2020-06-25 22:21:34,723 INFO exited: api (exit status 2; expected)

I can't figure out what is the problem here. Any help is appreciated.

The docker documentation is incorrect and confusing

First, I love how you wrapped up spacy in a docker container.

But the docker hub page is out of date
https://hub.docker.com/r/jgontrum/spacyapi/

  1. "Updated to spaCy 1.2.0" - on GitHub the version is 2.0.16. There is no mention of v2.
  2. Usage of port 5000 -
    "docker run --name spacyapi -d -p 127.0.0.1:5000:5000 jgontrum/spacyapi:en"
    &
    "curl http://localhost:5000/api --header 'content-type: application/json' --data '{text: "This is a text that I want to be analyzed."}' -X POST"

But in the new docker image it works on port 80 (internally):
"docker run -p "127.0.0.1:8080:80" jgontrum/spacyapi:en_v2"
&
"curl -s localhost:8000/dep -d '{"text":"Pastafarians are smarter than people with Coca Cola bottles.", "model":"en"}'"

(btw: even on the GitHub page, the usage of port 8000 is confusing to the unsuspecting user, since the run command maps external port 8080)

  3. The API has completely changed

Please update the docker hub page to reflect the same info as the github one

Spacy 2.3.0/3.x support

Hi,

my company is currently doing some lightweight text processing using spacy, it is a really great library for NLP tasks :-) .
We would like to integrate it as a standalone service in our application, using your dockerized application seems to be the right solution for us.
Do you have a time estimate when spacy 2.3.0 (or spacy 3.x) will be available as a docker image?

Best
Alexander

Is it possible to train data?

Is it possible to use the spacy-api-docker image also to train data? For example I want to send training data for custom entities like:

TRAIN_DATA = [
    (
        "Horses are too tall and they pretend to care about your feelings",
        {"entities": [(0, 6, MY_ANIMAL)]},
    ),
    ("Do they bite?", {"entities": []}),
    (
        "horses are too tall and they pretend to care about your feelings",
        {"entities": [(0, 6, MY_ANIMAL)]},
    ),
]

If you get an SSL error on data download, here's a fixed command

I've had to set up the model hosting stuff afresh, and there are a couple of teething problems, seemingly around SSL. If you have issues with the python -m spacy.en.download command, this should work:

sputnik --name spacy --repository-url http://index.spacy.io install en==1.1.0

Thanks for publishing this! I've wanted a Docker container for a while, but haven't used Docker, so I never got around to setting it up.

"Schema construction failed" when getting schema on model "en_v2"

  1. Create file docker-compose.yml:

     version: "2"

     services:
       spacyapi:
         image: jgontrum/spacyapi:en_v2
         ports:
           - "127.0.0.1:8080:80"
         restart: always

  2. Execute docker-compose up
  3. Visit GET /en/schema

Result:

{
    "title": "Schema construction failed",
    "description": "'NoneType' object is not subscriptable"
}

ISSUE: Docker image starts and then stops after a few seconds

I followed the instructions and pulled the docker image then ran the container

The container starts to run then stops after a few seconds
The only message visible on the container logs is: "Killed"

Other containers are working just fine.

Running on Windows 7 pro, 64 bit, Oracle VM Virtual Box 5.1.22

Docker version:
Client:
Version: 17.05.0-ce
API version: 1.29
Go version: go1.7.5
Git commit: 89658be
Built: Fri May 5 15:36:11 2017
OS/Arch: windows/amd64

Server:
Version: 17.05.0-ce
API version: 1.29 (minimum version 1.12)
Go version: go1.7.5
Git commit: 89658be
Built: Thu May 4 21:43:09 2017
OS/Arch: linux/amd64
Experimental: false

Some requests hang forever

I have deployed jgontrum/spacyapi:en_v2 in Kubernetes; however, I am failing to execute POST requests even from inside the container:
e.g. curl -s localhost:80/models returns ['en']
the same works for curl -s localhost:8000/models
but when I try curl -s localhost:80/dep -d '{"text":"Pastafarians are smarter than people with Coca Colabottles."}' I get a timeout from nginx:

504 Gateway Time-out (nginx/1.10.3)

The same request sent to port 8000 hangs forever.
