{"id":13753661,"url":"https://github.com/jgontrum/spacy-api-docker","last_synced_at":"2025-04-06T03:11:33.410Z","repository":{"id":49966251,"uuid":"65741668","full_name":"jgontrum/spacy-api-docker","owner":"jgontrum","description":"spaCy REST API, wrapped in a Docker container.","archived":false,"fork":false,"pushed_at":"2023-01-11T22:26:17.000Z","size":365,"stargazers_count":267,"open_issues_count":25,"forks_count":99,"subscribers_count":14,"default_branch":"master","last_synced_at":"2025-03-30T02:09:28.059Z","etag":null,"topics":["docker","microservice","natural-language-processing","parsing","restful-api","spacy"],"latest_commit_sha":null,"homepage":"https://hub.docker.com/r/jgontrum/spacyapi/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jgontrum.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-08-15T15:11:01.000Z","updated_at":"2025-03-22T13:11:54.000Z","dependencies_parsed_at":"2023-02-09T08:31:30.130Z","dependency_job_id":null,"html_url":"https://github.com/jgontrum/spacy-api-docker","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jgontrum%2Fspacy-api-docker","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jgontrum%2Fspacy-api-docker/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jgontrum%2Fspacy-api-docker/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jgontrum%2Fspacy-api-docker/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jgontrum","dow
nload_url":"https://codeload.github.com/jgontrum/spacy-api-docker/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247427006,"owners_count":20937213,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["docker","microservice","natural-language-processing","parsing","restful-api","spacy"],"created_at":"2024-08-03T09:01:26.817Z","updated_at":"2025-04-06T03:11:33.390Z","avatar_url":"https://github.com/jgontrum.png","language":"Python","readme":"# spaCy API Docker\n\n**Ready-to-use Docker images for the [spaCy NLP library](https://github.com/explosion/spaCy).**\n\n---\n**[spaCy API Docker](https://github.com/jgontrum/spacy-api-docker) is being sponsored by the following tool; please help to support us by taking a look and signing up to a free trial**\n\n\n[\u003cimg src=\"https://images.gitads.io/spacy-api-docker\" alt=\"GitAds\"/\u003e](https://tracking.gitads.io/?repo=spacy-api-docker)\n---\n\n### Features\n\n- Use the awesome spaCy NLP framework with other programming languages.\n- Better scaling: One NLP - multiple services.\n- Build using the official [spaCy REST services](https://github.com/explosion/spacy-services).\n- Dependency parsing visualisation with [displaCy](https://demos.explosion.ai/displacy/).\n- Docker images for **English**, **German**, **Spanish**, **Italian**, **Dutch** and **French**.\n- Automated builds to stay up to date with spaCy.\n- Current spaCy version: 2.0.16\n\nPlease note that this is a completely new API and is incompatible with the previous one. 
If you still need them, use `jgontrum/spacyapi:en-legacy` or `jgontrum/spacyapi:de-legacy`.\n\n_Documentation, API- and frontend code based upon [spaCy REST services](https://github.com/explosion/spacy-services) by [Explosion AI](https://explosion.ai)._\n\n---\n\n## Images\n\n| Image                       | Description                                                       |\n| --------------------------- | ----------------------------------------------------------------- |\n| jgontrum/spacyapi:base_v2   | Base image for spaCy 2.0, containing no language model            |\n| jgontrum/spacyapi:en_v2     | English language model, spaCy 2.0                                 |\n| jgontrum/spacyapi:de_v2     | German language model, spaCy 2.0                                  |\n| jgontrum/spacyapi:es_v2     | Spanish language model, spaCy 2.0                                 |\n| jgontrum/spacyapi:fr_v2     | French language model, spaCy 2.0                                  |\n| jgontrum/spacyapi:pt_v2     | Portuguese language model, spaCy 2.0                              |\n| jgontrum/spacyapi:it_v2     | Italian language model, spaCy 2.0                                 |\n| jgontrum/spacyapi:nl_v2     | Dutch language model, spaCy 2.0                                   |\n| jgontrum/spacyapi:all_v2    | Contains EN, DE, ES, PT, NL, IT and FR language models, spaCy 2.0 |\n| _OLD RELEASES_              |                                                                   |\n| jgontrum/spacyapi:base      | Base image, containing no language model                          |\n| jgontrum/spacyapi:latest    | English language model                                            |\n| jgontrum/spacyapi:en        | English language model                                            |\n| jgontrum/spacyapi:de        | German language model                                             |\n| jgontrum/spacyapi:es        | Spanish language model                                            |\n| 
jgontrum/spacyapi:fr        | French language model                                             |\n| jgontrum/spacyapi:all       | Contains EN, DE, ES and FR language models                        |\n| jgontrum/spacyapi:en-legacy | Old API with English model                                        |\n| jgontrum/spacyapi:de-legacy | Old API with German model                                         |\n\n---\n\n## Usage\n\n`docker run -p \"127.0.0.1:8080:80\" jgontrum/spacyapi:en_v2`\n\nAll models are loaded at startup. Depending on the model size and server\nperformance, this can take a few minutes.\n\nThe displaCy frontend is available at `/ui`.\n\n### Docker Compose\n\n```yaml\nversion: '2'\n\nservices:\n  spacyapi:\n    image: jgontrum/spacyapi:en_v2\n    ports:\n      - \"127.0.0.1:8080:80\"\n    restart: always\n```\n\n### Running Tests\n\n`pytest` is included in the image so that the unit tests can be run locally:\n\n`docker run -it jgontrum/spacyapi:en_v2 app/env/bin/pytest app/displacy_service_tests`\n\n### Special Cases\n\nThe API includes rudimentary support for specifying [special cases](https://spacy.io/usage/linguistic-features#special-cases)\nfor your deployment. Currently only basic special cases are supported; for example, in spaCy parlance:\n\n```python\ntokenizer.add_special_case(\"isn't\", [{ORTH: \"isn't\"}])\n```\n\nThey can be supplied in an environment variable corresponding to the desired language model, for example `en_special_cases`\nor `en_core_web_lg_special_cases`. 
They are configured as a single comma-delimited string, such as `\"isn't,doesn't,won't\"`.\n\nUse the following syntax to specify basic special case rules, such as for preserving contractions:\n\n`docker run -p \"127.0.0.1:8080:80\" -e en_special_cases=\"isn't,doesn't\" jgontrum/spacyapi:en_v2`\n\nYou can also configure this in a `.env` file if using `docker-compose` as above.\n\n---\n\n## REST API Documentation\n\n### `GET` `/ui/`\n\ndisplaCy frontend is available here.\n\n---\n\n### `POST` `/dep`\n\nExample request:\n\n```json\n{\n  \"text\": \"They ate the pizza with anchovies\",\n  \"model\": \"en\",\n  \"collapse_punctuation\": 0,\n  \"collapse_phrases\": 1\n}\n```\n\n| Name                   | Type    | Description                                              |\n| ---------------------- | ------- | -------------------------------------------------------- |\n| `text`                 | string  | text to be parsed                                        |\n| `model`                | string  | identifier string for a model installed on the server    |\n| `collapse_punctuation` | boolean | Merge punctuation onto the preceding token?              |\n| `collapse_phrases`     | boolean | Merge noun chunks and named entities into single tokens? 
|\n\nExample request using the Python [Requests library](http://docs.python-requests.org/en/master/):\n\n```python\nimport json\nimport requests\n\nurl = \"http://localhost:8000/dep\"\nmessage_text = \"They ate the pizza with anchovies\"\nheaders = {'content-type': 'application/json'}\nd = {'text': message_text, 'model': 'en'}\n\nresponse = requests.post(url, data=json.dumps(d), headers=headers)\nr = response.json()\n```\n\nExample response:\n\n```json\n{\n  \"arcs\": [\n    { \"dir\": \"left\", \"start\": 0, \"end\": 1, \"label\": \"nsubj\" },\n    { \"dir\": \"right\", \"start\": 1, \"end\": 2, \"label\": \"dobj\" },\n    { \"dir\": \"right\", \"start\": 1, \"end\": 3, \"label\": \"prep\" },\n    { \"dir\": \"right\", \"start\": 3, \"end\": 4, \"label\": \"pobj\" },\n    { \"dir\": \"left\", \"start\": 2, \"end\": 3, \"label\": \"prep\" }\n  ],\n  \"words\": [\n    { \"tag\": \"PRP\", \"text\": \"They\" },\n    { \"tag\": \"VBD\", \"text\": \"ate\" },\n    { \"tag\": \"NN\", \"text\": \"the pizza\" },\n    { \"tag\": \"IN\", \"text\": \"with\" },\n    { \"tag\": \"NNS\", \"text\": \"anchovies\" }\n  ]\n}\n```\n\n| Name    | Type    | Description                                |\n| ------- | ------- | ------------------------------------------ |\n| `arcs`  | array   | data to generate the arrows                |\n| `dir`   | string  | direction of arrow (`\"left\"` or `\"right\"`) |\n| `start` | integer | offset of word the arrow starts **on**     |\n| `end`   | integer | offset of word the arrow ends **on**       |\n| `label` | string  | dependency label                           |\n| `words` | array   | data to generate the words                 |\n| `tag`   | string  | part-of-speech tag                         |\n| `text`  | string  | token                                      |\n\n---\n\nCurl command:\n\n```\ncurl -s localhost:8000/dep -d '{\"text\":\"Pastafarians are smarter than people with Coca Cola bottles.\", \"model\":\"en\"}'\n```\n\n```json\n{\n  
\"arcs\": [\n    {\n      \"dir\": \"left\",\n      \"end\": 1,\n      \"label\": \"nsubj\",\n      \"start\": 0\n    },\n    {\n      \"dir\": \"right\",\n      \"end\": 2,\n      \"label\": \"acomp\",\n      \"start\": 1\n    },\n    {\n      \"dir\": \"right\",\n      \"end\": 3,\n      \"label\": \"prep\",\n      \"start\": 2\n    },\n    {\n      \"dir\": \"right\",\n      \"end\": 4,\n      \"label\": \"pobj\",\n      \"start\": 3\n    },\n    {\n      \"dir\": \"right\",\n      \"end\": 5,\n      \"label\": \"prep\",\n      \"start\": 4\n    },\n    {\n      \"dir\": \"right\",\n      \"end\": 6,\n      \"label\": \"pobj\",\n      \"start\": 5\n    }\n  ],\n  \"words\": [\n    {\n      \"tag\": \"NNPS\",\n      \"text\": \"Pastafarians\"\n    },\n    {\n      \"tag\": \"VBP\",\n      \"text\": \"are\"\n    },\n    {\n      \"tag\": \"JJR\",\n      \"text\": \"smarter\"\n    },\n    {\n      \"tag\": \"IN\",\n      \"text\": \"than\"\n    },\n    {\n      \"tag\": \"NNS\",\n      \"text\": \"people\"\n    },\n    {\n      \"tag\": \"IN\",\n      \"text\": \"with\"\n    },\n    {\n      \"tag\": \"NNS\",\n      \"text\": \"Coca Cola bottles.\"\n    }\n  ]\n}\n```\n\n---\n\n### `POST` `/ent`\n\nExample request:\n\n```json\n{\n  \"text\": \"When Sebastian Thrun started working on self-driving cars at Google in 2007, few people outside of the company took him seriously.\",\n  \"model\": \"en\"\n}\n```\n\n| Name    | Type   | Description                                           |\n| ------- | ------ | ----------------------------------------------------- |\n| `text`  | string | text to be parsed                                     |\n| `model` | string | identifier string for a model installed on the server |\n\nExample request using the Python [Requests library](http://docs.python-requests.org/en/master/):\n\n```python\nimport json\nimport requests\n\nurl = \"http://localhost:8000/ent\"\nmessage_text = \"When Sebastian Thrun started working on self-driving cars 
at Google in 2007, few people outside of the company took him seriously.\"\nheaders = {'content-type': 'application/json'}\nd = {'text': message_text, 'model': 'en'}\n\nresponse = requests.post(url, data=json.dumps(d), headers=headers)\nr = response.json()\n```\n\nExample response:\n\n```json\n[\n  { \"end\": 20, \"start\": 5, \"type\": \"PERSON\" },\n  { \"end\": 67, \"start\": 61, \"type\": \"ORG\" },\n  { \"end\": 75, \"start\": 71, \"type\": \"DATE\" }\n]\n```\n\n| Name    | Type    | Description                                |\n| ------- | ------- | ------------------------------------------ |\n| `end`   | integer | character offset the entity ends **after** |\n| `start` | integer | character offset the entity starts **on**  |\n| `type`  | string  | entity type                                |\n\n```\ncurl -s localhost:8000/ent -d '{\"text\":\"Pastafarians are smarter than people with Coca Cola bottles.\", \"model\":\"en\"}'\n```\n\n```json\n[\n  {\n    \"end\": 12,\n    \"start\": 0,\n    \"text\": \"Pastafarians\",\n    \"type\": \"NORP\"\n  },\n  {\n    \"end\": 51,\n    \"start\": 42,\n    \"text\": \"Coca Cola\",\n    \"type\": \"ORG\"\n  }\n]\n```\n\n---\n\n### `POST` `/sents`\n\nExample request:\n\n```json\n{\n  \"text\": \"In 2012 I was a mediocre developer. But today I am at least a bit better.\",\n  \"model\": \"en\"\n}\n```\n\n| Name    | Type   | Description                                           |\n| ------- | ------ | ----------------------------------------------------- |\n| `text`  | string | text to be parsed                                     |\n| `model` | string | identifier string for a model installed on the server |\n\nExample request using the Python [Requests library](http://docs.python-requests.org/en/master/):\n\n```python\nimport json\nimport requests\n\nurl = \"http://localhost:8000/sents\"\nmessage_text = \"In 2012 I was a mediocre developer. 
But today I am at least a bit better.\"\nheaders = {'content-type': 'application/json'}\nd = {'text': message_text, 'model': 'en'}\n\nresponse = requests.post(url, data=json.dumps(d), headers=headers)\nr = response.json()\n```\n\nExample response:\n\n```json\n[\"In 2012 I was a mediocre developer.\", \"But today I am at least a bit better.\"]\n```\n\n---\n\n### `POST` `/sents_dep`\n\nCombines `/sents` and `/dep`: returns each sentence together with its dependency parse.\n\nExample request:\n\n```json\n{\n  \"text\": \"In 2012 I was a mediocre developer. But today I am at least a bit better.\",\n  \"model\": \"en\"\n}\n```\n\n| Name    | Type   | Description                                           |\n| ------- | ------ | ----------------------------------------------------- |\n| `text`  | string | text to be parsed                                     |\n| `model` | string | identifier string for a model installed on the server |\n\nExample request using the Python [Requests library](http://docs.python-requests.org/en/master/):\n\n```python\nimport json\nimport requests\n\nurl = \"http://localhost:8000/sents_dep\"\nmessage_text = \"In 2012 I was a mediocre developer. 
But today I am at least a bit better.\"\nheaders = {'content-type': 'application/json'}\nd = {'text': message_text, 'model': 'en'}\n\nresponse = requests.post(url, data=json.dumps(d), headers=headers)\nr = response.json()\n```\n\nExample response:\n\n```json\n[\n  {\n    \"sentence\": \"In 2012 I was a mediocre developer.\",\n    \"dep_parse\": {\n      \"arcs\": [\n        {\n          \"dir\": \"left\",\n          \"end\": 3,\n          \"label\": \"prep\",\n          \"start\": 0,\n          \"text\": \"In\"\n        },\n        {\n          \"dir\": \"right\",\n          \"end\": 1,\n          \"label\": \"pobj\",\n          \"start\": 0,\n          \"text\": \"2012\"\n        },\n        {\n          \"dir\": \"left\",\n          \"end\": 3,\n          \"label\": \"nsubj\",\n          \"start\": 2,\n          \"text\": \"I\"\n        },\n        {\n          \"dir\": \"left\",\n          \"end\": 6,\n          \"label\": \"det\",\n          \"start\": 4,\n          \"text\": \"a\"\n        },\n        {\n          \"dir\": \"left\",\n          \"end\": 6,\n          \"label\": \"amod\",\n          \"start\": 5,\n          \"text\": \"mediocre\"\n        },\n        {\n          \"dir\": \"right\",\n          \"end\": 6,\n          \"label\": \"attr\",\n          \"start\": 3,\n          \"text\": \"developer\"\n        },\n        {\n          \"dir\": \"right\",\n          \"end\": 7,\n          \"label\": \"punct\",\n          \"start\": 3,\n          \"text\": \".\"\n        }\n      ],\n      \"words\": [\n        {\n          \"tag\": \"IN\",\n          \"text\": \"In\"\n        },\n        {\n          \"tag\": \"CD\",\n          \"text\": \"2012\"\n        },\n        {\n          \"tag\": \"PRP\",\n          \"text\": \"I\"\n        },\n        {\n          \"tag\": \"VBD\",\n          \"text\": \"was\"\n        },\n        {\n          \"tag\": \"DT\",\n          \"text\": \"a\"\n        },\n        {\n          \"tag\": \"JJ\",\n          \"text\": 
\"mediocre\"\n        },\n        {\n          \"tag\": \"NN\",\n          \"text\": \"developer\"\n        },\n        {\n          \"tag\": \".\",\n          \"text\": \".\"\n        }\n      ]\n    }\n  },\n  {\n    \"sentence\": \"But today I am at least a bit better.\",\n    \"dep_parse\": {\n      \"arcs\": [\n        {\n          \"dir\": \"left\",\n          \"end\": 11,\n          \"label\": \"cc\",\n          \"start\": 8,\n          \"text\": \"But\"\n        },\n        {\n          \"dir\": \"left\",\n          \"end\": 11,\n          \"label\": \"npadvmod\",\n          \"start\": 9,\n          \"text\": \"today\"\n        },\n        {\n          \"dir\": \"left\",\n          \"end\": 11,\n          \"label\": \"nsubj\",\n          \"start\": 10,\n          \"text\": \"I\"\n        },\n        {\n          \"dir\": \"left\",\n          \"end\": 13,\n          \"label\": \"advmod\",\n          \"start\": 12,\n          \"text\": \"at\"\n        },\n        {\n          \"dir\": \"left\",\n          \"end\": 15,\n          \"label\": \"advmod\",\n          \"start\": 13,\n          \"text\": \"least\"\n        },\n        {\n          \"dir\": \"left\",\n          \"end\": 15,\n          \"label\": \"det\",\n          \"start\": 14,\n          \"text\": \"a\"\n        },\n        {\n          \"dir\": \"left\",\n          \"end\": 16,\n          \"label\": \"npadvmod\",\n          \"start\": 15,\n          \"text\": \"bit\"\n        },\n        {\n          \"dir\": \"right\",\n          \"end\": 16,\n          \"label\": \"acomp\",\n          \"start\": 11,\n          \"text\": \"better\"\n        },\n        {\n          \"dir\": \"right\",\n          \"end\": 17,\n          \"label\": \"punct\",\n          \"start\": 11,\n          \"text\": \".\"\n        }\n      ],\n      \"words\": [\n        {\n          \"tag\": \"CC\",\n          \"text\": \"But\"\n        },\n        {\n          \"tag\": \"NN\",\n          \"text\": \"today\"\n        },\n   
     {\n          \"tag\": \"PRP\",\n          \"text\": \"I\"\n        },\n        {\n          \"tag\": \"VBP\",\n          \"text\": \"am\"\n        },\n        {\n          \"tag\": \"IN\",\n          \"text\": \"at\"\n        },\n        {\n          \"tag\": \"JJS\",\n          \"text\": \"least\"\n        },\n        {\n          \"tag\": \"DT\",\n          \"text\": \"a\"\n        },\n        {\n          \"tag\": \"NN\",\n          \"text\": \"bit\"\n        },\n        {\n          \"tag\": \"RBR\",\n          \"text\": \"better\"\n        },\n        {\n          \"tag\": \".\",\n          \"text\": \".\"\n        }\n      ]\n    }\n  }\n]\n```\n\n### `GET` `/models`\n\nList the names of models installed on the server.\n\nExample request:\n\n```\nGET /models\n```\n\nExample response:\n\n```json\n[\"en\", \"de\"]\n```\n\n---\n\n### `GET` `/{model}/schema`\n\nExample request:\n\n```\nGET /en/schema\n```\n\n| Name    | Type   | Description                                           |\n| ------- | ------ | ----------------------------------------------------- |\n| `model` | string | identifier string for a model installed on the server |\n\nExample response:\n\n```json\n{\n  \"dep_types\": [\"ROOT\", \"nsubj\"],\n  \"ent_types\": [\"PERSON\", \"LOC\", \"ORG\"],\n  \"pos_types\": [\"NN\", \"VBZ\", \"SP\"]\n}\n```\n\n---\n\n### `GET` `/version`\n\nShow the used spaCy version.\n\nExample request:\n\n```\nGET /version\n```\n\nExample response:\n\n```json\n{\n  \"spacy\": \"2.2.4\"\n}\n```\n","funding_links":[],"categories":["Python","restful-api"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjgontrum%2Fspacy-api-docker","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjgontrum%2Fspacy-api-docker","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjgontrum%2Fspacy-api-docker/lists"}