{"id":13467979,"url":"https://github.com/explosion/spacy-models","last_synced_at":"2025-05-14T11:11:42.942Z","repository":{"id":18804256,"uuid":"84940268","full_name":"explosion/spacy-models","owner":"explosion","description":"💫  Models for the spaCy Natural Language Processing (NLP) library","archived":false,"fork":false,"pushed_at":"2024-09-30T10:06:39.000Z","size":7120,"stargazers_count":1725,"open_issues_count":0,"forks_count":305,"subscribers_count":45,"default_branch":"master","last_synced_at":"2025-04-09T22:09:18.797Z","etag":null,"topics":["machine-learning","machine-learning-models","models","natural-language-processing","nlp","spacy","spacy-models","statistical-models"],"latest_commit_sha":null,"homepage":"https://spacy.io","language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/explosion.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"custom":"https://explosion.ai/merch"}},"created_at":"2017-03-14T11:15:20.000Z","updated_at":"2025-04-09T02:14:00.000Z","dependencies_parsed_at":"2023-02-12T22:01:17.736Z","dependency_job_id":"9a6d3c4d-bf5d-4e27-afc6-cc6b78a474d8","html_url":"https://github.com/explosion/spacy-models","commit_stats":{"total_commits":1988,"total_committers":16,"mean_commits":124.25,"dds":0.2600603621730382,"last_synced_commit":"6ddddbe1252d43ff1f485c4737441eede8cdd03d"},"previous_names":[],"tags_count":1099,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/explosion%2Fspacy-models","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/explosion%2Fspacy-models/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/explosion%2Fspacy-models/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/explosion%2Fspacy-models/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/explosion","download_url":"https://codeload.github.com/explosion/spacy-models/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248119294,"owners_count":21050755,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["machine-learning","machine-learning-models","models","natural-language-processing","nlp","spacy","spacy-models","statistical-models"],"created_at":"2024-07-31T15:01:03.598Z","updated_at":"2025-04-09T22:10:12.879Z","avatar_url":"https://github.com/explosion.png","language":"Python","readme":"\u003ca href=\"https://explosion.ai\"\u003e\u003cimg src=\"https://explosion.ai/assets/img/logo.svg\" width=\"125\" height=\"125\" align=\"right\" /\u003e\u003c/a\u003e\n\n# spaCy models\n\nThis repository contains\n[releases](https://github.com/explosion/spacy-models/releases) of models for\nthe [spaCy](https://github.com/explosion/spaCy) NLP library. For more info on\nhow to download, install and use the models, see the [models\ndocumentation](https://spacy.io/usage/models).\n\n\u003e **⚠️ Important note:** Because the models can be very large and consist mostly\n\u003e of binary data, we can't simply provide them as files in a GitHub repository.\n\u003e Instead, we've opted for adding them to\n\u003e [releases](https://github.com/explosion/spacy-models/releases) as `.whl` and\n\u003e `.tar.gz` files. This allows us to still maintain a public release history.\n\n## Quickstart\n\nTo install a specific model, run the following command with the model name (for\nexample `en_core_web_sm`):\n\n```bash\npython -m spacy download [model]\n```\n\n- [spaCy v3.x models directory](https://spacy.io/models)\n- [spaCy v3.x model comparison](https://spacy.io/usage/facts-figures#spacy-models)\n- [spaCy v2.x models directory](https://v2.spacy.io/models)\n- [spaCy v2.x model comparison](https://v2.spacy.io/usage/facts-figures#spacy-models)\n- [Individual release notes](https://github.com/explosion/spacy-models/releases)\n\nFor the spaCy v1.x models, [see here](#spacy-v1x-releases).\n\n## Model naming conventions\n\nIn general, spaCy expects all model packages to follow the naming convention of\n`[lang]_[name]`. For our provided pipelines, we divide the name into three\ncomponents:\n\n- **type**: Model capabilities:\n  - `core`: a general-purpose model with tagging, parsing, lemmatization and\n    named entity recognition\n  - `dep`: only tagging, parsing and lemmatization\n  - `ent`: only named entity recognition\n  - `sent`: only sentence segmentation\n- **genre**: Type of text the model is trained on (e.g. `web` for web text,\n  `news` for news text)\n- **size**: Model size indicator:\n  - `sm`: no word vectors\n  - `md`: reduced word vector table with 20k unique vectors for ~500k words\n  - `lg`: large word vector table with ~500k entries\n\nFor example, `en_core_web_md` is a medium-sized English model trained on\nwritten web text (blogs, news, comments), that includes a tagger, a dependency\nparser, a lemmatizer, a named entity recognizer and a word vector table with\n20k unique vectors.\n\n### Model versioning\n\nAdditionally, the model versioning reflects both the compatibility with spaCy,\nas well as the model version. A model version `a.b.c` translates to:\n\n- `a`: **spaCy major version**. For example, `2` for spaCy v2.x.\n- `b`: **spaCy minor version**. For example, `3` for spaCy v2.3.x.\n- `c`: **Model version.** Different model config: e.g. from being trained on\n  different data, with different parameters, for different numbers of\n  iterations, with different vectors, etc.\n\nFor a detailed compatibility overview, see the\n[`compatibility.json`](compatibility.json). This is also the source of spaCy's\ninternal compatibility check, performed when you run the `download` command.\n\n### Support for older versions\n\nIf you're using an older version (v1.6.0 or below), you can still download and\ninstall the old models from within spaCy using `python -m spacy.en.download all`\nor `python -m spacy.de.download all`. The `.tar.gz` archives are also\n[attached to the v1.6.0 release](https://github.com/explosion/spaCy/tree/v1.6.0).\nTo download and install the models manually, unpack the archive, drop the\ncontained directory into `spacy/data` and load the model via `spacy.load('en')`\nor `spacy.load('de')`.\n\n## Downloading models\n\nTo increase transparency and make it easier to use spaCy with your own models,\nall data is now available as direct downloads, organised in\n[individual releases](https://github.com/explosion/spacy-models/releases). spaCy\n1.7 also supports installing and loading models as **Python packages**. You can\nnow choose how and where you want to keep the data files, and set up \"shortcut\nlinks\" to load models by name from within spaCy. For more info on this, see the\nnew [models documentation](https://spacy.io/usage/models).\n\n```bash\n# download best-matching version of specific model for your spaCy installation\npython -m spacy download en_core_web_sm\n\n# pip install .whl or .tar.gz archive from path or URL\npip install /Users/you/en_core_web_sm-3.0.0.tar.gz\npip install /Users/you/en_core_web_sm-3.0.0-py3-none-any.whl\npip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.0.0/en_core_web_sm-3.0.0.tar.gz\npip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.0.0/en_core_web_sm-3.0.0-py3-none-any.whl\n```\n\n## Loading and using models\n\nTo load a model, use `spacy.load()` with the model name, a shortcut link or\na path to the model data directory.\n\n```python\nimport spacy\nnlp = spacy.load(\"en_core_web_sm\")\ndoc = nlp(u\"This is a sentence.\")\n```\n\nYou can also `import` a model directly via its full name and then call its\n`load()` method with no arguments. This should also work for older models\nin previous versions of spaCy.\n\n```python\nimport spacy\nimport en_core_web_sm\n\nnlp = en_core_web_sm.load()\ndoc = nlp(u\"This is a sentence.\")\n```\n\n## Manual download and installation\n\nIn some cases, you might prefer downloading the data manually, for example to\nplace it into a custom directory. You can download the model via your browser\nfrom the [latest releases](https://github.com/explosion/spacy-models/releases),\nor configure your own download script using the URL of the archive file. The\narchive consists of a model directory that contains another directory with the\nmodel data.\n\n```yaml\n└── en_core_web_md-3.0.0.tar.gz       # downloaded archive\n    ├── setup.py                      # setup file for pip installation\n    ├── meta.json                     # copy of pipeline meta\n    └── en_core_web_md                # 📦 pipeline package\n        ├── __init__.py               # init for pip installation\n        └── en_core_web_md-3.0.0      # pipeline data\n            ├── config.cfg            # pipeline config\n            ├── meta.json             # pipeline meta\n            └── ...                   # directories with component data\n```\n\n**📖 For more info and examples, check out the [models documentation](https://spacy.io/usage/models).**\n\n## spaCy v1.x Releases\n\n| Date         | Model                 | Version | Dep | Ent | Vec |    Size | License  |                                       |                                      |\n| ------------ | --------------------- | ------- | :-: | :-: | :-: | ------: | -------- | ------------------------------------- | ------------------------------------ |\n| `2017-06-06` | `es_core_web_md`      | 1.0.0   |  X  |  X  |  X  |  377 MB | CC BY-SA | [![][i]][i-es_core_web_md-1.0.0]      | [![][dl]][es_core_web_md-1.0.0]      |\n| `2017-04-26` | `fr_depvec_web_lg`    | 1.0.0   |  X  |     |  X  | 1.33 GB | CC BY-NC | [![][i]][i-fr_depvec_web_lg-1.0.0]    | [![][dl]][fr_depvec_web_lg-1.0.0]    |\n| `2017-03-21` | `en_core_web_md`      | 1.2.1   |  X  |  X  |  X  |    1 GB | CC BY-SA | [![][i]][i-en_core_web_md-1.2.1]      | [![][dl]][en_core_web_md-1.2.1]      |\n| `2017-03-21` | `en_depent_web_md`    | 1.2.1   |  X  |  X  |     |  328 MB | CC BY-SA | [![][i]][i-en_depent_web_md-1.2.1]    | [![][dl]][en_depent_web_md-1.2.1]    |\n| `2017-03-17` | `en_core_web_sm`      | 1.2.0   |  X  |  X  |  X  |   50 MB | CC BY-SA | [![][i]][i-en_core_web_sm-1.2.0]      | [![][dl]][en_core_web_sm-1.2.0]      |\n| `2017-03-17` | `en_core_web_md`      | 1.2.0   |  X  |  X  |  X  |    1 GB | CC BY-SA | [![][i]][i-en_core_web_md-1.2.0]      | [![][dl]][en_core_web_md-1.2.0]      |\n| `2017-03-17` | `en_depent_web_md`    | 1.2.0   |  X  |  X  |     |  328 MB | CC BY-SA | [![][i]][i-en_depent_web_md-1.2.0]    | [![][dl]][en_depent_web_md-1.2.0]    |\n| `2016-05-10` | `de_core_news_md`     | 1.0.0   |  X  |  X  |  X  |  645 MB | CC BY-SA | [![][i]][i-de_core_news_md-1.0.0]     | [![][dl]][de_core_news_md-1.0.0]     |\n| `2016-03-08` | `en_vectors_glove_md` | 1.0.0   |     |     |  X  |  727 MB | CC BY-SA | [![][i]][i-en_vectors_glove_md-1.0.0] | [![][dl]][en_vectors_glove_md-1.0.0] |\n\n[es_core_web_md-1.0.0]: https://github.com/explosion/spacy-models/releases/download/es_core_web_md-1.0.0/es_core_web_md-1.0.0.tar.gz\n[fr_depvec_web_lg-1.0.0]: https://github.com/explosion/spacy-models/releases/download/fr_depvec_web_lg-1.0.0/fr_depvec_web_lg-1.0.0.tar.gz\n[en_core_web_md-1.2.1]: https://github.com/explosion/spacy-models/releases/download/en_core_web_md-1.2.1/en_core_web_md-1.2.1.tar.gz\n[en_depent_web_md-1.2.1]: https://github.com/explosion/spacy-models/releases/download/en_depent_web_md-1.2.1/en_depent_web_md-1.2.1.tar.gz\n[en_core_web_sm-1.2.0]: https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-1.2.0/en_core_web_sm-1.2.0.tar.gz\n[en_core_web_md-1.2.0]: https://github.com/explosion/spacy-models/releases/download/en_core_web_md-1.2.0/en_core_web_md-1.2.0.tar.gz\n[en_depent_web_md-1.2.0]: https://github.com/explosion/spacy-models/releases/download/en_depent_web_md-1.2.0/en_depent_web_md-1.2.0.tar.gz\n[de_core_news_md-1.0.0]: https://github.com/explosion/spacy-models/releases/download/de_core_news_md-1.0.0/de_core_news_md-1.0.0.tar.gz\n[en_vectors_glove_md-1.0.0]: https://github.com/explosion/spacy-models/releases/download/en_vectors_glove_md-1.0.0/en_vectors_glove_md-1.0.0.tar.gz\n[i-es_core_web_md-1.0.0]: https://github.com/explosion/spacy-models/releases/es_core_web_md-1.0.0\n[i-fr_depvec_web_lg-1.0.0]: https://github.com/explosion/spacy-models/releases/fr_depvec_web_lg-1.0.0\n[i-en_core_web_md-1.2.1]: https://github.com/explosion/spacy-models/releases/en_core_web_md-1.2.1\n[i-en_depent_web_md-1.2.1]: https://github.com/explosion/spacy-models/releases/en_depent_web_md-1.2.1\n[i-en_core_web_sm-1.2.0]: https://github.com/explosion/spacy-models/releases/en_core_web_sm-1.2.0\n[i-en_core_web_md-1.2.0]: https://github.com/explosion/spacy-models/releases/en_core_web_md-1.2.0\n[i-en_depent_web_md-1.2.0]: https://github.com/explosion/spacy-models/releases/en_depent_web_md-1.2.0\n[i-de_core_news_md-1.0.0]: https://github.com/explosion/spacy-models/releases/de_core_news_md-1.0.0\n[i-en_vectors_glove_md-1.0.0]: https://github.com/explosion/spacy-models/releases/en_vectors_glove_md-1.0.0\n[dl]: http://i.imgur.com/gQvPgr0.png\n[i]: http://i.imgur.com/OpLOcKn.png\n\n### Model naming conventions for v1.x models\n\n- **type**: Model capabilities (e.g. `core` for general-purpose model with\n  vocabulary, syntax, entities and word vectors, or `depent` for only vocab,\n  syntax and entities)\n- **genre**: Type of text the model is trained on (e.g. `web` for web text,\n  `news` for news text)\n- **size**: Model size indicator (`sm`, `md` or `lg`)\n\nFor example, `en_depent_web_md` is a medium-sized English model trained on\nwritten web text (blogs, news, comments), that includes vocabulary, syntax and\nentities.\n\n## Issues and bug reports\n\nTo report an issue with a model, please open an issue on the\n[spaCy issue tracker](https://github.com/explosion/spaCy).\nPlease note that no model is perfect. Because models are statistical, their\nexpected behaviour **will always include some errors**. However, particular\nerrors can indicate deeper issues with the training feature extraction or\noptimisation code. If you come across patterns in the model's performance that\nseem suspicious, please do file a report.\n","funding_links":["https://explosion.ai/merch"],"categories":["Python"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fexplosion%2Fspacy-models","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fexplosion%2Fspacy-models","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fexplosion%2Fspacy-models/lists"}