{"id":13608652,"url":"https://github.com/explosion/spacy-transformers","last_synced_at":"2025-04-25T14:49:20.149Z","repository":{"id":37271576,"uuid":"199068120","full_name":"explosion/spacy-transformers","owner":"explosion","description":"🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy","archived":false,"fork":false,"pushed_at":"2025-02-06T11:15:50.000Z","size":1208,"stargazers_count":1382,"open_issues_count":0,"forks_count":172,"subscribers_count":30,"default_branch":"master","last_synced_at":"2025-04-24T08:55:13.355Z","etag":null,"topics":["bert","google","gpt-2","huggingface","language-model","machine-learning","natural-language-processing","natural-language-understanding","nlp","openai","pytorch","pytorch-model","spacy","spacy-extension","spacy-pipeline","transfer-learning","xlnet"],"latest_commit_sha":null,"homepage":"https://spacy.io/usage/embeddings-transformers","language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/explosion.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-07-26T19:12:34.000Z","updated_at":"2025-04-23T16:04:51.000Z","dependencies_parsed_at":"2023-12-19T07:43:44.854Z","dependency_job_id":"38294fc7-3737-4c9e-b817-3baa2f24321e","html_url":"https://github.com/explosion/spacy-transformers","commit_stats":{"total_commits":1312,"total_committers":26,"mean_commits":50.46153846153846,"dds":"0.36966463414634143","last_synced_commit":"128bb2c3be37d26e235a92147bf845029dfe4220"},"previous_names":["explosion/spacy-pytorch-transformers"],"tags_count":56,"template":false,"template_full
_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/explosion%2Fspacy-transformers","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/explosion%2Fspacy-transformers/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/explosion%2Fspacy-transformers/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/explosion%2Fspacy-transformers/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/explosion","download_url":"https://codeload.github.com/explosion/spacy-transformers/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250838475,"owners_count":21495756,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bert","google","gpt-2","huggingface","language-model","machine-learning","natural-language-processing","natural-language-understanding","nlp","openai","pytorch","pytorch-model","spacy","spacy-extension","spacy-pipeline","transfer-learning","xlnet"],"created_at":"2024-08-01T19:01:28.949Z","updated_at":"2025-04-25T14:49:20.128Z","avatar_url":"https://github.com/explosion.png","language":"Python","readme":"\u003ca href=\"https://explosion.ai\"\u003e\u003cimg src=\"https://explosion.ai/assets/img/logo.svg\" width=\"125\" height=\"125\" align=\"right\" /\u003e\u003c/a\u003e\n\n# spacy-transformers: Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy\n\nThis package provides [spaCy](https://github.com/explosion/spaCy) components and\narchitectures to use transformer models 
via\n[Hugging Face's `transformers`](https://github.com/huggingface/transformers) in\nspaCy. The result is convenient access to state-of-the-art transformer\narchitectures, such as BERT, GPT-2, XLNet, etc.\n\n\u003e **This release requires [spaCy v3](https://spacy.io/usage/v3).** For the\n\u003e previous version of this library, see the\n\u003e [`v0.6.x` branch](https://github.com/explosion/spacy-transformers/tree/v0.6.x).\n\n[![tests](https://github.com/explosion/spacy-transformers/actions/workflows/tests.yml/badge.svg)](https://github.com/explosion/spacy-transformers/actions/workflows/tests.yml)\n[![PyPi](https://img.shields.io/pypi/v/spacy-transformers.svg?style=flat-square\u0026logo=pypi\u0026logoColor=white)](https://pypi.python.org/pypi/spacy-transformers)\n[![GitHub](https://img.shields.io/github/release/explosion/spacy-transformers/all.svg?style=flat-square\u0026logo=github)](https://github.com/explosion/spacy-transformers/releases)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg?style=flat-square)](https://github.com/ambv/black)\n\n## Features\n\n- Use pretrained transformer models like **BERT**, **RoBERTa** and **XLNet** to\n  power your spaCy pipeline.\n- Easy **multi-task learning**: backprop to one transformer model from several\n  pipeline components.\n- Train using spaCy v3's powerful and extensible config system.\n- Automatic alignment of transformer output to spaCy's tokenization.\n- Easily customize what transformer data is saved in the `Doc` object.\n- Easily customize how long documents are processed.\n- Out-of-the-box serialization and model packaging.\n\n## 🚀 Installation\n\nInstalling the package from pip will automatically install all dependencies,\nincluding PyTorch and spaCy. Make sure you install this package **before** you\ninstall the models. 
Also note that this package requires **Python 3.6+**,\n**PyTorch v1.5+** and **spaCy v3.0+**.\n\n```bash\npip install 'spacy[transformers]'\n```\n\nFor GPU installation, find your CUDA version using `nvcc --version` and add the\n[version in brackets](https://spacy.io/usage/#gpu), e.g.\n`spacy[transformers,cuda92]` for CUDA9.2 or `spacy[transformers,cuda100]` for\nCUDA10.0.\n\nIf you are having trouble installing PyTorch, follow the\n[instructions](https://pytorch.org/get-started/locally/) on the official website\nfor your specific operating system and requirements.\n\n## 📖 Documentation\n\n\u003e ⚠️ **Important note:** This package has been extensively refactored to take\n\u003e advantage of [spaCy v3.0](https://spacy.io). Previous versions that were built\n\u003e for [spaCy v2.x](https://v2.spacy.io) worked considerably differently. Please\n\u003e see previous tagged versions of this README for documentation on prior\n\u003e versions.\n\n- 📘\n  [Embeddings, Transformers and Transfer Learning](https://spacy.io/usage/embeddings-transformers):\n  How to use transformers in spaCy\n- 📘 [Training Pipelines and Models](https://spacy.io/usage/training): Train and\n  update components on your own data and integrate custom models\n- 📘\n  [Layers and Model Architectures](https://spacy.io/usage/layers-architectures):\n  Power spaCy components with custom neural networks\n- 📗 [`Transformer`](https://spacy.io/api/transformer): Pipeline component API\n  reference\n- 📗\n  [Transformer architectures](https://spacy.io/api/architectures#transformers):\n  Architectures and registered functions\n\n## Applying pretrained text and token classification models\n\nNote that the `transformer` component from `spacy-transformers` does not support\ntask-specific heads like token or text classification. 
A task-specific\ntransformer model can be used as a source of features to train spaCy components\nlike `ner` or `textcat`, but the `transformer` component does not provide access\nto task-specific heads for training or inference.\n\nAlternatively, if you only want to use the **predictions** from an existing\nHugging Face text or token classification model, you can use the wrappers from\n[`spacy-huggingface-pipelines`](https://github.com/explosion/spacy-huggingface-pipelines)\nto incorporate task-specific transformer models into your spaCy pipelines.\n\n## Bug reports and other issues\n\nPlease use [spaCy's issue tracker](https://github.com/explosion/spaCy/issues) to\nreport a bug, or open a new thread on the\n[discussion board](https://github.com/explosion/spaCy/discussions) for any other\nissue.\n","funding_links":[],"categories":["Transformer Implementations By Communities","Natural Language Processing","Implementations","Python","Text Data and NLP","NLP"],"sub_categories":["PyTorch and TensorFlow","General Purpose NLP"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fexplosion%2Fspacy-transformers","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fexplosion%2Fspacy-transformers","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fexplosion%2Fspacy-transformers/lists"}