https://github.com/explosion/spacy-benchmarks
💫 Runtime performance comparison of spaCy against other NLP libraries
benchmarking benchmarks natural-language-processing nlp spacy
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/explosion/spacy-benchmarks
- Owner: explosion
- Archived: true
- Created: 2015-01-07T00:09:11.000Z (almost 10 years ago)
- Default Branch: master
- Last Pushed: 2022-08-31T14:31:52.000Z (over 2 years ago)
- Last Synced: 2024-09-21T09:32:46.729Z (3 months ago)
- Topics: benchmarking, benchmarks, natural-language-processing, nlp, spacy
- Language: Python
- Homepage: https://spacy.io
- Size: 22.5 KB
- Stars: 20
- Watchers: 4
- Forks: 12
- Open Issues: 2
Metadata Files:
- Readme: README.md
# Runtime performance comparison of spaCy against other NLP libraries
> ⚠️ **This repository is old and deprecated.** For up-to-date benchmark scripts, see the [`projects`](https://github.com/explosion/projects/) repo.
## Set up the corpus DB
The speed test expects to read documents from a simple SQLite table. More corpus
ingestors need to be written. So far there's one to create the table from the Gigaword
corpus.

```bash
fab corpus.giga:path_to_gigaword/
```

## Set up the tools
```bash
fab init
```

This should download and install spaCy and other NLP libraries.
## Run a benchmark
```bash
fab speed:parse,spacy,n=1000
fab speed:tag,spacy
fab speed:tag,spacy,nltk,n=10000
fab speed:tokenize,spacy,clearnlp
```
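
For orientation, here is a minimal sketch of what a speed test of this shape might look like: documents are read from a SQLite table and passed through a tool while a timer runs. This is not the repository's code. The table name `documents`, its columns, and the helpers `build_corpus`, `read_documents`, and `time_tool` are all hypothetical, and a plain whitespace tokenizer stands in for spaCy, NLTK, or ClearNLP so the example runs without extra installs.

```python
import sqlite3
import time


def build_corpus(conn, texts):
    """Create a one-row-per-document table (hypothetical schema) and fill it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents (id INTEGER PRIMARY KEY, text TEXT)"
    )
    conn.executemany(
        "INSERT INTO documents (text) VALUES (?)", [(t,) for t in texts]
    )
    conn.commit()


def read_documents(conn, n):
    """Lazily yield the text of the first n documents, in insertion order."""
    cursor = conn.execute(
        "SELECT text FROM documents ORDER BY id LIMIT ?", (n,)
    )
    for (text,) in cursor:
        yield text


def time_tool(process, docs):
    """Run process() over docs and return (n_docs, n_tokens, seconds).

    Because docs is consumed lazily, the measured time also includes
    fetching rows from SQLite.
    """
    n_docs = 0
    n_tokens = 0
    start = time.perf_counter()
    for doc in docs:
        n_tokens += len(process(doc))
        n_docs += 1
    elapsed = time.perf_counter() - start
    return n_docs, n_tokens, elapsed


def whitespace_tokenize(text):
    """Stand-in for a real NLP library's tokenizer."""
    return text.split()


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
build_corpus(
    conn,
    ["The cat sat on the mat.", "Dogs bark.", "A third document here."],
)
n_docs, n_words, elapsed = time_tool(whitespace_tokenize, read_documents(conn, 2))
print(f"{n_docs} docs, {n_words} tokens in {elapsed:.6f}s")
```

To compare tools in the same way, one would pass a different `process` callable for each library and keep the document count fixed, which is roughly what the `n=` argument in the commands above appears to control.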