https://github.com/explosion/spacy-benchmarks

💫 Runtime performance comparison of spaCy against other NLP libraries

# Runtime performance comparison of spaCy against other NLP libraries

> ⚠️ **This repository is old and deprecated.** For up-to-date benchmark scripts, see the [`projects`](https://github.com/explosion/projects/) repo.

## Set up the corpus DB

The speed test expects to read documents from a simple SQLite table. More corpus
ingestors still need to be written. So far there is only one, which creates the
table from the Gigaword corpus.

```bash
fab corpus.giga:path_to_gigaword/
```
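For readers who want a feel for what "a simple SQLite table" of documents could look like, here is a minimal sketch using Python's standard `sqlite3` module. The table name `docs`, its columns, and the helper names `create_corpus` and `read_docs` are assumptions made for illustration; the schema the benchmark scripts actually expect is not documented here and may differ.

```python
import sqlite3


def create_corpus(conn, texts):
    """Create a hypothetical `docs` table and insert one row per document."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs ("
        "id INTEGER PRIMARY KEY, text TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO docs (text) VALUES (?)", ((t,) for t in texts))
    conn.commit()


def read_docs(conn, n):
    """Yield the text of the first `n` documents in insertion order."""
    cursor = conn.execute("SELECT text FROM docs ORDER BY id LIMIT ?", (n,))
    for (text,) in cursor:
        yield text


# An in-memory database stands in for the real corpus file.
conn = sqlite3.connect(":memory:")
create_corpus(conn, ["First document.", "Second document.", "Third document."])
print(list(read_docs(conn, 2)))
```

A new ingestor for another corpus would, under these assumptions, only need to produce an iterable of document strings and pass it to something like `create_corpus`.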

## Set up the tools

```bash
fab init
```

This should download and install spaCy and the other NLP libraries used in the comparison.

## Run a benchmark

```bash
fab speed:parse,spacy,n=1000
fab speed:tag,spacy
fab speed:tag,spacy,nltk,n=10000
fab speed:tokenize,spacy,clearnlp
```
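The commands above use the Fabric 1.x task-argument syntax, `task:arg1,arg2,key=value`, in which every value reaches the task as a string. The sketch below is a simplified illustration of how such a specification splits into positional and keyword arguments; `parse_task_spec` is a hypothetical helper, not Fabric's own parser, and it ignores Fabric's backslash escaping. Reading the first positional argument as the operation (`parse`, `tag`, `tokenize`), the rest as library names, and `n` as a document count is an inference from the examples, not something the README states.

```python
def parse_task_spec(spec):
    """Split 'name:arg1,arg2,key=value' into (name, args, kwargs).

    Simplified illustration only: all values stay strings and
    escaped commas or equals signs are not handled.
    """
    name, _, arg_string = spec.partition(":")
    args, kwargs = [], {}
    for part in filter(None, arg_string.split(",")):
        if "=" in part:
            key, _, value = part.partition("=")
            kwargs[key] = value
        else:
            args.append(part)
    return name, args, kwargs


name, args, kwargs = parse_task_spec("speed:tag,spacy,nltk,n=10000")
print(name, args, kwargs)
```

Because values arrive as strings, a task receiving `n` would need to convert it with `int(n)` before using it as a count.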