https://github.com/paperswithcode/sotabench-eval
Easily evaluate machine learning models on public benchmarks
- Host: GitHub
- URL: https://github.com/paperswithcode/sotabench-eval
- Owner: paperswithcode
- License: apache-2.0
- Created: 2019-09-17T13:27:53.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2024-03-20T15:45:32.000Z (8 months ago)
- Last Synced: 2024-10-11T14:38:46.119Z (29 days ago)
- Language: Python
- Size: 2.54 MB
- Stars: 171
- Watchers: 16
- Forks: 27
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE
README
--------------------------------------------------------------------------------
[![PyPI version](https://badge.fury.io/py/sotabencheval.svg)](https://badge.fury.io/py/sotabencheval) [![Generic badge](https://img.shields.io/badge/Documentation-Here-.svg)](https://paperswithcode.github.io/sotabench-eval/)
`sotabencheval` is a framework-agnostic library that contains a collection of deep learning benchmarks you can use to benchmark your models. It can be used in conjunction with the [sotabench](https://www.sotabench.com) service to record results for models, so the community can compare model performance on different tasks, as well as a continuous integration style service for your repository to benchmark your models on each commit.
## Benchmarks Supported
- [ADE20K](https://paperswithcode.github.io/sotabench-eval/ade20k/) (Semantic Segmentation)
- [COCO](https://paperswithcode.github.io/sotabench-eval/coco/) (Object Detection)
- [ImageNet](https://paperswithcode.github.io/sotabench-eval/imagenet/) (Image Classification)
- [SQuAD](https://paperswithcode.github.io/sotabench-eval/squad/) (Question Answering)
- [WikiText-103](https://paperswithcode.github.io/sotabench-eval/wikitext103/) (Language Modelling)
- [WMT](https://paperswithcode.github.io/sotabench-eval/wmt/) (Machine Translation)

PRs welcome for further benchmarks!
## Installation
Requires Python 3.6+.
```bash
pip install sotabencheval
```

## Get Benching! 🏋️
You should read the [full documentation here](https://paperswithcode.github.io/sotabench-eval/index.html), which contains guidance on getting started and connecting to [sotabench](https://www.sotabench.com).
Integration is lightweight. For example, if you are evaluating an ImageNet model, you initialize an Evaluator object and (optionally) link it to an associated paper:
```python
from sotabencheval.image_classification import ImageNetEvaluator
evaluator = ImageNetEvaluator(
model_name='FixResNeXt-101 32x48d',
paper_arxiv_id='1906.06423')
```

Then, for each batch of predictions your model makes on ImageNet, pass the `evaluator.add` method a dictionary whose keys are image IDs and whose values are `np.ndarray`s of logits:
```python
evaluator.add(output_dict=dict(zip(image_ids, batch_output)))
```

The evaluation logic just needs to live in a `sotabench.py` file in your repository; sotabench will run it on each commit and record the results.
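Putting the pieces together, the batching logic inside a `sotabench.py` might look like the sketch below. This is a minimal illustration, not the library's reference implementation: the `predict` stub and the image IDs are hypothetical stand-ins for your real model and the ImageNet validation set, and the commented-out `evaluator.add` call mirrors the snippet above (see the full documentation for the exact workflow, including how results are finalized).

```python
import numpy as np

# Stand-in for a trained model: returns fake "logits" for each image in a
# batch. In a real sotabench.py you would load and run your actual model.
def predict(batch_ids):
    rng = np.random.default_rng(0)
    return rng.random((len(batch_ids), 1000))  # 1000 ImageNet classes

# Hypothetical image IDs; the real evaluator expects ImageNet validation IDs.
image_ids = [f"ILSVRC2012_val_{i:08d}" for i in range(1, 11)]

batch_size = 4
all_outputs = {}
for start in range(0, len(image_ids), batch_size):
    ids = image_ids[start:start + batch_size]
    logits = predict(ids)
    # With sotabencheval, this is where you would call:
    #   evaluator.add(output_dict=dict(zip(ids, logits)))
    all_outputs.update(dict(zip(ids, logits)))
```

Because `evaluator.add` accepts partial results batch by batch, you can stream predictions straight from your data loader without holding the whole validation set's logits in memory at once.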
## Contributing
All contributions welcome!