# Code for Deep Learning GPU Benchmark: A Latency-Based Approach

[Screenshot: benchmark demo]

(Note that the above is a screenshot of the benchmark. Please visit the [project page](https://mtli.github.io/gpubench/) for the latest version and an interactive experience.)

![#fc4903](https://via.placeholder.com/15/fc4903/000000?text=+) Helps to estimate the runtime of algorithms on a different GPU

![#4abdab](https://via.placeholder.com/15/4abdab/000000?text=+) Measures GPU processing speed independent of GPU memory capacity

![#F7B733](https://via.placeholder.com/15/F7B733/000000?text=+) Contains adjustable weightings through interactive UIs

This repo contains the timing scripts used in the GPU benchmark. The latency-based benchmark is designed to compare algorithms whose runtimes are reported on different GPUs, and it also serves as a GPU purchasing guide. Please check out the [project page](https://mtli.github.io/gpubench/) for the complete benchmark with detailed descriptions. This page documents how to run the code and the changelog of the benchmark.
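As an illustration of comparing runtimes reported on different GPUs, here is a minimal sketch. It assumes that an algorithm's runtime scales in proportion to each GPU's benchmark latency, which is a simplification and not a method confirmed by this repo. The function name `estimate_runtime` and all numbers are hypothetical.

```python
def estimate_runtime(reported_ms, source_latency_ms, target_latency_ms):
    """Scale a runtime reported on a source GPU to a target GPU.

    Assumption (hypothetical): runtime is proportional to the benchmark
    latency measured on each GPU for a comparable workload.
    """
    if source_latency_ms <= 0 or target_latency_ms <= 0:
        raise ValueError("latencies must be positive")
    return reported_ms * target_latency_ms / source_latency_ms


# Made-up numbers for illustration only, not benchmark results.
estimate = estimate_runtime(
    reported_ms=50.0,        # runtime a paper reports on the source GPU
    source_latency_ms=20.0,  # benchmark latency of the source GPU
    target_latency_ms=10.0,  # benchmark latency of the target GPU
)
print(estimate)  # 25.0
```

For real conversions, take the latency values from the project page rather than from this sketch.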

## Setting Up

```
git clone https://github.com/mtli/DLGPUBench.git
cd DLGPUBench
conda env create -f environment.yml
conda activate bench
```

Download and unpack ImageNet (ILSVRC2012) and MS COCO. To run the detection scripts, you also need to download the pretrained model from [this link](https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth). Modify the dataset paths in each script you plan to run.

## Changelog

### Version 1.1
- Update the timing setting for classification by excluding the time spent on GPU-host data transfer, and disabling multi-threading to make timing more stable and faster.
- Update to work with llcv 0.0.9.
- Change the default batch size for classification inference to 64.
- Add results for GTX 1080 and RTX A6000.
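
The first item above, excluding GPU-host data transfer from the measured time, can be sketched roughly as follows. This is not the repo's timing script: `time_latency`, `prepare`, and `sync` are hypothetical names, and the example runs a CPU-only stand-in workload. With PyTorch one might pass `torch.cuda.synchronize` as `sync` so that queued GPU work finishes before the clock is read. The sketch does not cover the multi-threading change.

```python
import statistics
import time


def time_latency(fn, prepare, n_warmup=5, n_runs=20, sync=None):
    """Return the mean latency of fn(batch) in milliseconds.

    prepare() builds the input batch (including any host-to-GPU copy) and
    runs outside the timed region. sync, if given, blocks until pending
    device work has finished.
    """
    sync = sync or (lambda: None)

    # Warm-up runs are executed but not recorded.
    for _ in range(n_warmup):
        fn(prepare())
        sync()

    samples = []
    for _ in range(n_runs):
        batch = prepare()  # data preparation and transfer: not timed
        sync()             # make sure the transfer is done before timing
        start = time.perf_counter()
        fn(batch)
        sync()             # wait for the computation itself to finish
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.mean(samples)


# CPU-only stand-in workload so the sketch runs without a GPU.
latency_ms = time_latency(
    fn=lambda xs: sum(x * x for x in xs),
    prepare=lambda: list(range(10_000)),
)
print(f"{latency_ms:.3f} ms")
```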