Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Deep Learning GPU benchmark
https://github.com/xl0/ready-steady-go
- Host: GitHub
- URL: https://github.com/xl0/ready-steady-go
- Owner: xl0
- License: mit
- Created: 2022-10-11T14:01:53.000Z (about 2 years ago)
- Default Branch: master
- Last Pushed: 2022-10-20T13:16:29.000Z (about 2 years ago)
- Last Synced: 2024-11-18T04:14:20.965Z (about 2 months ago)
- Language: Jupyter Notebook
- Homepage: https://xl0.github.io/ready-steady-go
- Size: 1.88 MB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
Ready, Steady, Go!
================

It’s fall 2022, and for the first time in years, buying a GPU for Deep
Learning experiments does not sound too crazy.

Now, how do we pick one?
> Keep in mind, performance depends on many factors, not least your CPU
> and often SSD.
> For experiments, you might be better off with 2 cheaper GPUs - one to
> run in the background, the other used interactively.

# My Results
``` python
import wandb
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
import numpy as np
```

I ran the benchmark on a variety of GPUs from
[vast.ai](https://vast.ai). The results are automatically synced to
[Weights & Biases](https://wandb.ai).

``` python
sns.set_theme(style="whitegrid")
sns.set_color_codes("pastel")

api = wandb.Api()
runs = api.runs("xl0/ready-steady-go")
summaries = [dict(r.summary) | {"id": r.id} for r in runs if r.state == "finished"]

df = pd.DataFrame.from_records(summaries)
df = df[["device_name", "model", "bs", "fp16", "throughput"]]
df["fp16"] = df["fp16"].apply(lambda x: "FP16" if x else "FP32")
df = df.replace({"device_name": {
    "NVIDIA*": "",
    "GeForce": "",
    "Tesla": "",
    "-": " "}}, regex=True)
df.dropna(inplace=True)

# For each model, normalize performance by top throughput.
for model in df.model.unique():
    df.loc[df.model == model, "throughput"] /= df.loc[df.model == model, "throughput"].max()
df["throughput"] *= 100
```

``` python
# For each device+model+fp, get the index of the entry with the highest throughput.
max_bs_idx = df.groupby(["device_name", "model", "fp16"])["throughput"].idxmax()

for model in df.model.unique():
    f, ax = plt.subplots(figsize=(15, 6))
    tops = df.loc[max_bs_idx].query(f"model == '{model}'").sort_values("throughput", ascending=False)

    sns.set_color_codes("pastel")
    sns.barplot(ax=ax, data=tops.query("fp16 == 'FP16'"),
                x="throughput", y="device_name", label="FP16", color="b", alpha=1)

    sns.set_color_codes("muted")
    sns.barplot(ax=ax, data=tops.query("fp16 == 'FP32'"),
                x="throughput", y="device_name", label="FP32", color="b", alpha=0.8,
                order=tops.query("fp16 == 'FP16'").sort_values("throughput", ascending=False).device_name)

    ax.legend(ncol=2, loc="lower right", frameon=True)
    ax.set(ylabel=None, xlabel=None, title=model)
```

![](index_files/figure-gfm/cell-4-output-1.svg)
![](index_files/figure-gfm/cell-4-output-2.svg)
![](index_files/figure-gfm/cell-4-output-3.svg)
``` python
f, ax = plt.subplots(figsize=(15, 6))

tops = df.loc[max_bs_idx].sort_values("throughput", ascending=False)
# tops

fp16s = df.loc[max_bs_idx].query("fp16=='FP16'")
grouped = fp16s.groupby(["device_name"], as_index=False)["throughput"]
display_order = grouped.mean().sort_values("throughput", ascending=False)
sns.set_color_codes("pastel")
tt = sns.barplot(ax=ax, data=tops.loc[tops.fp16.eq('FP16')],
                 x="throughput", y="device_name", label="FP16", color="b", errwidth=0,
                 order=display_order.device_name)

# f, ax = plt.subplots(figsize=(15, 6))
sns.set_color_codes("muted")
sns.barplot(ax=ax, data=tops.loc[tops.fp16.eq('FP32')],
            x="throughput", y="device_name", label="FP32", color="b", errwidth=0,
            order=display_order.device_name)
ax.legend(ncol=2, loc="lower right", frameon=True)
_ = ax.set(ylabel=None, xlabel=None, title="Average between all models")
```

![](index_files/figure-gfm/cell-5-output-1.svg)