https://github.com/xl0/ready-steady-go

Deep Learning GPU benchmark

Host: GitHub
URL: https://github.com/xl0/ready-steady-go
Owner: xl0
License: mit
Created: 2022-10-11T14:01:53.000Z (almost 3 years ago)
Default Branch: master
Last Pushed: 2022-10-20T13:16:29.000Z (over 2 years ago)
Last Synced: 2025-02-10T02:48:02.524Z (5 months ago)
Language: Jupyter Notebook
Homepage: https://xl0.github.io/ready-steady-go
Size: 1.88 MB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Ready, Steady, Go!
==================

It’s fall 2022, and for the first time in years, buying a GPU for Deep
Learning experiments does not sound too crazy.

Now, how do we pick one?

> Keep in mind that performance depends on many factors, not least your
> CPU and, often, your SSD.  
> For experiments, you might be better off with two cheaper GPUs: one
> running jobs in the background, the other used interactively.

# My Results

``` python
import wandb
import pandas as pd
from matplotlib import pyplot as plt
import seaborn as sns
import numpy as np
```

I ran the benchmark on a variety of GPUs from
[vast.ai](https://vast.ai). The results are automatically synced to
[Weights & Biases](https://wandb.ai).

``` python
sns.set_theme(style="whitegrid")
sns.set_color_codes("pastel")

# Pull the summaries of all finished benchmark runs from Weights & Biases.
api = wandb.Api()
runs = api.runs("xl0/ready-steady-go")
summaries = [dict(r.summary) | {"id": r.id} for r in runs if r.state == "finished"]

df = pd.DataFrame.from_records(summaries)
# .copy() avoids a SettingWithCopyWarning when the fp16 column is reassigned below.
df = df[["device_name", "model", "bs", "fp16", "throughput"]].copy()
df["fp16"] = df["fp16"].apply(lambda x: "FP16" if x else "FP32")

# Shorten device names for the plots. The keys are regex patterns, not globs.
df = df.replace({"device_name": {
                        "NVIDIA": "",
                        "GeForce": "",
                        "Tesla": "",
                        "-": " "}}, regex=True)
df.dropna(inplace=True)

# For each model, normalize throughput to a percentage of the best result for that model.
for model in df.model.unique():
    df.loc[df.model == model, "throughput"] /= df.loc[df.model == model, "throughput"].max()
df["throughput"] *= 100
```
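The per-model normalization can be easier to follow on a few made-up numbers. This is a minimal, dependency-free sketch rather than the notebook's code: the device names, model names and throughput values are invented for illustration, and `normalize_per_model` is a hypothetical helper.

``` python
def normalize_per_model(rows):
    """Scale each row's throughput to a percentage of the best throughput for its model."""
    best = {}
    for row in rows:
        best[row["model"]] = max(best.get(row["model"], 0.0), row["throughput"])
    return [row | {"throughput": 100 * row["throughput"] / best[row["model"]]} for row in rows]


# Invented numbers, for illustration only.
rows = [
    {"device_name": "GPU A", "model": "model_x", "throughput": 800.0},
    {"device_name": "GPU B", "model": "model_x", "throughput": 400.0},
    {"device_name": "GPU A", "model": "model_y", "throughput": 120.0},
    {"device_name": "GPU B", "model": "model_y", "throughput": 90.0},
]
normalized = normalize_per_model(rows)
for row in normalized:
    print(row["device_name"], row["model"], row["throughput"])
```

Because each model is scaled independently, the best device for every model lands at 100, so fast and slow models end up on the same scale.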

``` python
# For each device + model + precision, get the index of the run with the highest
# throughput, i.e. the best batch size for that combination.
max_bs_idx = df.groupby(["device_name", "model", "fp16"])["throughput"].idxmax()

# One chart per model: FP16 and FP32 bars are drawn on the same axes.
for model in df.model.unique():
    f, ax = plt.subplots(figsize=(15, 6))

    tops = df.loc[max_bs_idx].query(f"model == '{model}'").sort_values("throughput", ascending=False)

    sns.set_color_codes("pastel")
    sns.barplot(ax=ax, data=tops.query("fp16 == 'FP16'"),
                x="throughput", y="device_name", label="FP16", color="b", alpha=1)

    # Reuse the FP16 ranking so both sets of bars share the same device order.
    sns.set_color_codes("muted")
    sns.barplot(ax=ax, data=tops.query("fp16 == 'FP32'"),
                x="throughput", y="device_name", label="FP32", color="b", alpha=0.8,
                order=tops.query("fp16 == 'FP16'").sort_values("throughput", ascending=False).device_name)

    ax.legend(ncol=2, loc="lower right", frameon=True)
    ax.set(ylabel=None, xlabel=None, title=model)
```

![](index_files/figure-gfm/cell-4-output-1.svg)

![](index_files/figure-gfm/cell-4-output-2.svg)

![](index_files/figure-gfm/cell-4-output-3.svg)
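Each bar in the charts above is the best run for a device, model and precision across all batch sizes tried, which is what `groupby(...).idxmax()` selects. The same selection can be sketched without pandas as follows; `best_per_group` is a hypothetical helper and the rows are invented for illustration.

``` python
def best_per_group(rows):
    """Keep, for each (device, model, precision) group, the row with the highest throughput."""
    best = {}
    for row in rows:
        key = (row["device_name"], row["model"], row["fp16"])
        if key not in best or row["throughput"] > best[key]["throughput"]:
            best[key] = row
    return list(best.values())


# Invented numbers, for illustration only.
rows = [
    {"device_name": "GPU A", "model": "model_x", "fp16": "FP16", "bs": 32, "throughput": 70.0},
    {"device_name": "GPU A", "model": "model_x", "fp16": "FP16", "bs": 64, "throughput": 100.0},
    {"device_name": "GPU A", "model": "model_x", "fp16": "FP32", "bs": 32, "throughput": 55.0},
    {"device_name": "GPU A", "model": "model_x", "fp16": "FP32", "bs": 64, "throughput": 50.0},
]
tops = best_per_group(rows)
for row in tops:
    print(row["fp16"], "best batch size:", row["bs"])
```

In this made-up example the best batch size differs between FP16 and FP32, which is why the selection is done per precision rather than per device.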

``` python
f, ax = plt.subplots(figsize=(15, 6))

tops = df.loc[max_bs_idx].sort_values("throughput", ascending=False)

# Order devices by their mean FP16 throughput across all models.
fp16s = df.loc[max_bs_idx].query("fp16=='FP16'")
grouped = fp16s.groupby(["device_name"], as_index=False)["throughput"]
display_order = grouped.mean().sort_values("throughput", ascending=False)

# Each device has one row per model; barplot averages them (its default estimator is the mean).
# errwidth=0 hides the error bars. seaborn >= 0.13 deprecates it in favor of err_kws.
sns.set_color_codes("pastel")
sns.barplot(ax=ax, data=tops.loc[tops.fp16.eq('FP16')],
            x="throughput", y="device_name", label="FP16", color="b", errwidth=0,
            order=display_order.device_name)

sns.set_color_codes("muted")
sns.barplot(ax=ax, data=tops.loc[tops.fp16.eq('FP32')],
            x="throughput", y="device_name", label="FP32", color="b", errwidth=0,
            order=display_order.device_name)

ax.legend(ncol=2, loc="lower right", frameon=True)
_ = ax.set(ylabel=None, xlabel=None, title="Average between all models")
```

![](index_files/figure-gfm/cell-5-output-1.svg)
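The ranking in this last chart is the mean of the per-model normalized scores for each device. A small standard-library sketch of that averaging, with invented numbers and a hypothetical `average_by_device` helper:

``` python
from collections import defaultdict
from statistics import mean


def average_by_device(rows):
    """Average the normalized throughput of each device over all models, best first."""
    per_device = defaultdict(list)
    for row in rows:
        per_device[row["device_name"]].append(row["throughput"])
    averages = {device: mean(values) for device, values in per_device.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Invented, already normalized scores (percent of the best device per model).
rows = [
    {"device_name": "GPU A", "model": "model_x", "throughput": 100.0},
    {"device_name": "GPU A", "model": "model_y", "throughput": 80.0},
    {"device_name": "GPU B", "model": "model_x", "throughput": 90.0},
    {"device_name": "GPU B", "model": "model_y", "throughput": 100.0},
]
ranking = average_by_device(rows)
for device, score in ranking:
    print(device, score)
```

A device that wins on one model can still rank lower overall if it falls behind on the others, as GPU A does in this toy data.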
