Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/eatonphil/io-playground

Building an intuition for different IO models (sync, io_uring, etc.) across different languages.
https://github.com/eatonphil/io-playground

Last synced: 23 days ago
JSON representation

Building an intuition for different IO models (sync, io_uring, etc.) across different languages.

Host: GitHub
URL: https://github.com/eatonphil/io-playground
Owner: eatonphil
Created: 2023-10-16T20:48:39.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2023-10-19T22:25:59.000Z (about 1 year ago)
Last Synced: 2024-10-13T04:13:05.959Z (about 1 month ago)
Language: Zig
Homepage:
Size: 27.3 KB
Stars: 25
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# IO Playground

The point of this repo is to get an intuition about different IO
models. Comparing across languages isn't for benchmark wars it's more
a check for correctness. The relative difference between IO models
should be similar across language.

## Machine

I am running these tests on a dedicated bare metal instance, [OVH
Rise-1](https://eco.us.ovhcloud.com/#filterType=range_element&filterValue=rise).

* RAM: 64 GB DDR4 ECC 2,133 MHz
* Disk: 2x450 GB SSD NVMe in Soft RAID
* Processor: Intel Xeon E3-1230v6 - 4c/8t - 3.5 GHz/3.9 GHz
* `uname --kernel-release`: 6.3.8-100.fc37.x86_64

## Write 1GiB to one file (4KiB buffer)

Each implementation (other than `dd`) produces a CSV of results. Use
the following DuckDB command to analyze it.

```
$ duckdb -c "
SELECT
column0 AS method,
AVG(column1::DOUBLE) || 's' avg_time,
FORMAT_BYTES(AVG(column2::DOUBLE)::BIGINT) || '/s' AS avg_throughput
FROM 'out.csv'
GROUP BY column0
ORDER BY AVG(column1::DOUBLE) ASC"
```

### `dd` (Control)

```
$ dd if=/dev/zero of=test.bin bs=4k count=1M
1048576+0 records in
1048576+0 records out
4294967296 bytes (4.3 GB, 4.0 GiB) copied, 3.09765 s, 1.4 GB/s
```

### Go

This version reads up to N entries but then blocks until all N entries
complete. This is not ideal. But I'm not sure Iceber/iouring-go
supports anything else.

First:

```
$ go run main.go | tee out.csv
```

Then run the DuckDB command from above:

```
┌────────────────────────────────────────────┬─────────────────────┬────────────────┐
│ method │ avg_time │ avg_throughput │
│ varchar │ varchar │ varchar │
├────────────────────────────────────────────┼─────────────────────┼────────────────┤
│ 1_goroutines_pwrite │ 0.7111268999999999s │ 1.5GB/s │
│ blocking │ 0.7128968s │ 1.5GB/s │
│ 1_goroutines_io_uring_pwrite_128_entries │ 1.0402713s │ 1.0GB/s │
│ 10_goroutines_pwrite │ 1.111215s │ 966.2MB/s │
│ 100_goroutines_io_uring_pwrite_128_entries │ 1.3004915000000001s │ 825.6MB/s │
│ 100_goroutines_io_uring_pwrite_1_entries │ 1.5118257s │ 710.2MB/s │
│ 10_goroutines_io_uring_pwrite_128_entries │ 1.5322980999999998s │ 771.6MB/s │
│ 10_goroutines_io_uring_pwrite_1_entries │ 1.6577722000000001s │ 648.1MB/s │
│ 1_goroutines_io_uring_pwrite_1_entries │ 4.705483s │ 228.2MB/s │
└────────────────────────────────────────────┴─────────────────────┴────────────────┘
```

### Zig

Unlike the Go implementation, this version does not batch only/always
N entries at a time. It starts out batching N entries at a time but
*does not block* waiting for all N to complete. Instead it just reads
what has completed and keeps trying to add more entry batches.

First:

```
$ zig build-exe main.zig
$ ./main | tee out.csv
```

Then run the DuckDB command from above:

```
┌────────────────────────────────────────┬─────────────────────┬────────────────┐
│ method │ avg_time │ avg_throughput │
│ varchar │ varchar │ varchar │
├────────────────────────────────────────┼─────────────────────┼────────────────┤
│ 1_threads_iouring_pwrite_128_entries │ 0.6080365773999998s │ 1.7GB/s │
│ 1_threads_iouring_pwrite_1_entries │ 0.6259650676999999s │ 1.7GB/s │
│ blocking │ 0.6740227804s │ 1.5GB/s │
│ 1_threads_pwrite │ 0.6846085126999999s │ 1.5GB/s │
│ 10_threads_pwrite │ 1.1549885629000003s │ 929.8MB/s │
│ 10_threads_iouring_pwrite_1_entries │ 2.4174379148s │ 445.7MB/s │
│ 10_threads_iouring_pwrite_128_entries │ 2.4178504731s │ 445.8MB/s │
│ 100_threads_iouring_pwrite_128_entries │ 3.6317807736s │ 296.6MB/s │
│ 100_threads_iouring_pwrite_1_entries │ 3.7681755905000003s │ 287.7MB/s │
└────────────────────────────────────────┴─────────────────────┴────────────────┘
```

### Python

First:

```
$ python3 main.py | tee out.csv
```

Then run the DuckDB command from above:

```
┌──────────┬────────────┬────────────────┐
│ method │ avg_time │ avg_throughput │
│ varchar │ varchar │ varchar │
├──────────┼────────────┼────────────────┤
│ blocking │ 0.9259369s │ 1.1GB/s │
└──────────┴────────────┴────────────────┘
```