https://github.com/avinassh/fast-sqlite3-inserts
A collection of test scripts to generate a SQLite DB with 1B rows in the fastest possible way
- Host: GitHub
- URL: https://github.com/avinassh/fast-sqlite3-inserts
- Owner: avinassh
- License: mit
- Created: 2021-05-08T06:31:42.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2023-04-13T05:26:44.000Z (almost 2 years ago)
- Last Synced: 2025-01-06T20:17:34.056Z (9 days ago)
- Language: Rust
- Size: 33.2 KB
- Stars: 383
- Watchers: 14
- Forks: 38
- Open Issues: 15
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- StarryDivineSky - avinassh/fast-sqlite3-inserts
README
# Fast SQLite Inserts
This repository tries to find the fastest way to create an SQLite DB with one billion random rows.
Read this blog post for more context: [Towards Inserting One Billion Rows in SQLite Under A Minute](https://avi.im/blag/2021/fast-sqlite-inserts/)
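
As a rough illustration of the kind of work being timed, here is a minimal sketch of batched inserts using Python's standard `sqlite3` module. The `user` table, its columns, the batch size, and the `insert_rows` helper are all assumptions made for this example; they are not taken from the scripts in this repository, which may differ.

```python
import random
import sqlite3

# Hypothetical schema and batch size, chosen for illustration only.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS user ("
    "id INTEGER PRIMARY KEY, area TEXT, age INTEGER NOT NULL, active INTEGER NOT NULL)"
)
BATCH_SIZE = 10_000


def random_row(rng):
    """Return one random (area, age, active) tuple."""
    area = f"{rng.randrange(1_000_000):06d}"  # six-digit string
    age = rng.choice((5, 10, 15))
    active = rng.randrange(2)  # 0 or 1
    return (area, age, active)


def insert_rows(db_path, total_rows, seed=0):
    """Insert total_rows random rows in batches and return the final row count."""
    rng = random.Random(seed)
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(SCHEMA)
        remaining = total_rows
        while remaining > 0:
            size = min(BATCH_SIZE, remaining)
            batch = [random_row(rng) for _ in range(size)]
            # `with conn` commits once per batch instead of once per row.
            with conn:
                conn.executemany(
                    "INSERT INTO user (area, age, active) VALUES (?, ?, ?)", batch
                )
            remaining -= size
        (count,) = conn.execute("SELECT COUNT(*) FROM user").fetchone()
        return count
    finally:
        conn.close()
```

Calling `insert_rows("test.db", 100_000)` would write to a file on disk, while passing `":memory:"` as the path keeps the database in memory. Committing per batch rather than per row is the design choice this sketch is meant to show.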
## Leaderboard
(for 100M insertions)
Variant | Time
------------- | -------------
Rust | 23 seconds
PyPy | 126 seconds
CPython | 210 seconds

## Current Benchmark
### Python
These are the current fastest CPython and PyPy numbers.
```shell
$ ./bench.sh

Sat May 8 19:42:44 IST 2021 [PYTHON] running sqlite3_opt_batched.py (100_000_000) inserts
517.53 real 508.24 user 7.35 sys

Sat May 8 20:03:04 IST 2021 [PYPY] running sqlite3_opt_batched.py (100_000_000) inserts
159.70 real 153.46 user 5.81 sys
```

### Rust
These are the current fastest Rust numbers.
```
Mon Nov 22 18:47:26 IST 2021 [RUST] basic_batched.rs (100_000_000) inserts

real 0m23.826s
user 0m21.685s
sys 0m2.057s

Mon Nov 22 18:47:50 IST 2021 [RUST] threaded_batched.rs (100_000_000) inserts

real 0m23.070s
user 0m27.512s
sys 0m2.465s
```

### In Memory
Instead of writing to disk, I used a `:memory:` DB. These are the numbers:
```
Mon May 10 17:40:39 IST 2021 [RUST] basic_batched.rs (100_000_000) inserts
31.38 real 30.55 user 0.56 sys

Mon May 10 17:39:39 IST 2021 [RUST] threaded_batched.rs (100_000_000) inserts
28.94 real 45.02 user 2.03 sys
```

### Busy loop time
This is the time the scripts took just to run their for loops, with no SQL insertion.
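
To make the idea concrete, here is a minimal sketch of what such a busy loop might look like in Python. It assumes the loop still builds a random row on every iteration and simply discards it; the row shape and the `busy_loop` helper are hypothetical and not taken from `busy_loop.py`.

```python
import random
import time


def busy_loop(iterations, seed=0):
    """Build `iterations` random rows, discard them, and return elapsed seconds."""
    rng = random.Random(seed)
    start = time.perf_counter()
    for _ in range(iterations):
        # Hypothetical row shape (area code, age, active flag); never stored.
        area = f"{rng.randrange(1_000_000):06d}"
        age = rng.choice((5, 10, 15))
        active = rng.randrange(2)
        _row = (area, age, active)
    return time.perf_counter() - start
```

Subtracting a busy-loop time from the matching insert run gives a rough estimate of how much time goes to SQLite itself rather than to generating data in the host language.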
```
$ ./busy.sh

Sun May 9 13:16:01 IST 2021 [PYTHON] busy_loop.py (100_000_000) iterations
351.14 real 347.53 user 3.39 sys

Sun May 9 13:21:52 IST 2021 [PYPY] busy_loop.py (100_000_000) iterations
81.58 real 77.73 user 3.80 sys

Sun May 9 13:23:14 IST 2021 [RUST] busy.rs (100_000_000) iterations
17.97 real 16.29 user 1.67 sys

Sun May 9 13:23:32 IST 2021 [RUST] threaded_busy.rs (100_000_000) iterations
7.18 real 42.52 user 7.20 sys
```

## Community Contributions
| PR | Author | Result |
|---|---|---|
| [#2](https://github.com/avinassh/fast-sqlite3-inserts/pull/2) | [captn3m0](https://github.com/captn3m0) | Reduced the CPython running time by half (from 7.5 minutes to 3.5 minutes) |
| [#12](https://github.com/avinassh/fast-sqlite3-inserts/pull/12) | [red15](https://github.com/red15) | Saved 2s from Rust's running time (bringing it to 30s) |
| [#19](https://github.com/avinassh/fast-sqlite3-inserts/pull/19) | [kerollmops](https://github.com/Kerollmops) | Saved 5s from Rust's running time (bringing it to 23s) |

## Contributing
All contributions are welcome. If you have any ideas on increasing the performance, feel free to submit a PR. You may also check the current open issues to work on.
## License
Released under the MIT License. See the `LICENSE` file for more info.