Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/eliaskosunen/scnlib
scanf for modern C++
- Host: GitHub
- URL: https://github.com/eliaskosunen/scnlib
- Owner: eliaskosunen
- License: apache-2.0
- Created: 2018-11-17T01:48:40.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2024-11-07T21:37:07.000Z (2 months ago)
- Last Synced: 2025-01-10T11:10:22.171Z (9 days ago)
- Topics: c-plus-plus, cpp, input, io, parsing, ranges, scanf
- Language: C++
- Homepage: https://scnlib.dev
- Size: 6.85 MB
- Stars: 1,100
- Watchers: 17
- Forks: 49
- Open Issues: 15
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project
- awesomecpp - scnlib - scanf for modern C++ (Text Handling)
- AwesomeCppGameDev - scnlib
README
# scnlib
[![Linux builds](https://github.com/eliaskosunen/scnlib/actions/workflows/linux.yml/badge.svg)](https://github.com/eliaskosunen/scnlib/actions/workflows/linux.yml)
[![macOS builds](https://github.com/eliaskosunen/scnlib/actions/workflows/macos.yml/badge.svg)](https://github.com/eliaskosunen/scnlib/actions/workflows/macos.yml)
[![Windows builds](https://github.com/eliaskosunen/scnlib/actions/workflows/windows.yml/badge.svg)](https://github.com/eliaskosunen/scnlib/actions/workflows/windows.yml)
[![Other architectures](https://github.com/eliaskosunen/scnlib/actions/workflows/arch.yml/badge.svg)](https://github.com/eliaskosunen/scnlib/actions/workflows/arch.yml)
[![Code Coverage](https://codecov.io/gh/eliaskosunen/scnlib/graph/badge.svg?token=LyWrDluna1)](https://codecov.io/gh/eliaskosunen/scnlib)[![Latest Release](https://img.shields.io/github/v/release/eliaskosunen/scnlib?sort=semver&display_name=tag)](https://github.com/eliaskosunen/scnlib/releases)
[![License](https://img.shields.io/github/license/eliaskosunen/scnlib.svg)](https://github.com/eliaskosunen/scnlib/blob/master/LICENSE)
[![C++ Standard](https://img.shields.io/badge/C%2B%2B-17%2F20%2F23-blue.svg)](https://img.shields.io/badge/C%2B%2B-17%2F20%2F23-blue.svg)
[![Documentation](https://img.shields.io/badge/Documentation-scnlib.dev-blue)](https://scnlib.dev)

```cpp
#include <scn/scan.h>
#include <print> // for std::println (C++23)

int main() {
    // Read two integers from stdin
    // with an accompanying message
    if (auto result =
            scn::prompt<int, int>("What are your two favorite numbers? ", "{} {}")) {
        auto [a, b] = result->values();
        std::println("Oh, cool, {} and {}!", a, b);
    } else {
        std::println(stderr, "Error: {}", result.error().msg());
    }
}
```

Try out in [Compiler Explorer](https://godbolt.org/z/oG71eorvE).
## What is this?
`scnlib` is a modern C++ library for replacing `scanf` and `std::istream`.
This library attempts to move us ever so much closer to replacing `iostream`s
and C `stdio` altogether.
It's faster than `iostream` (see Benchmarks), and type-safe, unlike `scanf`.
Think [{fmt}](https://github.com/fmtlib/fmt) or C++20 `std::format`, but in the
other direction.

This library is the reference implementation of the ISO C++ standards proposal
[P1729 "Text Parsing"](https://wg21.link/p1729).

## Documentation
The documentation can be found online, at https://scnlib.dev.
To build the docs yourself, build the `scn_docs` target generated by CMake.
These targets are generated only if the variable `SCN_DOCS` is set in CMake
(done automatically if scnlib is the root project).
The `scn_docs` target requires Doxygen, Python 3.8 or better, and the `pip3`
package `poxy`.

## Examples
See more examples in the `examples/` folder.
### Reading a `std::string`
```cpp
#include <scn/scan.h>
#include <print>

int main() {
    // Reading a std::string will read until the first whitespace character
    if (auto result = scn::scan<std::string>("Hello world!", "{}")) {
        // Will output "Hello":
        // Access the read value with result->value()
        std::println("{}", result->value());

        // Will output " world!":
        // result->range() returns a subrange containing the unused input
        // C++23 is required for the std::string_view range constructor used below
        std::println("{}", std::string_view{result->range()});
    } else {
        std::println("Couldn't parse a word: {}", result.error().msg());
    }
}
```

### Reading multiple values
```cpp
#include <scn/scan.h>

int main() {
    auto input = std::string{"123 456 foo"};

    auto result = scn::scan<int, int>(input, "{} {}");
    // result == true
    // result->range(): " foo"

    // All read values can be accessed through a tuple with result->values()
    auto [a, b] = result->values();

    // Read from the remaining input
    // Could also use scn::ranges::subrange{result->begin(), result->end()} as input
    auto result2 = scn::scan<std::string>(result->range(), "{}");
    // result2 == true
    // result2->range().empty() == true
    // result2->value() == "foo"
}
```

### Reading from a fancier range
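```cpp
#include <scn/scan.h>

#include <ranges>
#include <string_view>

int main() {
    // A string_view is reversed here instead of the raw literal, so that the
    // literal's terminating '\0' does not end up at the front of the range
    auto result = scn::scan<int>(std::string_view{"123"} | std::views::reverse, "{}");
    // result == true
    // result->begin() is an iterator into a reverse_view
    // result->range() is empty
    // result->value() == 321
}
```

### Repeated reading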
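```cpp
#include <scn/scan.h>
#include <string_view>
#include <vector>

using namespace std::string_view_literals; // for the "..."sv literal

int main() {
    std::vector<int> vec{};
    auto input = scn::ranges::subrange{"123 456 789"sv};

    // Keep reading ints until scanning fails (here: at the end of the input)
    while (auto result = scn::scan<int>(input, "{}")) {
        vec.push_back(result->value());
        input = result->range();
    }
}
```

## Features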
* Blazing fast parsing of values (see Benchmarks)
* Modern C++ interface, featuring
    * type safety (variadic templates, types not determined by the format
      string)
    * convenience (ranges)
    * ergonomics (values returned from `scn::scan`, no output parameters)
* `"{python}"`-like format string syntax
    * Including compile-time format string checking
* Minimal code size increase (in user code, see Benchmarks)
* Usable without exceptions, RTTI, or `<iostream>`s
    * Configurable through build flags
    * Limited functionality if enabled
* Supports, and requires Unicode (input is UTF-8, UTF-16, or UTF-32)
* Highly portable
    * Tested on multiple platforms, see CI
    * Works on multiple architectures, tested on x86, x86-64, arm, aarch64,
      riscv64, and ppc64le

## Installing
`scnlib` uses CMake.
If your project already uses CMake, integration should be trivial, through
whatever means you like:
`make install` + `find_package`, `FetchContent`, `git submodule` + `add_subdirectory`,
or something else.

There are community-maintained packages available
on [Conan](https://conan.io/center/recipes/scnlib) and
on [vcpkg](https://github.com/microsoft/vcpkg/tree/master/ports/scnlib).

The `scnlib` CMake target is `scn::scn`.

```cmake
# Target with which you'd like to use scnlib
add_executable(my_program ...)
target_link_libraries(my_program scn::scn)
```

See docs for usage without CMake.
## Compiler support
A C++17-compatible compiler is required. The following compilers are tested in
CI:

* GCC 7 and newer
* Clang 8 and newer
* Visual Studio 2019 and 2022

Including the following environments:

* 32-bit and 64-bit builds on Windows
* libc++ on Linux
* gcc on Alpine Linux
* AppleClang and gcc on macOS 12 (Monterey) and 14 (Sonoma)
* clang-cl with VS 2019 and 2022
* MinGW and MSys2
* GCC on armv6, armv7, aarch64, riscv64, s390x, and ppc64le
* Visual Studio 2022, cross-compiling to arm64

## Benchmarks
### Run-time performance
All times below are in nanoseconds of CPU time.
Lower is better.

#### Integer parsing (`int`)
![Integer result, chart](benchmark/runtime/results/int.png)
| Test | Test 1 `"single"` | Test 2 `"repeated"` | Test average |
|:---------------------------------|------------------:|--------------------:|-------------:|
| `scn::scan` | 23.8 | 30.4 | 27.1 |
| `scn::scan_value` | 20.5 | 27.4 | 24.0 |
| `scn::scan_int` | 16.5 | 24.1 | 20.3 |
| `scn::scan_int_exhaustive_valid` | 4.08 | - | 4.08 |
| `std::stringstream` | 117 | 53.9 | 85.5 |
| `sscanf` | 71.3 | 474 | 272.7 |
| `strtol` | 16.3 | 23.8 | 20.1 |
| `std::from_chars` | 8.73 | 13.0 | 10.9 |
| `fast_float::from_chars` | 6.87 | 11.8 | 9.35 |

#### Floating-point number parsing (`double`)
![Float result, chart](benchmark/runtime/results/float.png)
| Test | Test 1 `"single"` | Test 2 `"repeated"` | Test Average |
|:-------------------------|------------------:|--------------------:|-------------:|
| `scn::scan` | 55.8 | 63.7 | 59.7 |
| `scn::scan_value` | 52.1 | 58.8 | 55.5 |
| `std::stringstream` | 294 | 271 | 283 |
| `sscanf` | 159 | 704 | 432 |
| `strtod` | 79.1 | 153 | 116 |
| `std::from_chars` | 18.0 | 28.1 | 23.0 |
| `fast_float::from_chars` | 20.6 | 27.8 | 24.2 |

#### String "word" (whitespace-separated character sequence) parsing (`string` and `string_view`)
![String result, chart](benchmark/runtime/results/string.png)
| Test | Time (ns) |
|:-------------------------------|-----:|
| `scn::scan<string>` | 24.5 |
| `scn::scan<string_view>` | 22.2 |
| `scn::scan_value<string>` | 23.1 |
| `scn::scan_value<string_view>` | 21.0 |
| `std::stringstream` | 134 |
| `sscanf` | 58.4 |

#### Conclusions
* `scn::scan` is always faster than using `stringstream`s and `sscanf`
* `std::from_chars`/`fast_float::from_chars` is faster than `scn::scan`, but it
supports fewer features
* `strtod` is slower than `scn::scan`, and supports fewer features
* `scn::scan_value` is slightly faster compared to `scn::scan`
* `scn::scan_int` is faster than both `scn::scan` and `scn::scan_value`
* `strtol` is ~on-par with `scn::scan_int`.
* `scn::scan_int_exhaustive_valid` is blazing-fast.

#### About
Above,

* "Test 1" refers to scanning a single value from a string,
  which only contains the text representation for that value.
  The time used for creating any state needed for the scanner is included,
  for example, constructing a `stringstream`. This test is called `"single"` in
  the benchmark sources.
* "Test 2" refers to the average time of scanning a value from a string
  that contains multiple values in their text representations, separated by
  spaces. The time used for creating any state needed for the scanner
  is not included. This test is called `"repeated"` in the benchmark sources.
* The string test is an exception: strings are read one after another from a
  sample of Lorem Ipsum.

The difference between "Test 1" and "Test 2" is most pronounced when using
a `stringstream`, which is relatively expensive to construct,
and seems to add around 50 ns of runtime.
With `sscanf`, it seems like using the `%n` specifier and skipping whitespace
are really expensive (~400 ns of runtime).
With `scn::scan` and `std::from_chars`, there's really no state to construct,
and the results for "Test 1" and "Test 2" are thus quite similar.

These benchmarks were run on a Fedora 40 machine, running the Linux kernel version
6.8.9, with an AMD Ryzen 7 5700X processor, and compiled with clang version 18.1.1,
with `-O3 -DNDEBUG -march=haswell` and LTO enabled.
These benchmarks were run on 2024-05-23 (commit 3fd830de).

The source code for these benchmarks can be found in the `benchmark` directory.
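
To make the `sscanf` and `std::from_chars` remarks above more concrete, here is a minimal, standard-library-only sketch of what a "repeated" read loop can look like with each of them. It is not taken from the benchmark sources: the helper names `read_ints_sscanf` and `read_ints_from_chars` are hypothetical and exist only for illustration. With `sscanf`, `%n` is how the caller learns how many characters were consumed, whereas `std::from_chars` reports the end position directly but leaves whitespace skipping to the caller.

```cpp
#include <charconv>
#include <cstdio>
#include <string_view>
#include <system_error>
#include <vector>

// Hypothetical helper: read every int from a null-terminated string
// with sscanf, using %n to find out where the next read must start.
std::vector<int> read_ints_sscanf(const char* input) {
    std::vector<int> values;
    int value = 0;
    int consumed = 0;
    // "%d" skips leading whitespace; "%n" stores the number of characters
    // consumed so far and does not count towards the return value.
    while (std::sscanf(input, "%d%n", &value, &consumed) == 1) {
        values.push_back(value);
        input += consumed;
    }
    return values;
}

// Hypothetical helper: the same loop with std::from_chars, which returns
// the end position itself but never skips whitespace.
std::vector<int> read_ints_from_chars(std::string_view input) {
    std::vector<int> values;
    const char* it = input.data();
    const char* const end = it + input.size();
    while (true) {
        // Whitespace has to be skipped by hand
        while (it != end && (*it == ' ' || *it == '\n' || *it == '\t')) {
            ++it;
        }
        int value = 0;
        auto [ptr, ec] = std::from_chars(it, end, value);
        if (ec != std::errc{}) {
            break;  // nothing parseable left (or the value was out of range)
        }
        values.push_back(value);
        it = ptr;
    }
    return values;
}
```

Called with `"123 456 789"`, both helpers return a vector holding 123, 456 and 789. In the `scn::scan` examples earlier in this README, the same bookkeeping is done by passing `result->range()` to the next call.
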
You can run these benchmarks yourself by enabling the CMake
variable `SCN_BENCHMARKS`.
This variable is `ON` by default, if `scnlib` is the root CMake project,
and `OFF` otherwise.

```sh
$ cd build
$ cmake -DSCN_BENCHMARKS=ON \
      -DCMAKE_BUILD_TYPE=Release -DCMAKE_INTERPROCEDURAL_OPTIMIZATION=ON \
      -DSCN_USE_HASWELL_ARCH=ON ..
$ cmake --build .
# choose benchmarks to run in ./benchmark/runtime/*/*_bench
$ ./benchmark/runtime/integer/scn_int_bench
```

### Executable size
All sizes below are in kibibytes (KiB), measuring the compiled executable.
"Stripped size" shows the size of the executable after running `strip`.
Lower is better.

#### Release build (`-O3 -DNDEBUG` + LTO)
![Release result, chart](benchmark/binarysize/graph-release.png)
Size of `scnlib` shared library (`.so`): 1.7M
| Method | Executable size | Stripped size |
| :------------- | --------------: | ------------: |
| empty | 7.6 | 4.4 |
| `std::scanf` | 10.4 | 5.8 |
| `std::istream` | 11.1 | 6.2 |
| `scn::input` | 11.2 | 6.4 |

#### Minimized (MinSizeRel) build (`-Os -DNDEBUG` + LTO)
![MinSizeRel result, chart](benchmark/binarysize/graph-minsizerel.png)
Size of `scnlib` shared library (`.so`): 1.1M
| Method | Executable size | Stripped size |
| :------------- | --------------: | ------------: |
| empty | 7.5 | 4.4 |
| `std::scanf` | 10.3 | 5.8 |
| `std::istream` | 11.0 | 6.1 |
| `scn::input` | 12.4 | 6.6 |

#### Debug build (`-g -O0`)
![Debug result, chart](benchmark/binarysize/graph-debug.png)
Size of `scnlib` shared library (`.so`): 20M
| Method | Executable size | Stripped size |
| :------------- | --------------: | ------------: |
| empty | 18.4 | 5.2 |
| `std::scanf` | 429 | 11.8 |
| `std::istream` | 438 | 9.4 |
| `scn::input` | 2234 | 51.3 |

#### Conclusions

When using optimized builds, depending on compiler flags, scnlib provides a
binary, the size of which is within ~5% of what would be produced with `scanf`
or `<iostream>`s.
In a Debug environment, scnlib is ~5x bigger when compared to `scanf`
or `<iostream>`. After `strip`ping the binaries,
these differences largely go away, except in Debug builds.

#### About
In these tests, 25 translation units are generated, in all of which values are
read from `stdin` five times.
This is done to simulate a small project.
`scnlib` is linked dynamically, to level the playing field with the standard
library, which is also dynamically linked.

The code was compiled on Fedora 40, with GCC 14.1.1.
See the directory `benchmark/binarysize` for the source code.

You can run these benchmarks yourself by enabling the CMake
variable `SCN_BENCHMARKS_BINARYSIZE`.
This variable is `ON` by default, if `scnlib` is the root CMake project,
and `OFF` otherwise.

```sh
$ cd build
# For Debug
$ cmake -DCMAKE_BUILD_TYPE=Debug \
      -DSCN_BENCHMARKS_BINARYSIZE=ON \
      -DBUILD_SHARED_LIBS=ON ..
# For Release and MinSizeRel,
# add -DCMAKE_BUILD_TYPE=$BUILD_TYPE and
# -DCMAKE_INTERPROCEDURAL_OPTIMIZATION=ON

$ cmake --build .
$ ./benchmark/binarysize/run_binarysize_bench.py ./benchmark/binarysize $BUILD_TYPE
```

### Build time
#### Build time
Time is in seconds of CPU time (user time + sys/kernel time).
Lower is better.

| Method | Debug | Release |
|:-------------|------:|--------:|
| empty | 0.05 | 0.05 |
| `scanf` | 0.22 | 0.20 |
| `<iostream>` | 0.28 | 0.27 |
| `scn::input` | 0.54 | 0.45 |

#### Memory consumption
Memory is in mebibytes (MiB) used while compiling.
Lower is better.

| Method | Debug | Release |
|:-------------|------:|--------:|
| empty | 21.0 | 23.3 |
| `scanf` | 56.3 | 53.6 |
| `<iostream>` | 67.8 | 65.0 |
| `scn::input` | 102 | 91.0 |

#### Conclusions
Code using scnlib takes around 2x longer to compile compared to `<iostream>`,
and also uses around 1.5x more memory.
Release builds seem to be slightly faster as compared to Debug builds.

#### About
These tests measure the time it takes to compile a binary when using different
libraries.
The time taken to compile the library itself is not taken into account
(the standard library is precompiled, anyway).

These tests were run on a Fedora 40 machine, with an AMD Ryzen 7 5700X
processor, using GCC version 14.1.1.
The compiler flags used for a Debug build were `-g`, and `-O3 -DNDEBUG` for a
Release build.

You can run these benchmarks yourself by enabling the CMake
variable `SCN_BENCHMARKS_BUILDTIME`.
This variable is `ON` by default, if `scnlib` is the root CMake project,
and `OFF` otherwise.
For these tests to work, `c++` must point to a GCC-compatible C++
compiler binary,
and a somewhat POSIX-compatible `/usr/bin/time` must be available.

```sh
$ cd build
$ cmake -DSCN_BENCHMARKS_BUILDTIME=ON ..
$ cmake --build .
$ ./benchmark/buildtime/run-buildtime-tests.sh
```

## Acknowledgements
The contents of this library are heavily influenced by {fmt} and its derivative
works.
https://github.com/fmtlib/fmt

The design of this library is also inspired by the Python `parse` library:
https://github.com/r1chardj0n3s/parse

### Third-party libraries

*fast_float* for floating-point number parsing:
https://github.com/fastfloat/fast_float

*NanoRange* for a minimal `<ranges>` implementation:
https://github.com/tcbrindle/NanoRange

## License
scnlib is licensed under the Apache License, version 2.0.
Copyright (c) 2017 Elias Kosunen
See LICENSE for further details.