Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/p-ranav/csv2
Fast CSV parser and writer for Modern C++
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/p-ranav/csv2
- Owner: p-ranav
- License: mit
- Created: 2020-04-17T04:06:45.000Z (about 4 years ago)
- Default Branch: master
- Last Pushed: 2023-12-23T11:17:07.000Z (6 months ago)
- Last Synced: 2024-01-31T09:08:19.773Z (5 months ago)
- Topics: blazing-fast, comma-separated-values, cpp11, csv, csv-parser, csv-reader, csv-writer, header-library, header-only, iterators, lazy-evaluation, line-reader, memory-mapped-file, mit-license, mmap, modern-cpp, single-header, single-header-lib, single-threaded, string-parsing
- Language: C++
- Homepage:
- Size: 729 KB
- Stars: 497
- Watchers: 18
- Forks: 88
- Open Issues: 19
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Lists
- awesome-cpp - csv2 - Fast CSV parser for modern C++. [MIT] (CSV)
- awesome-hpp - csv2 - Fast CSV parser and writer for Modern C++. [MIT] (Data Formats)
- awesome-cpp-cn - csv2
- fucking-awesome-cpp - csv2 - Fast CSV parser for modern C++. [MIT] (CSV)
- awesome-cpp-completed - csv2 - Fast CSV parser for modern C++. [MIT] (CSV)
README
## Table of Contents
* [CSV Reader](#csv-reader)
* [Performance Benchmark](#performance-benchmark)
* [Reader API](#reader-api)
* [CSV Writer](#csv-writer)
* [Writer API](#writer-api)
* [Compiling Tests](#compiling-tests)
* [Generating Single Header](#generating-single-header)
* [Contributing](#contributing)
* [License](#license)

## CSV Reader
```cpp
#include <csv2/reader.hpp>

int main() {
  csv2::Reader<csv2::delimiter<','>,
               csv2::quote_character<'"'>,
               csv2::first_row_is_header<true>,
               csv2::trim_policy::trim_whitespace> csv;

  if (csv.mmap("foo.csv")) {
    const auto header = csv.header();
    for (const auto row: csv) {
      for (const auto cell: row) {
        // Do something with cell value
        // std::string value;
        // cell.read_value(value);
      }
    }
  }
}
```

### Performance Benchmark
This benchmark measures the average execution time (of 5 runs after 3 warmup runs) for `csv2` to memory-map the input CSV file and iterate over every cell in the CSV. See `benchmark/main.cpp` for more details.
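The measurement scheme described above (a few untimed warmup runs, then the mean of several timed runs) can be sketched roughly as follows. This is not the code in `benchmark/main.cpp`: `average_seconds` and `count_cells` are hypothetical names, and the workload is a naive stand-in that scans an in-memory string instead of using `csv2` on a memory-mapped file.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper (not part of csv2): run `work` `warmups` times untimed,
// then `runs` times timed, and return the mean wall-clock time in seconds.
double average_seconds(const std::function<void()>& work,
                       std::size_t warmups, std::size_t runs) {
  for (std::size_t i = 0; i < warmups; ++i) {
    work();  // warmup runs are executed but not measured
  }

  double total = 0.0;
  for (std::size_t i = 0; i < runs; ++i) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    total += std::chrono::duration<double>(stop - start).count();
  }
  return runs == 0 ? 0.0 : total / static_cast<double>(runs);
}

// Stand-in workload (assumption, not csv2 behaviour): count cells by treating
// every ',' and '\n' as the end of a cell. Quoting is ignored on purpose.
std::size_t count_cells(const std::string& csv) {
  std::size_t cells = 0;
  for (const char c : csv) {
    if (c == ',' || c == '\n') {
      ++cells;
    }
  }
  // A final cell without a trailing newline still counts.
  if (!csv.empty() && csv.back() != '\n') {
    ++cells;
  }
  return cells;
}
```

With this sketch, `average_seconds(work, 3, 5)` mirrors the "5 runs after 3 warmup runs" setup; `std::chrono::steady_clock` is used because it is monotonic. The repository's own benchmark is built and run as follows: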
```bash
cd benchmark
g++ -I../include -O3 -std=c++11 -o main main.cpp
./main
```

#### System Details
| Type | Value |
| --------------- | --------------------------------------------------------------------------------------------------------- |
| Processor | 11th Gen Intel(R) Core(TM) i9-11900KF @ 3.50GHz 3.50 GHz |
| Installed RAM | 32.0 GB (31.9 GB usable) |
| SSD | [ADATA SX8200PNP](https://www.adata.com/upload/downloadfile/Datasheet_XPG%20SX8200%20Pro_EN_20181017.pdf) |
| OS | Ubuntu 20.04 LTS running on WSL in Windows 11 |
| C++ Compiler    | g++ (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0                                                                 |

#### Results (as of 23 SEP 2022)
| Dataset | File Size | Rows | Cols | Time |
|:--- | ---:| ---:| ---:| ---:|
| [Denver Crime Data](https://www.kaggle.com/paultimothymooney/denver-crime-data) | 111 MB | 479,100 | 19 | 0.102s |
| [AirBnb Paris Listings](https://www.kaggle.com/juliatb/airbnb-paris) | 196 MB | 141,730 | 96 | 0.170s |
| [2015 Flight Delays and Cancellations](https://www.kaggle.com/usdot/flight-delays) | 574 MB | 5,819,079 | 31 | 0.603s |
| [StackLite: Stack Overflow questions](https://www.kaggle.com/stackoverflow/stacklite) | 870 MB | 17,203,824 | 7 | 0.911s |
| [Used Cars Dataset](https://www.kaggle.com/austinreese/craigslist-carstrucks-data) | 1.4 GB | 539,768 | 25 | 0.947s |
| [Title-Based Semantic Subject Indexing](https://www.kaggle.com/hsrobo/titlebased-semantic-subject-indexing) | 3.7 GB | 12,834,026 | 4 | 2.867s |
| [Bitcoin tweets - 16M tweets](https://www.kaggle.com/alaix14/bitcoin-tweets-20160101-to-20190329) | 4 GB | 47,478,748 | 9 | 3.290s |
| [DDoS Balanced Dataset](https://www.kaggle.com/devendra416/ddos-datasets) | 6.3 GB | 12,794,627 | 85 | 6.963s |
| [Seattle Checkouts by Title](https://www.kaggle.com/city-of-seattle/seattle-checkouts-by-title) | 7.1 GB | 34,892,623 | 11 | 7.698s |
| [SHA-1 password hash dump](https://www.kaggle.com/urvishramaiya/have-i-been-pwnd) | 11 GB | 262,974,241 | 2 | 10.775s |
| [DOHUI NOH scaled_data](https://www.kaggle.com/seaa0612/scaled-data) | 16 GB | 496,782 | 3213 | 16.553s |

### Reader API
Here is the public API available to you:
```cpp
template <class delimiter = delimiter<','>,
          class quote_character = quote_character<'"'>,
          class first_row_is_header = first_row_is_header<true>,
          class trim_policy = trim_policy::trim_whitespace>
class Reader {
public:
  // Use this if you'd like to mmap and read from file
  bool mmap(string_type filename);

  // Use this if you have the CSV contents in std::string already
  bool parse(string_type contents);

  // Shape
  size_t rows() const;
  size_t cols() const;

  // Row iterator
  // If first_row_is_header, row iteration will start
  // from the second row
  RowIterator begin() const;
  RowIterator end() const;

  // Access the first row of the CSV
  Row header() const;
};
```

Here's the `Row` class:
```cpp
// Row class
class Row {
public:
  // Get raw contents of the row
  void read_raw_value(Container& value) const;

  // Cell iterator
  CellIterator begin() const;
  CellIterator end() const;
};
```

and here's the `Cell` class:
```cpp
// Cell class
class Cell {
public:
  // Get raw contents of the cell
  void read_raw_value(Container& value) const;

  // Get converted contents of the cell
  // Handles escaped content, e.g.,
  // """foo""" => ""foo""
  void read_value(Container& value) const;
};
```

## CSV Writer
This library also provides a basic `csv2::Writer` class that writes CSV rows to a file. Here is a basic usage example:
```cpp
#include <csv2/writer.hpp>
#include <fstream>
#include <string>
#include <vector>
using namespace csv2;

int main() {
  std::ofstream stream("foo.csv");
  Writer<delimiter<','>> writer(stream);

  std::vector<std::vector<std::string>> rows =
    {
      {"a", "b", "c"},
      {"1", "2", "3"},
      {"4", "5", "6"}
    };

  writer.write_rows(rows);
  stream.close();
}
```

### Writer API
Here is the public API available to you:
```cpp
template <class delimiter = delimiter<','>>
class Writer {
public:
  // Construct using an std::ofstream
  Writer(output_file_stream stream);

  // Use this to write a single row to file
  void write_row(container_of_strings row);

  // Use this to write a list of rows to file
  void write_rows(container_of_rows rows);
};
```

## Compiling Tests
```bash
mkdir build && cd build
cmake -DCSV2_BUILD_TESTS=ON ..
make
cd test
./csv2_test
```

## Generating Single Header
```bash
python3 utils/amalgamate/amalgamate.py -c single_include.json -s .
```

## Contributing
Contributions are welcome. Have a look at the [CONTRIBUTING.md](CONTRIBUTING.md) document for more information.

## License
The project is available under the [MIT](https://opensource.org/licenses/MIT) license.