{"id":20051911,"url":"https://github.com/p-ranav/csv2","last_synced_at":"2025-04-04T10:08:04.768Z","repository":{"id":38214981,"uuid":"256397891","full_name":"p-ranav/csv2","owner":"p-ranav","description":"Fast CSV parser and writer for Modern C++","archived":false,"fork":false,"pushed_at":"2023-12-23T11:17:07.000Z","size":746,"stargazers_count":586,"open_issues_count":20,"forks_count":101,"subscribers_count":18,"default_branch":"master","last_synced_at":"2025-03-28T09:07:10.023Z","etag":null,"topics":["blazing-fast","comma-separated-values","cpp11","csv","csv-parser","csv-reader","csv-writer","header-library","header-only","iterators","lazy-evaluation","line-reader","memory-mapped-file","mit-license","mmap","modern-cpp","single-header","single-header-lib","single-threaded","string-parsing"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/p-ranav.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2020-04-17T04:06:45.000Z","updated_at":"2025-03-26T09:09:18.000Z","dependencies_parsed_at":"2024-01-31T09:03:21.176Z","dependency_job_id":null,"html_url":"https://github.com/p-ranav/csv2","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/p-ranav%2Fcsv2","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/p-ranav%2Fcsv2/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/p-ranav%2Fcsv2/releases","manifests_url":"https://repos.ecosyste.ms/a
pi/v1/hosts/GitHub/repositories/p-ranav%2Fcsv2/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/p-ranav","download_url":"https://codeload.github.com/p-ranav/csv2/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247157283,"owners_count":20893220,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["blazing-fast","comma-separated-values","cpp11","csv","csv-parser","csv-reader","csv-writer","header-library","header-only","iterators","lazy-evaluation","line-reader","memory-mapped-file","mit-license","mmap","modern-cpp","single-header","single-header-lib","single-threaded","string-parsing"],"created_at":"2024-11-13T12:07:49.904Z","updated_at":"2025-04-04T10:08:04.743Z","avatar_url":"https://github.com/p-ranav.png","language":"C++","readme":"\u003cp align=\"center\"\u003e\n  \u003cimg height=\"75\" src=\"img/logo.png\" alt=\"csv2\"/\u003e\n\u003c/p\u003e\n\n## Table of Contents\n\n*    [CSV Reader](#csv-reader)\n     *    [Performance Benchmark](#performance-benchmark)\n     *    [Reader API](#reader-api)\n*    [CSV Writer](#csv-writer)\n     *    [Writer API](#writer-api)\n*    [Compiling Tests](#compiling-tests)\n*    [Generating Single Header](#generating-single-header)\n*    [Contributing](#contributing)\n*    [License](#license)\n\n## CSV Reader\n\n```cpp\n#include \u003ccsv2/reader.hpp\u003e\n\nint main() {\n  csv2::Reader\u003ccsv2::delimiter\u003c','\u003e, \n               csv2::quote_character\u003c'\"'\u003e, \n               csv2::first_row_is_header\u003ctrue\u003e,\n              
 csv2::trim_policy::trim_whitespace\u003e csv;\n               \n  if (csv.mmap(\"foo.csv\")) {\n    const auto header = csv.header();\n    for (const auto row: csv) {\n      for (const auto cell: row) {\n        // Do something with cell value\n        // std::string value;\n        // cell.read_value(value);\n      }\n    }\n  }\n}\n```\n\n### Performance Benchmark\n\nThis benchmark measures the average execution time (of 5 runs after 3 warmup runs) for `csv2` to memory-map the input CSV file and iterate over every cell in the CSV. See `benchmark/main.cpp` for more details.\n\n```bash\ncd benchmark\ng++ -I../include -O3 -std=c++11 -o main main.cpp\n./main \u003ccsv_file\u003e\n```\n\n#### System Details\n\n| Type            | Value                                                                                                     |\n| --------------- | --------------------------------------------------------------------------------------------------------- |\n| Processor       | 11th Gen Intel(R) Core(TM) i9-11900KF @ 3.50GHz   3.50 GHz                                                |\n| Installed RAM   | 32.0 GB (31.9 GB usable)                                                                                  |\n| SSD             | [ADATA SX8200PNP](https://www.adata.com/upload/downloadfile/Datasheet_XPG%20SX8200%20Pro_EN_20181017.pdf) |\n| OS              | Ubuntu 20.04 LTS running on WSL in Windows 11                                                             |\n| C++ Compiler    | g++ (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0                                                                 |\n\n#### Results (as of 23 SEP 2022)\n\n| Dataset | File Size | Rows | Cols | Time |\n|:---     |       ---:|  ---:|  ---:|  ---:|\n| [Denver Crime Data](https://www.kaggle.com/paultimothymooney/denver-crime-data) | 111 MB | 479,100 | 19 | 0.102s |\n| [AirBnb Paris Listings](https://www.kaggle.com/juliatb/airbnb-paris) | 196 MB | 141,730 | 96 | 0.170s |\n| [2015 Flight Delays 
and Cancellations](https://www.kaggle.com/usdot/flight-delays) | 574 MB | 5,819,079 | 31 | 0.603s |\n| [StackLite: Stack Overflow questions](https://www.kaggle.com/stackoverflow/stacklite) | 870 MB | 17,203,824 | 7 | 0.911s |\n| [Used Cars Dataset](https://www.kaggle.com/austinreese/craigslist-carstrucks-data) | 1.4 GB | 539,768 | 25 | 0.947s |\n| [Title-Based Semantic Subject Indexing](https://www.kaggle.com/hsrobo/titlebased-semantic-subject-indexing) | 3.7 GB | 12,834,026 | 4 | 2.867s |\n| [Bitcoin tweets - 16M tweets](https://www.kaggle.com/alaix14/bitcoin-tweets-20160101-to-20190329) | 4 GB | 47,478,748 | 9 | 3.290s |\n| [DDoS Balanced Dataset](https://www.kaggle.com/devendra416/ddos-datasets) | 6.3 GB | 12,794,627 | 85 | 6.963s |\n| [Seattle Checkouts by Title](https://www.kaggle.com/city-of-seattle/seattle-checkouts-by-title) | 7.1 GB | 34,892,623 | 11 | 7.698s |\n| [SHA-1 password hash dump](https://www.kaggle.com/urvishramaiya/have-i-been-pwnd) | 11 GB | 262,974,241 | 2 | 10.775s |\n| [DOHUI NOH scaled_data](https://www.kaggle.com/seaa0612/scaled-data) | 16 GB | 496,782 | 3213 | 16.553s |\n\n### Reader API\n\nHere is the public API available to you:\n\n```cpp\ntemplate \u003cclass delimiter = delimiter\u003c','\u003e, \n          class quote_character = quote_character\u003c'\"'\u003e,\n          class first_row_is_header = first_row_is_header\u003ctrue\u003e,\n          class trim_policy = trim_policy::trim_whitespace\u003e\nclass Reader {\npublic:\n  \n  // Use this if you'd like to mmap and read from file\n  bool mmap(string_type filename);\n\n  // Use this if you already have the CSV contents in a std::string\n  bool parse(string_type contents);\n\n  // Shape\n  size_t rows() const;\n  size_t cols() const;\n  \n  // Row iterator\n  // If first_row_is_header, row iteration will start\n  // from the second row\n  RowIterator begin() const;\n  RowIterator end() const;\n\n  // Access the first row of the CSV\n  Row header() const;\n};\n```\n\nHere's the `Row` 
class:\n\n```cpp\n// Row class\nclass Row {\npublic:\n  // Get raw contents of the row\n  void read_raw_value(Container\u0026 value) const;\n  \n  // Cell iterator\n  CellIterator begin() const;\n  CellIterator end() const;\n};\n```\n\nAnd here's the `Cell` class:\n\n```cpp\n// Cell class\nclass Cell {\npublic:\n  // Get raw contents of the cell\n  void read_raw_value(Container\u0026 value) const;\n  \n  // Get converted contents of the cell\n  // Handles escaped content, e.g., \n  // \"\"\"foo\"\"\" =\u003e \"\"foo\"\"\n  void read_value(Container\u0026 value) const;\n};\n```\n\n## CSV Writer\n\nThis library also provides a basic `csv2::Writer` class that writes CSV rows to a file. Here is a basic usage example:\n\n```cpp\n#include \u003ccsv2/writer.hpp\u003e\n#include \u003cfstream\u003e\n#include \u003cvector\u003e\n#include \u003cstring\u003e\nusing namespace csv2;\n\nint main() {\n    std::ofstream stream(\"foo.csv\");\n    Writer\u003cdelimiter\u003c','\u003e\u003e writer(stream);\n\n    std::vector\u003cstd::vector\u003cstd::string\u003e\u003e rows = \n        {\n            {\"a\", \"b\", \"c\"},\n            {\"1\", \"2\", \"3\"},\n            {\"4\", \"5\", \"6\"}\n        };\n\n    writer.write_rows(rows);\n    stream.close();\n}\n```\n\n### Writer API\n\nHere is the public API available to you:\n\n```cpp\ntemplate \u003cclass delimiter = delimiter\u003c','\u003e\u003e\nclass Writer {\npublic:\n  \n  // Construct using a std::ofstream\n  Writer(output_file_stream stream);\n\n  // Use this to write a single row to file\n  void write_row(container_of_strings row);\n\n  // Use this to write a list of rows to file\n  void write_rows(container_of_rows rows);\n};\n```\n\n## Compiling Tests\n\n```bash\nmkdir build \u0026\u0026 cd build\ncmake -DCSV2_BUILD_TESTS=ON ..\nmake\ncd test\n./csv2_test\n```\n\n## Generating Single Header\n\n```bash\npython3 utils/amalgamate/amalgamate.py -c single_include.json -s .\n```\n\n## Contributing\nContributions are welcome; have a look at the 
[CONTRIBUTING.md](CONTRIBUTING.md) document for more information.\n\n## License\nThe project is available under the [MIT](https://opensource.org/licenses/MIT) license.\n","funding_links":[],"categories":["CSV","Uncategorized","Data Formats"],"sub_categories":["Uncategorized"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fp-ranav%2Fcsv2","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fp-ranav%2Fcsv2","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fp-ranav%2Fcsv2/lists"}