{"id":27636989,"url":"https://github.com/dafiliks/simpcsv","last_synced_at":"2025-04-23T21:15:43.883Z","repository":{"id":288673338,"uuid":"968657807","full_name":"dafiliks/simpcsv","owner":"dafiliks","description":"A simple, and lightweight CSV parsing library in C","archived":false,"fork":false,"pushed_at":"2025-04-21T23:57:16.000Z","size":25,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-23T21:15:40.071Z","etag":null,"topics":["c","csv","csv-parser"],"latest_commit_sha":null,"homepage":"https://github.com/dafiliks/simpcsv","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dafiliks.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-18T13:40:46.000Z","updated_at":"2025-04-21T23:57:19.000Z","dependencies_parsed_at":"2025-04-19T09:10:50.963Z","dependency_job_id":"ca890c4d-2783-4f80-b089-907cb4be1270","html_url":"https://github.com/dafiliks/simpcsv","commit_stats":null,"previous_names":["dafiliks/simpcsv"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dafiliks%2Fsimpcsv","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dafiliks%2Fsimpcsv/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dafiliks%2Fsimpcsv/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dafiliks%2Fsimpcsv/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/own
ers/dafiliks","download_url":"https://codeload.github.com/dafiliks/simpcsv/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250514777,"owners_count":21443219,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c","csv","csv-parser"],"created_at":"2025-04-23T21:15:43.120Z","updated_at":"2025-04-23T21:15:43.876Z","avatar_url":"https://github.com/dafiliks.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg height=\"75\" src=\"assets/logo.png\" alt=\"simpcsv\"/\u003e\n\u003c/p\u003e\n\n## simpcsv\n\nSimpCSV is a simple and lightweight CSV parsing library in C. 
It is designed to be easy to use and fast enough for most needs (\u003c 5 GB CSV files).\n\nThis library does not check whether:\n- The CSV file exists.\n- The CSV file is properly formatted.\n- You have enough memory to parse the CSV file.\n\nAny segmentation fault is most likely caused by one of the above.\n\n## Table of Contents\n\n*    [Example](#example)\n*    [Benchmark](#benchmark)\n*    [Docs](#docs)\n*    [Contributing](#contributing)\n*    [License](#license)\n\n## Example\n\n```c\n#include \u003cstdio.h\u003e\n// Also include the SimpCSV header from the repository.\n\nint main(void)\n{\n    SimpCSVHandle* handle = simpcsv_open_file(\"test.csv\", '\"', ',', '\\n');\n\n    simpcsv_count_rows_and_cols(handle);\n\n    for (size_t ri = 0; ri \u003c handle-\u003em_number_of_rows; ri++)\n    {\n        for (size_t ci = 0; ci \u003c handle-\u003em_number_of_cols; ci++)\n        {\n            // do something...\n        }\n    }\n\n    simpcsv_close_file(handle);\n\n    return 0;\n}\n```\n\n## Benchmark\nThe benchmark measures the time it takes to iterate over every cell in each data set on my system.\n\nMy system specs are listed below.\n\n| Part            | Info                                      |\n| --------------- | ----------------------------------------- |\n| Processor       | Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz  |\n| Installed RAM   | 8 GB (7.8 GB usable)                      |\n| SSD             | SanDisk Ultra II 480GB                    |\n| Swap partition  | 2GB                                       |\n| OS              | Arch Linux x86_64                         |\n| Kernel          | 6.14.2-arch1-1                            |\n\nThe table below lists each data set and the time it took to iterate over it.\n\n| Dataset                                                                                                                                                                      | File Size | Time   | Rows       | Columns |\n| 
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------- | ------ | ---------- | ------- |\n| [Amazon reviews](https://www.kaggle.com/datasets/kritanjalijain/amazon-reviews?select=train.csv)                                                                             | 1.59GB    | 0.5s   | 3,600,000  | 3       |\n| [Title-Based Semantic Subject Indexing](https://www.kaggle.com/datasets/hsrobo/titlebased-semantic-subject-indexing?select=pubmed.csv)                                       | 3.99GB    | 1.35s  | 12,834,027 | 4       |\n| [eCommerce behavior data](https://www.kaggle.com/datasets/mkechinov/ecommerce-behavior-data-from-multi-category-store?select=2019-Oct.csv)                                   | 5.67GB    | 10.12s | 42,448,765 | 9       |\n| [Chess matches database (Lichess)](https://www.kaggle.com/datasets/aharon1377/lichess-games-played-in-the-first-trimester-2020?select=lichess_db_standard_rated_2020-01.csv) | 6.06GB    | 11.26s | 46,737,780 | 15      |\n| [Seattle Checkouts by Title](https://www.kaggle.com/datasets/city-of-seattle/seattle-checkouts-by-title?select=checkouts-by-title.csv)                                       | 7.65GB    | 14.95s | 34,892,624 | 11      |\n\nI was unable to test larger CSV files because my system does not have enough memory.\n\nAs the times show, the library is fast on smaller files: the 3.99GB file takes 1.35s.\nOn larger files the time rises sharply on my system: the 5.67GB file takes 10.12s.\nThis is why I recommend using this library with CSV files \u003c 5GB unless your computer is more powerful than mine.\n\n## Docs\nEvery function is annotated with comments in both the header file and the source file. The comments in the header file give a brief summary of what each function does. The comments in the source file contain more detailed documentation of each function's parameters and return value. 
The example under \"Example\" also shows the general layout of a program using SimpCSV and how to iterate over every cell in a CSV file.\n\n## Contributing\nFeel free to contribute to the project and improve it. There are no strict rules regarding contribution.\n\n## License\nThe project is available under the [MIT](https://opensource.org/licenses/MIT) license.\n\n\u003csub\u003e Copyright (C) David Filiks \u003c/sub\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdafiliks%2Fsimpcsv","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdafiliks%2Fsimpcsv","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdafiliks%2Fsimpcsv/lists"}