{"id":18031771,"url":"https://github.com/dagronf/csvlib","last_synced_at":"2025-03-27T05:30:59.292Z","repository":{"id":149491686,"uuid":"152945488","full_name":"dagronf/csvlib","owner":"dagronf","description":"CSV parser for C++/Objc","archived":false,"fork":false,"pushed_at":"2022-05-31T00:52:33.000Z","size":1505,"stargazers_count":3,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-23T04:31:41.575Z","etag":null,"topics":["c-plus-plus","csv","csv-parser","objective-c","swift","tsv","tsv-parser"],"latest_commit_sha":null,"homepage":null,"language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dagronf.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-10-14T06:21:20.000Z","updated_at":"2023-12-05T05:57:02.000Z","dependencies_parsed_at":"2023-05-01T05:18:58.001Z","dependency_job_id":null,"html_url":"https://github.com/dagronf/csvlib","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dagronf%2Fcsvlib","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dagronf%2Fcsvlib/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dagronf%2Fcsvlib/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dagronf%2Fcsvlib/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dagronf","download_url":"https://codeload.github.com/dagronf/csvlib/tar.gz/refs/h
eads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245791341,"owners_count":20672665,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c-plus-plus","csv","csv-parser","objective-c","swift","tsv","tsv-parser"],"created_at":"2024-10-30T10:10:43.708Z","updated_at":"2025-03-27T05:30:59.286Z","avatar_url":"https://github.com/dagronf.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Simple CSV/TSV parser in C++\n\n## Why another CSV parser?\n\nHave you ever wanted or needed a CSV or TSV parser that:\n\n1. Doesn't depend on large libraries like Boost?\n2. Is written in C++, but can be bound to other languages like Objective-C and Swift with minimal dependencies?\n3. Handles fields containing line breaks?\n4. Supports different line endings like \\r, \\n and \\r\\n?\n5. Handles CSV files containing mixtures of quoted and unquoted fields?\n6. Handles files or strings in different encodings (using ICU)?\n7. Can choose to ignore blank lines?\n8. Offers a simple callback interface on a per-field and/or per-record basis?\n9. Offers simple progress support for user feedback?\n\nWell, here's another one!  I was surprised how many times I needed this functionality, went looking online for a good library, and then discovered that it didn't handle some aspect of the CSV 'standard'.  Or it required Boost or some other large component that I didn't want to (or couldn't) include in my codebase.\n\n## Features\n\n### Encoding\n\n* UTF8 support with no additional libraries\n* Other encoding types (eg. 
EUC-KR) by linking against ICU libraries.  Automatic and manual encoding detection support\n* Ability to skip blank lines\n* Support for comments\n\n### Different line endings\n\nSupport for `\\r`, `\\n` and `\\r\\n`, even mixed line endings in the same file or string\n\n### Embedded line breaks\n```\nFeelings, Thoughts\n\"angry\nyet hopeful\", none\n```\n\n### Embedded quotes\n\n```\nName, Address\n\"John \"\"The Boss\"\" McMillan\", No fixed address\nFred Irish 🇮🇪, No fixed address\n```\n\nAlso allows for 'bad' quotes\n\n```\ncat, \"dog\", fi\"sh\n```\n\n### Embedded Comments\n\nDefine a comment identifier (eg. `%`) to ignore lines that start with that identifier\n```\n% Climate data\n% Years 1870 to current\n1870\t1.123\t4444\n1871\t1.122\t3434\t\t\n```\n\n## Tech stuff\n\nThe library attempts to be as close to [RFC 4180](https://tools.ietf.org/html/rfc4180) as possible, with some wiggle room on certain aspects (eg. support for different line endings such as `\\r`, `\\n` along with `\\r\\n` defined in the standard).  It also provides (optional) commenting support, so if your CSV file contains comments (eg. `# Report results`) this library can skip them as needed.\n\nThe caller can define a custom column separator, so to parse a tab-separated file you can set the separator character used for parsing.\n\nColumn offset is returned for each field read, and row offset for each row, meaning that if you have structured data, you can (eg.) calculate a total for every field in column 2 of your dataset.\n\nUses C++ lambda callbacks to pass fields and/or entire records back to the caller.  You only want the first 20 records in a file?  Return `false` from one of these lambdas at any time and parsing stops.  Want to stop parsing when you find a particular text field in your input data?  You can do that too by returning `false` from the field callback.\n\nBy design, this library doesn't try to convert data to specific types as it is read.  
Fields are always UTF-8 encoded strings when they are returned to the caller.  It is up to you and your calling code to do meaningful things with the returned data.\n\nFor larger files, the parser can take a long time to complete.  This library does not provide any threading support; it relies on the caller to handle threading (ie. call the parse methods on a background thread) as needed.\n\nThis library doesn't enforce column counts or 'expected' values. If the first row in your file has 10 columns and the second has only 8, then that's what you'll get.  There are no column formatting rules; it is up to you to handle the data as it is returned.\n\nThis library is not optimized for speed (although it is pretty fast).  If you need a blindingly fast C++ CSV parser I'd suggest looking [here](https://github.com/ben-strasser/fast-cpp-csv-parser).\n\n## Support for ICU (International Components for Unicode)\n\nSee more about [ICU here](http://site.icu-project.org).\n\nIf you need to read CSV or TSV files in C++ that are (potentially) stored in a different text encoding, then this library also has you covered.  
To get basic UTF-8 functionality you don't need ICU, however if you link against the ICU libraries you can use the full functionality of ICU to read other CSV file encodings.\n\nUsing ICU allows the parser to attempt to guess the encoding of a text file automatically, or you can pass the encoding as a parameter to the `parse` call.\n\n## Examples\n\n### C++ UTF-8\n\n#### Read all records in a file, returning each field along with the complete records as the file is read \n\n```cpp\ncsv::datasource::utf8::FileDataSource input;\n\nif (!input.open(\"\u003csome-csv-file\u003e.csv\")) {\n   assert(false);\n}\n\ncsv::parse(input,\n   [](const csv::field\u0026 field) -\u003e bool {\n      // Do something with 'field'\n      return true;\n   },\n   [](const csv::record\u0026 record, double progress) -\u003e bool {\n      // Do something with 'record'\n      return true;\n   }\n);\n\n```\n\n#### Read only the first 20 records in a file\n```cpp\ncsv::datasource::utf8::FileDataSource input;\nif (!input.open(\"\u003csome-csv-file\u003e.csv\")) {\n   assert(false);\n}\n\ncsv::parse(input, NULL,\n   [](const csv::record\u0026 record, double progress) -\u003e bool {\n      // The row count starts at 0, thus the last record we want to read is 20 - 1\n      return record.row \u003c 19;\n   }\n);\n```\n\n#### Use ICU to read CSV from a file with unknown encoding (EUC-KR)\n\n(Requires linking against the appropriate ICU libraries and setting `ALLOW_ICU_EXTENSIONS` preprocessor directive)\n\n```cpp\ncsv::datasource::icu::FileParser parser;\nif (!parser.open(\"/tmp/korean.csv\", NULL)) {\n   assert(false);\n}\n\ncsv::parse(parser,\n   [](const csv::field\u0026 field) -\u003e bool {\n      std::cout \u003c\u003c \"* Field: (\" \u003c\u003c field.column \u003c\u003c \" : \" \u003c\u003c field.content \u003c\u003c \")\" \u003c\u003c std::endl;\n      return true;\n   },\n   [](const csv::record\u0026 record, double progress) -\u003e bool {\n      return true;\n   }\n);\n```\n\n\n### 
Objective-C Example\n\n#### Read CSV row data from an NSString\n\n```objc\n[DSFCSVParser parseUTF8String:@\"cat, dog, fish\\rwhale, narwhal, swordfish\"\n                fieldCallback:^BOOL(const NSUInteger row, const NSUInteger column, const NSString* field) {\n                   NSLog(@\"Field:\\n%@\", field);\n                   return YES;\n                }\n                recordCallback:^BOOL(const NSUInteger row, const NSArray\u003cNSString*\u003e* record, CGFloat progress) {\n                   NSLog(@\"Progress: %lf, Record:\\n%@\", progress, record);\n                   return YES;\n                }];\n```\n\n#### Read string data from an NSData object using CoreFoundation to infer the encoding\n\n(CoreFoundation can convert from NSData to NSString by inferring the encoding, without the need to link against ICU, which is quite nice.)\n\n```objc\nNSURL* fileURL = \u003csome file URL\u003e;\nNSData* data = [NSData dataWithContentsOfURL:fileURL];\n[DSFCSVParser parseData:data\n          fieldCallback:^BOOL(const NSUInteger row, const NSUInteger column, const NSString* field) {\n              NSLog(@\"Field:\\n%@\", field);\n              return YES;\n          }\n          recordCallback:^BOOL(const NSUInteger row, const NSArray\u003cNSString*\u003e* record, CGFloat progress) {\n              NSLog(@\"Progress: %lf, Record:\\n%@\", progress, record);\n              return YES;\n          }];\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdagronf%2Fcsvlib","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdagronf%2Fcsvlib","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdagronf%2Fcsvlib/lists"}