{"id":15642066,"url":"https://github.com/jkeen/comma_splice","last_synced_at":"2025-10-08T19:10:43.398Z","repository":{"id":48833639,"uuid":"201112657","full_name":"jkeen/comma_splice","owner":"jkeen","description":"Fixes CSVs with unquoted commas in values","archived":false,"fork":false,"pushed_at":"2023-03-14T11:56:46.000Z","size":86,"stargazers_count":67,"open_issues_count":2,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-07T14:48:07.666Z","etag":null,"topics":["command-line-tool","csv","csv-converter","csv-files","csv-parser","csv-reading","ruby"],"latest_commit_sha":null,"homepage":null,"language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jkeen.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-08-07T19:12:33.000Z","updated_at":"2025-09-25T06:56:13.000Z","dependencies_parsed_at":"2024-10-22T19:17:43.281Z","dependency_job_id":null,"html_url":"https://github.com/jkeen/comma_splice","commit_stats":{"total_commits":47,"total_committers":4,"mean_commits":11.75,"dds":0.3191489361702128,"last_synced_commit":"21772e3f8b60ebf2bee6f874742d41e3026fc647"},"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/jkeen/comma_splice","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jkeen%2Fcomma_splice","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jkeen%2Fcomma_splice/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jkeen%2Fcomma_splice/releases","m
anifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jkeen%2Fcomma_splice/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jkeen","download_url":"https://codeload.github.com/jkeen/comma_splice/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jkeen%2Fcomma_splice/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279000701,"owners_count":26082805,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-08T02:00:06.501Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["command-line-tool","csv","csv-converter","csv-files","csv-parser","csv-reading","ruby"],"created_at":"2024-10-03T11:54:05.660Z","updated_at":"2025-10-08T19:10:43.381Z","avatar_url":"https://github.com/jkeen.png","language":"Ruby","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Comma Splice\n\nThis gem tackles [one very specific problem](https://medium.com/@jeffkeen/how-to-correct-32-000-incorrect-csv-files-in-fewer-than-32-000-steps-a5f1ba25d951): when CSVs have commas in the values and the values haven't been quoted. 
This determines which commas separate fields and which commas are part of a value, and corrects the file.\n\nFor example, given the following CSV\n\n```\ntimestamp,artist,title,albumtitle,label\n01-27-2019 @ 12:34:00,Lester Sterling, Lynn Taitt \u0026 The Jets,Check Point Charlie,Merritone Rock Steady 3: Bang Bang Rock Steady 1966-1968,Dub Store,\n01-27-2019 @ 12:31:00,Lester Sterling,Lester Sterling Special,Merritone Rock Steady 2: This Music Got Soul 1966-1967,Dub Store,\n\n```\n\nwhich parses incorrectly as:\n\n| timestamp             | artist          | title       | albumtitle      | label                                                      |\n|-----------------------|-----------------|-------------|-----------------|------------------------------------------------------------|\n| 01-27-2019 @ 12:34:00 | Lester Sterling | Lynn Taitt \u0026 The Jets   | Check Point Charlie                                    | Merritone Rock Steady 3: Bang Bang Rock Steady 1966-1968\n| 01-27-2019 @ 12:31:00 | Lester Sterling | Lester Sterling Special | Merritone Rock Steady 2: This Music Got Soul 1966-1967 | Dub Store   |\n\n\nRunning this through `comma_splice correct /path/to/file` will return this corrected content:\n\n```\ntimestamp,artist,title,albumtitle,label\n01-27-2019 @ 12:34:00,\"Lester Sterling, Lynn Taitt \u0026 The Jets\",Check Point Charlie,Merritone Rock Steady 3: Bang Bang Rock Steady 1966-1968,Dub Store,\n01-27-2019 @ 12:31:00,Lester Sterling,Lester Sterling Special,Merritone Rock Steady 2: This Music Got Soul 1966-1967,Dub Store,\n```\n\n| timestamp             | artist          | title       | albumtitle      | label                                                      |\n|-----------------------|-----------------|-------------|-----------------|------------------------------------------------------------|\n| 01-27-2019 @ 12:34:00 | Lester Sterling, Lynn Taitt \u0026 The Jets   | Check Point Charlie | Merritone Rock Steady 3: Bang Bang Rock Steady 1966-1968 
| Dub Store |\n| 01-27-2019 @ 12:31:00 | Lester Sterling | Lester Sterling Special | Merritone Rock Steady 2: This Music Got Soul 1966-1967 | Dub Store   |\n\n\nIf it can't determine where the comma should go, it prompts you to choose from the possible options.\n\nGiven the following CSV:\n\n```\nplayid,playtype,genre,timestamp,artist,title,albumtitle,label,prepost,programtype,iswebcast,isrequest\n16851097,,,12-09-2017 @ 09:57:00,10,000 Maniacs and Michael Stipe,To Sir with Love,Campfire Songs,Rhino,post,live,y,\n16851096,,,12-09-2017 @ 09:44:00,Fran Jeffries,Mine Eyes,Fran Can Really Hang You Up the Most,Warwick,post,live,y,\n```\n\nIt prompts:\n\n```\nWhich one of these is correct?\n\n(1)  artist    : 10\n     title     : 000 Maniacs and Michael Stipe\n     albumtitle: To Sir with Love\n     label     : \"Campfire Songs,Rhino\"\n\n(2)  artist    : 10\n     title     : 000 Maniacs and Michael Stipe\n     albumtitle: \"To Sir with Love,Campfire Songs\"\n     label     : Rhino\n\n(3)  artist    : 10\n     title     : \"000 Maniacs and Michael Stipe,To Sir with Love\"\n     albumtitle: Campfire Songs\n     label     : Rhino\n\n(4)  artist    : \"10,000 Maniacs and Michael Stipe\"\n     title     : To Sir with Love\n     albumtitle: Campfire Songs\n     label     : Rhino\n```\n\nSelect the correct option (here, 4), and it returns:\n\n```\nplayid,playtype,genre,timestamp,artist,title,albumtitle,label,prepost,programtype,iswebcast,isrequest\n16851097,,,12-09-2017 @ 09:57:00,\"10,000 Maniacs and Michael Stipe\",To Sir with Love,Campfire Songs,Rhino,post,live,y,\n16851096,,,12-09-2017 @ 09:44:00,Fran Jeffries,Mine Eyes,Fran Can Really Hang You Up the Most,Warwick,post,live,y,\n```\n\n## Usage\n\nYou can use this in a Ruby program by installing the `comma_splice` gem, or you can install it on your system and use the `comma_splice` command-line utility.\n\n\n##### Return the number of bad lines in a file\n\n```ruby\n  CommaSplice::FileCorrector.new(file_path).bad_lines.size\n\n  #you can 
specify another separator\n  CommaSplice::FileCorrector.new(file_path, separator: ';').bad_lines.size\n```\n```\n  comma_splice bad_line_count /path/to/file.csv\n```\n\n##### Display the fixed contents\n```ruby\n  CommaSplice::FileCorrector.new(file_path).corrected\n  \n  #you can specify another separator\n  CommaSplice::FileCorrector.new(file_path, separator: ';').corrected\n```\n```bash\n  comma_splice correct /path/to/file.csv\n```\n\n##### Process a file and save the fixed version\n```ruby\n  CommaSplice::FileCorrector.new(file_path).save(save_path)\n  \n  #you can specify another separator\n  CommaSplice::FileCorrector.new(file_path, separator: ';').save(save_path)\n```\n```bash\n  comma_splice fix /path/to/file.csv /path/to/save\n```\n\n## Installation\n\nAdd this line to your application's Gemfile:\n\n```ruby\ngem 'comma_splice'\n```\n\nAnd then execute:\n\n    $ bundle\n\nOr install it yourself as:\n\n    $ gem install comma_splice\n\n## Development\n\nAfter checking out the repo, run `bin/setup` to install dependencies. You can also run `bin/console` for an interactive prompt that will allow you to experiment.\n\nTo install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).\n\n## Contributing\n\nBug reports and pull requests are welcome on GitHub at https://github.com/jkeen/comma_splice.\n\n## License\n\nThe gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjkeen%2Fcomma_splice","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjkeen%2Fcomma_splice","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjkeen%2Fcomma_splice/lists"}