{"id":16679757,"url":"https://github.com/buren/honey_format","last_synced_at":"2025-03-21T18:32:26.945Z","repository":{"id":45993366,"uuid":"43524988","full_name":"buren/honey_format","owner":"buren","description":"Makes working with CSVs as smooth as honey.","archived":false,"fork":false,"pushed_at":"2024-04-16T08:58:22.000Z","size":405,"stargazers_count":14,"open_issues_count":16,"forks_count":2,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-10-13T13:36:58.661Z","etag":null,"topics":["csv","ruby","rubygem"],"latest_commit_sha":null,"homepage":"","language":"Ruby","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/buren.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2015-10-01T22:40:55.000Z","updated_at":"2023-03-09T03:02:51.000Z","dependencies_parsed_at":"2024-04-16T10:00:11.233Z","dependency_job_id":null,"html_url":"https://github.com/buren/honey_format","commit_stats":{"total_commits":315,"total_committers":2,"mean_commits":157.5,"dds":"0.0031746031746031633","last_synced_commit":"f919fa098238d102880528247ebcb94e481c1cce"},"previous_names":[],"tags_count":32,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/buren%2Fhoney_format","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/buren%2Fhoney_format/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/buren%2Fhoney_format/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/buren%2Fhoney_format/manifests","owner
_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/buren","download_url":"https://codeload.github.com/buren/honey_format/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":221817547,"owners_count":16885560,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","ruby","rubygem"],"created_at":"2024-10-12T13:37:22.067Z","updated_at":"2024-10-28T10:35:45.696Z","avatar_url":"https://github.com/buren.png","language":"Ruby","readme":"# HoneyFormat [![Build Status](https://travis-ci.org/buren/honey_format.svg)](https://travis-ci.org/buren/honey_format) [![Code Climate](https://codeclimate.com/github/buren/honey_format/badges/gpa.svg)](https://codeclimate.com/github/buren/honey_format) [![Inline docs](http://inch-ci.org/github/buren/honey_format.svg)](https://www.rubydoc.info/gems/honey_format/)\n\n\u003e Makes working with CSVs as smooth as honey.\n\nProper objects for CSV headers and rows, convert column values, filter columns and rows, small(-ish) performance overhead, no dependencies other than Ruby stdlib.\n\n## Features\n\n- Proper objects for CSV header and rows\n- Convert row and header column values\n- Pass your own custom row builder\n- Filter what columns and rows are included in CSV output\n- Gracefully handle missing and duplicated header columns\n- [CLI](#cli) - Simple command line interface\n- Only ~5-10% overhead compared to plain Ruby CSV, see [benchmarks](#benchmark)\n- Has no dependencies other than Ruby stdlib\n- Supports Ruby \u003e= 2.3\n\nRead the [usage section](#usage),  
[RubyDoc](https://www.rubydoc.info/gems/honey_format/) or [examples/ directory](https://github.com/buren/honey_format/tree/master/examples)  for how to use this gem.\n\n## Quick use\n\n```ruby\ncsv_string = \u003c\u003c-CSV\nId,Username,Email\n1,buren,buren@example.com\n2,jacob,jacob@example.com\nCSV\ncsv = HoneyFormat::CSV.new(csv_string, type_map: { id: :integer })\ncsv.columns     # =\u003e [:id, :username, :email]\ncsv.rows        # =\u003e [#\u003cRow id=1, username=\"buren\", email=\"buren@example.com\"\u003e, #\u003cRow id=2, username=\"jacob\", email=\"jacob@example.com\"\u003e]\nuser = csv.rows.first\nuser.id         # =\u003e 1\nuser.username   # =\u003e \"buren\"\n\ncsv.to_csv(columns: [:id, :username]) { |row| row.id \u003c 2 }\n# =\u003e \"id,username\\n1,buren\\n\"\n```\n\n## Installation\n\nAdd this line to your application's Gemfile:\n\n```ruby\ngem 'honey_format'\n```\n\nAnd then execute:\n```\n$ bundle\n```\n\nOr install it yourself as:\n```\n$ gem install honey_format\n```\n\n## Usage\n\nBy default assumes a header in the CSV file\n\n```ruby\ncsv_string = \"Id,Username\\n1,buren\"\ncsv = HoneyFormat::CSV.new(csv_string)\n\n# Header\nheader = csv.header\nheader.original # =\u003e [\"Id\", \"Username\"]\nheader.columns  # =\u003e [:id, :username]\n\n\n# Rows\nrows = csv.rows # =\u003e [#\u003cRow id=\"1\", username=\"buren\"\u003e]\nuser = rows.first\nuser.id         # =\u003e \"1\"\nuser.username   # =\u003e \"buren\"\n```\n\nSet delimiter \u0026 quote character\n```ruby\ncsv_string = \"name;id|'John Doe';42\"\ncsv = HoneyFormat::CSV.new(\n  csv_string,\n  delimiter: ';',\n  row_delimiter: '|',\n  quote_character: \"'\",\n)\n```\n\n__Type converters__\n\n\u003e Type converters are great if you want to convert column values, like numbers and dates.\n\nThere are a bunch of [default type converters](https://github.com/buren/honey_format/blob/master/lib/honey_format/converters/converters.rb)\n```ruby\ncsv_string = \"Id,Username\\n1,buren\"\ntype_map = 
{ id: :integer }\ncsv = HoneyFormat::CSV.new(csv_string, type_map: type_map)\ncsv.rows.first.id # =\u003e 1\n```\n\nPass your own\n```ruby\ncsv_string = \"Id,Username\\n1,buren\"\ntype_map = { username: proc { |v| v.upcase } }\ncsv = HoneyFormat::CSV.new(csv_string, type_map: type_map)\ncsv.rows.first.username # =\u003e \"BUREN\"\n```\n\nCombine multiple converters\n```ruby\ncsv_string = \"Id,Username\\n1,  BuRen  \"\ntype_map = { username: [:strip, :downcase] }\ncsv = HoneyFormat::CSV.new(csv_string, type_map: type_map)\ncsv.rows.first.username # =\u003e \"buren\"\n```\n\nRegister your own converter\n```ruby\nHoneyFormat.configure do |config|\n  config.converter_registry.register :upcased, proc { |v| v.upcase }\nend\n\ncsv_string = \"Id,Username\\n1,buren\"\ntype_map = { username: :upcased }\ncsv = HoneyFormat::CSV.new(csv_string, type_map: type_map)\ncsv.rows.first.username # =\u003e \"BUREN\"\n```\n\nRemove registered converter\n```ruby\nHoneyFormat.configure do |config|\n  config.converter_registry.unregister :upcase\n  # now you're free to register your own\n  config.converter_registry.register :upcase, proc { |v| v.upcase if v }\nend\n```\n\nAccess registered converters\n```ruby\ndecimal_converter = HoneyFormat.converter_registry[:decimal]\ndecimal_converter.call('1.1') # =\u003e 1.1\n```\n\nDefault converter names\n```ruby\nHoneyFormat.config.default_converters.keys\n```\n\nSee [`Converters::DEFAULT`](https://github.com/buren/honey_format/blob/master/lib/honey_format/converters.rb) for a complete list of the default converter names.\n\n__Row builder__\n\n\u003e Pass your own row builder if you want more control of the entire row or if you want to return your own row object.\n\nCustom row builder\n```ruby\ncsv_string = \"Id,Username\\n1,buren\"\nupcaser = -\u003e(row) { row.tap { |r| r.username.upcase! 
} }\ncsv = HoneyFormat::CSV.new(csv_string, row_builder: upcaser)\ncsv.rows # =\u003e [#\u003cRow id=\"1\", username=\"BUREN\"\u003e]\n```\n\nAs long as the row builder responds to `#call` you can pass anything you like\n```ruby\nrequire 'securerandom'\n\nclass Anonymizer\n  def call(row)\n    @cache ||= {}\n    # Return an object you want to represent the row\n    row.tap do |r|\n      # given the same value make sure to return the same anonymized value every time\n      @cache[r.email] ||= \"#{SecureRandom.hex(6)}@example.com\"\n      r.email = @cache[r.email]\n      r.payment_id = '\u003cscrubbed\u003e'\n    end\n  end\nend\n\ncsv_string = \u003c\u003c~CSV\nEmail,Payment ID\nburen@example.com,123\nburen@example.com,998\nCSV\ncsv = HoneyFormat::CSV.new(csv_string, row_builder: Anonymizer.new)\ncsv.rows.to_csv(columns: [:email])\n# the hex part is random, but identical input emails map to the same value\n# =\u003e 8f6ed70a7f98@example.com\n#    8f6ed70a7f98@example.com\n```\n\n__Output CSV__\n\n\u003e Makes it super easy to output a subset of columns/rows.\n\nManipulate the rows before output\n```ruby\ncsv_string = \"Id,Username\\n1,buren\"\ncsv = HoneyFormat::CSV.new(csv_string)\ncsv.rows.each { |row| row.id = nil }\ncsv.to_csv # =\u003e \"id,username\\n,buren\\n\"\n```\n\nOutput a subset of columns\n```ruby\ncsv_string = \"Id, Username, Country\\n1,buren,Sweden\"\ncsv = HoneyFormat::CSV.new(csv_string)\ncsv.to_csv(columns: [:id, :country]) # =\u003e \"id,country\\n1,Sweden\\n\"\n```\n\nOutput a subset of rows\n```ruby\ncsv_string = \"Name, Country\\nburen,Sweden\\njacob,Denmark\"\ncsv = HoneyFormat::CSV.new(csv_string)\ncsv.to_csv { |row| row.country == 'Sweden' } # =\u003e \"name,country\\nburen,Sweden\\n\"\n```\n\n__Headers__\n\n\u003e By default generates method-like names for each header column, but also gives you full control: define them or convert them.\n\nBy default assumes a header in the CSV file.\n```ruby\ncsv_string = \"Id,Username\\n1,buren\"\ncsv = HoneyFormat::CSV.new(csv_string)\n\n# Header\nheader = 
csv.header\nheader.original # =\u003e [\"Id\", \"Username\"]\nheader.columns  # =\u003e [:id, :username]\n```\n\nDefine header\n```ruby\ncsv_string = \"1,buren\"\ncsv = HoneyFormat::CSV.new(csv_string, header: ['Id', 'Username'])\ncsv.rows.first.username # =\u003e \"buren\"\n```\n\nSet default header converter\n```ruby\nHoneyFormat.configure do |config|\n  config.header_converter = proc { |v| v.downcase }\nend\n\n# you can get the default one with\nheader_converter = HoneyFormat.converter_registry[:header_column]\nheader_converter.call('First name') # =\u003e \"first_name\"\n```\n\nUse any registered converter as the header converter\n```ruby\ncsv_string = \"Id,Username\\n1,buren\"\ncsv = HoneyFormat::CSV.new(csv_string, header_converter: :upcase)\ncsv.columns # =\u003e [:ID, :USERNAME]\n```\n\nPass your own header converter\n```ruby\n# unmapped keys use the default header converter,\n# mix simple key =\u003e value mapping with key =\u003e proc\nconverter = {\n  'First^Name' =\u003e :first_name,\n  'Username' =\u003e -\u003e { :handle }\n}\n\ncsv_string = \"ID,Username,First^Name\\n1,buren,Jacob\"\nuser = HoneyFormat::CSV.new(csv_string, header_converter: converter).rows.first\nuser.first_name # =\u003e \"Jacob\"\nuser.handle     # =\u003e \"buren\"\nuser.id         # =\u003e \"1\"\n\n# you can also pass a proc or any callable object\nconverter = Class.new do\n  define_singleton_method(:call) { |value, index| \"#{value}#{index}\" }\nend\n# or\nconverter = -\u003e(value, index) { \"#{value}#{index}\"  }\nuser = HoneyFormat::CSV.new(csv_string, header_converter: converter)\n```\n\nMissing header values are automatically set and deduplicated\n```ruby\ncsv_string = \"first,,third,third\\nval0,val1,val2,val3\"\ncsv = HoneyFormat::CSV.new(csv_string)\nuser = csv.rows.first\nuser.column1 # =\u003e \"val1\"\nuser.third   # =\u003e \"val2\"\nuser.third1  # =\u003e \"val3\"\n```\n\nDuplicated header values\n```ruby\ncsv_string = \u003c\u003c~CSV\n  email,email,name\n  
john@example.com,jane@example.com,John\nCSV\n# :deduplicate is the default value\ncsv = HoneyFormat::CSV.new(csv_string, header_deduplicator: :deduplicate)\nuser = csv.rows.first\nuser.email  # =\u003e john@example.com\nuser.email1 # =\u003e jane@example.com\n\n# you can also choose to raise an error instead\nHoneyFormat::CSV.new(csv_string, header_deduplicator: :raise)\n# =\u003e HoneyFormat::DuplicateHeaderColumnError\n```\n\nIf your header contains special chars and/or chars that can't be part of Ruby method names,\nthings can get a little awkward.\n```ruby\ncsv_string = \"ÅÄÖ\\nSwedish characters\"\nuser = HoneyFormat::CSV.new(csv_string).rows.first\n# Note that these chars aren't \"downcased\" in Ruby 2.3 and older versions of Ruby,\n# \"ÅÄÖ\".downcase # =\u003e \"ÅÄÖ\"\nuser.ÅÄÖ # =\u003e \"Swedish characters\"\n# while on Ruby \u003e 2.3\nuser.åäö\n\ncsv_string = \"First^Name\\nJacob\"\nuser = HoneyFormat::CSV.new(csv_string).rows.first\nuser.public_send(:\"first^name\") # =\u003e \"Jacob\"\n# or\nuser['first^name'] # =\u003e \"Jacob\"\n```\n\nEmoji characters\n```ruby\ncsv_string = \"😎⛷\\nEmoji characters\"\ncsv = HoneyFormat::CSV.new(csv_string)\ncsv.rows.first.😎⛷ # =\u003e Emoji characters\n```\n\n__Errors__\n\n\u003e When you need to be extra safe.\n\nIf you want to, there are some errors you can rescue\n```ruby\nbegin\n  HoneyFormat::CSV.new(csv_string)\nrescue HoneyFormat::HeaderError =\u003e e\n  puts 'there was a problem with the header'\n  raise(e)\nrescue HoneyFormat::RowError =\u003e e\n  puts 'there was a problem with a row'\n  raise(e)\nend\n```\n\nYou can see all [available errors here](https://www.rubydoc.info/gems/honey_format/HoneyFormat/Errors).\n\n__Skip lines__\n\n\u003e Skip comments and/or other unwanted lines from being parsed.\n\n```ruby\ncsv_string = \u003c\u003c~CSV\nId,Username\n1,buren\n# comment\n2,jacob\nCSV\nregexp = %r{\\A#} # Match all lines that start with \"#\"\ncsv = HoneyFormat::CSV.new(csv_string, skip_lines: 
regexp)\ncsv.rows.length # =\u003e 2\n```\n\n__Matrix__\n\n\u003e Use what's under the hood.\n\n`HoneyFormat::CSV` is actually a very thin wrapper around `HoneyFormat::Matrix`.\nYou can use `Matrix` directly; it supports all options that aren't specifically tied to parsing a CSV.\n\nExample\n```ruby\ndata = [\n  %w[name id],\n  %w[jacob 1]\n]\ntype_map = {\n  id: :integer,\n  name: :upcase\n}\n\nmatrix = HoneyFormat::Matrix.new(data, type_map: type_map)\nmatrix.columns   # =\u003e [:name, :id]\nmatrix.rows.to_a # =\u003e [#\u003cRow name=\"JACOB\", id=1\u003e]\nmatrix.to_csv    # =\u003e \"name,id\\nJACOB,1\\n\"\n```\n\nIf you want to see more usage examples check out the [`examples/`](https://github.com/buren/honey_format/tree/master/examples) and [`spec/`](https://github.com/buren/honey_format/tree/master/spec) directories and of course [on RubyDoc](https://www.rubydoc.info/gems/honey_format/).\n\n\n__SQL example__\n\nWhen you want the result as an object, with certain columns converted to objects.\n\n```ruby\nrequire 'mysql2'\n\nclass DBClient\n  def initialize(host:, username:, password:, port: 3306)\n    @client = Mysql2::Client.new(\n      host: host,\n      username: username,\n      password: password,\n      port: port\n    )\n  end\n\n  def query(sql, type_map: {})\n    result = @client.query(sql)\n    return if result.first.nil?\n\n    matrix = HoneyFormat::Matrix.new(\n      result.map(\u0026:values),\n      header: result.first.keys,\n      type_map: type_map\n    )\n    matrix.rows\n  end\nend\n```\n\nUsage example with a fictional \"users\" database table (schema: `name`, `created_at`)\n```ruby\nclient = DBClient.new(host: '127.0.0.1', username: 'root', password: nil)\nusers = client.query(\n  'SELECT * FROM users',\n  type_map: { created_at: :datetime! 
}\n)\nuser = users.first\nuser.name # =\u003e buren\nuser.created_at.class # =\u003e Time\n```\n\n## Configuration\n\nConfiguration is optional\n```ruby\nHoneyFormat.configure do |config|\n  config.header_converter = proc { |column| column.downcase }\n  config.delimiter = \";\"\n  config.row_delimiter = \"|\"\n  config.quote_character = \"'\"\n  config.skip_lines = %r{\\A#} # Match all lines that start with \"#\"\nend\n```\n\nDefault configuration values\n```ruby\nHoneyFormat.configure do |config|\n  config.header_converter = HoneyFormat::Registry.new(Converters::DEFAULT)[:header_column]\n  config.delimiter = \",\"\n  config.row_delimiter = :auto\n  config.quote_character = \"\\\"\"\n  config.skip_lines = nil\nend\n```\n\n## CLI\n\n\u003e Perfect when you want to get something simple done quickly.\n\n```\nUsage: honey_format [options] \u003cfile.csv\u003e\n        --csv=input.csv              CSV file\n        --columns=id,name            Select columns\n        --output=output.csv          CSV output (STDOUT otherwise)\n        --delimiter=,                CSV delimiter (default: ,)\n        --skip-lines=,               Skip lines that match this pattern\n        --[no-]header-only           Print only the header\n        --[no-]rows-only             Print only the rows\n    -h, --help                       How to use\n        --version                    Show version\n```\n\nOutput a subset of columns to a new file\n```\n# input.csv\nid,name,username\n1,jacob,buren\n```\n\n```\n$ honey_format input.csv --columns=id,username \u003e output.csv\n```\n\n\n## Benchmark\n\n_Note_: This gem adds some overhead to parsing a CSV string, typically ~5-10%. I've included some benchmarks below; your mileage may vary. 
The benchmarks have been run with Ruby 2.5.\n\n204KB (1k lines)\n\n```\n CSV no options:       51.0 i/s\n CSV with header:      36.1 i/s - 1.41x  slower\nHoneyFormat::CSV:      48.7 i/s - 1.05x  slower\n```\n\n2MB (10k lines)\n\n```\n  CSV no options:        5.1 i/s\n CSV with header:        3.6 i/s - 1.42x  slower\nHoneyFormat::CSV:        4.9 i/s - 1.05x  slower\n```\n\nYou can run the benchmarks yourself\n```\nUsage: bin/benchmark [file.csv] [options]\n        --csv=[file1.csv]            CSV file(s)\n        --[no-]verbose               Verbose output\n        --lines-multipliers=[1,2,10] Multiply the rows in the CSV file (default: 1)\n        --time=[30]                  Benchmark time (default: 30)\n        --warmup=[5]                 Benchmark warmup (default: 5)\n    -h, --help                       How to use\n```\n\n## Development\n\nAfter checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.\n\nTo install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).\n\n## Contributing\n\nBug reports and pull requests are welcome on GitHub at https://github.com/buren/honey_format. 
This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](https://www.contributor-covenant.org) code of conduct.\n\n\n## License\n\nThe gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).\n","funding_links":[],"categories":["Libraries \u0026 Tools"],"sub_categories":["Ruby"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fburen%2Fhoney_format","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fburen%2Fhoney_format","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fburen%2Fhoney_format/lists"}