Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/nullscreen/uncsv
A parser for unruly CSVs
- Host: GitHub
- URL: https://github.com/nullscreen/uncsv
- Owner: nullscreen
- License: mit
- Created: 2018-12-07T23:41:06.000Z (about 6 years ago)
- Default Branch: main
- Last Pushed: 2022-09-10T03:15:27.000Z (over 2 years ago)
- Last Synced: 2024-10-11T01:55:58.276Z (3 months ago)
- Topics: csv, parser, ruby
- Language: Ruby
- Size: 66.4 KB
- Stars: 0
- Watchers: 4
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE.txt
# Uncsv
[![Gem Version](https://badge.fury.io/rb/uncsv.svg)](https://badge.fury.io/rb/uncsv)
[![CI](https://github.com/nullscreen/uncsv/workflows/CI/badge.svg)](https://github.com/nullscreen/uncsv/actions?query=workflow%3ACI+branch%3Amain)
[![Codacy Badge](https://app.codacy.com/project/badge/Grade/5c70b27a9f874f34b522c3b6589b1266)](https://www.codacy.com/gh/nullscreen/uncsv/dashboard)
[![Code Coverage](https://codecov.io/gh/nullscreen/uncsv/branch/main/graph/badge.svg?token=PVK51XUVJB)](https://codecov.io/gh/nullscreen/uncsv)

A parser for unruly CSVs

Parse CSVs with hierarchical headers and duplicated headers. Skip lines by line
number, etc.

## Documentation

Read below to get started, or see the [API Documentation][api-docs] for more
details.

[api-docs]: https://www.rubydoc.info/github/nullscreen/uncsv

## Installation
Add this line to your application's Gemfile:
```ruby
gem 'uncsv'
```

And then execute:

```sh
bundle
```

Or install it yourself as:

```sh
gem install uncsv
```

## Usage

Reading a CSV with Uncsv is similar to using Ruby's built-in CSV class. Create
a new instance of `Uncsv` and pass it a `String` or `IO`. The second argument
is an options hash, described below.

```ruby
require 'uncsv'

data = "A,B,C\n1,2,3"
csv = Uncsv.new(data, header_rows: 0)
csv.map { |row| row['B'] }
```

### Opening a File

Uncsv can read directly from the filesystem with the `open` method.
```ruby
Uncsv.open('my_data.csv')
```

### Enumerable Methods

Uncsv is an `Enumerable`. All enumerable methods like `each`, `map`, `reduce`,
etc. are supported.

```ruby
data = "A,B,C\n1,2,3\n4,5,6"
csv = Uncsv.new(data, header_rows: 0)
# Start the sum at 0 and convert each field, since parsed cells are text
c_total = csv.reduce(0) { |sum, row| sum + row['C'].to_i }
```

### Options

The following options can be passed as a hash to the second argument of the
Uncsv constructor, or set inside the constructor block.

```ruby
Uncsv.new(data, skip_blanks: true)

# Is equivalent to
Uncsv.new(data) do |config|
  config.skip_blanks = true
end
```

#### Uncsv Options

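As a rough picture of what the header-related options in this section describe (`:expand_headers`, multiple `:header_rows` joined by `:header_separator`, and `:unique_headers`), here is a minimal sketch in plain Ruby. It works on arrays of header cells and is not Uncsv's implementation; the method names `expand`, `join_headers`, and `uniquify` are hypothetical and exist only for illustration.

```ruby
# Illustrative only: plain Ruby on arrays, not Uncsv's internals.
# All method names below are hypothetical.

# :expand_headers - a blank cell takes the value of the nearest
# filled cell to its left.
def expand(row)
  last = nil
  row.map { |cell| cell.nil? || cell.empty? ? last : (last = cell) }
end

# Multiple :header_rows - join each column's cells, top to bottom,
# with the separator.
def join_headers(rows, separator = '.')
  rows.transpose.map { |parts| parts.join(separator) }
end

# :unique_headers - append an index only to headers that occur more
# than once.
def uniquify(headers, separator = '.')
  counts = headers.each_with_object(Hash.new(0)) { |h, acc| acc[h] += 1 }
  seen = Hash.new(0)
  headers.map do |header|
    next header if counts[header] == 1

    index = seen[header]
    seen[header] += 1
    "#{header}#{separator}#{index}"
  end
end

top    = ['Personal', nil, 'Work']
bottom = %w[Name Email Email]

headers = join_headers([expand(top), bottom])
# headers == ["Personal.Name", "Personal.Email", "Work.Email"]

unique = uniquify(%w[Name Name Id])
# unique == ["Name.0", "Name.1", "Id"]
```

The sketch expands the top header row first so that every column has a parent before the rows are joined, which mirrors how the options are described here rather than any confirmed ordering inside the gem.
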
- `:expand_headers`: Default `false`. If set to `true`, blank header row cells
will assume the header of the cell to their left. This is useful for
hierarchical headers where not all the header cells are filled in. If set to
an array of header indexes, only the specified headers will be expanded.
- `:header_rows`: Default `[]`. Can be set to either a single row index or an
array of row indexes. For example, it could be set to `0` to indicate a
header in the first row. If set to an array of indexes (`[1,2]`), the header
row text will be joined by the `:header_separator`. For example, if the
cell (0,0) had the value `"Personal"` and cell (1,0) had the value `"Name"`,
the header would become `"Personal.Name"`. Any data above the last header row
will be ignored.
- `:header_separator`: Default `"."`. When using multiple header rows, this is
a string used to separate the individual header fields.
- `:nil_empty`: Default `true`. If `true`, empty cells will be set to `nil`,
otherwise, they are set to an empty string.
- `:normalize_headers`: Default `false`. If set to `true`, header field text
will be normalized. The text will be lowercased, and non-alphanumeric
characters will be replaced with underscores (`_`). If set to a string,
those characters will be replaced with the string instead. If set to a hash,
the hash will be treated as options to KeyNormalizer, accepting the
`:separator`, and `:downcase` options. If set to another object, it is
expected to respond to the `normalize(key)` method by returning a normalized
string.
- `:skip_blanks`: Default `false`. If `true`, rows whose fields are all empty
will be skipped.
- `:skip_rows`: Default `[]`. If set to an array of row indexes, those rows
will be skipped. This option does not apply to header rows.
- `:unique_headers`: Default `false`. If set to `true`, headers will be forced
to be unique by appending numbers to duplicates. For example, if two header
cells have the text `"Name"`, the headers will become `"Name.0"`, and
`"Name.1"`. The separator between the text and the number can be set using
the `:header_separator` option.

#### Options from Std-lib CSV

See the documentation for Ruby's built-in `CSV` class for the following
options.

- `:col_sep`
- `:field_size_limit`
- `:quote_char`
- `:row_sep`
- `:skip_blanks`

## Development

After checking out the repo, run `bundle` to install dependencies. You
can also run `bin/console` for an interactive prompt that will allow you to
experiment.

To check your work, run `bin/rspec` to run the tests and `bin/rubocop` to check
style. To generate a code coverage report, set the `COVERAGE` environment
variable when running the tests.

```sh
COVERAGE=1 bin/rspec
bin/rubocop
```

## Contributing

Bug reports and pull requests are welcome on GitHub at
https://github.com/nullscreen/uncsv.