https://github.com/kyleburton/abstract-tables
Ruby ETL Utilities
https://github.com/kyleburton/abstract-tables
Last synced: 9 months ago
JSON representation
Ruby ETL Utilities
- Host: GitHub
- URL: https://github.com/kyleburton/abstract-tables
- Owner: kyleburton
- Created: 2011-02-02T21:04:02.000Z (almost 15 years ago)
- Default Branch: master
- Last Pushed: 2011-04-06T20:32:34.000Z (almost 15 years ago)
- Last Synced: 2025-04-13T04:06:14.553Z (9 months ago)
- Language: Ruby
- Homepage:
- Size: 379 KB
- Stars: 6
- Watchers: 2
- Forks: 3
- Open Issues: 11
-
Metadata Files:
- Readme: README.textile
Awesome Lists containing this project
README
h1. Abstract Table Library
In the spirit of the standard Unix utilites (like cat, grep, cut, etc.), this library creates an abstraction over tabular data. The library implements a series of drivers for sources with different encodings that act like sequences.
h1. Command Line Utilities
h2. atcat
# dump a postgres table to the console (default is tab delimited)
atcat dbi://user:pass@Pg/localhost/db_name/table_name | atview | less
# if you create an $HOME/.abtab.dbrc (see below), you can omit the user/pass:
atcat dbi://pg/localhost/database_name/table_name
# export a table to a tab file:
atcat dbi://pg/localhost/database_name/table_name tab://table_name.tab
# you can omit the schema for files, the schema will be guessed based on the file extension
atcat dbi://pg/localhost/database_name/table_name table_name.tab
# dump to csv
atcat dbi://pg/localhost/database_name/table_name csv://table_name.csv
# you can omit the scehma
atcat dbi://pg/localhost/database_name/table_name table_name.csv
# convert from csv to tab
atcat csv://some-file.csv tab://some-file.tab
atcat some-file.csv some-file.tab
# convert form csv to pipe
atcat csv://some-file.csv 'tab://some-file.tab?col_sep=|'
h2. atview
atview tab://some-file.tab | less
atview csv://some-file.csv | less
atview dbi://pg/localhost/database_name/table_name | less
# view a list of tables:
atview dbi://pg/localhost/database_name | less
h2. atgrep
Filter a table source by passing an expression, which has access to the record via the var @r@:
atgrep -e 'r[0] == "this"' test/fixtures/files/file1.csv | atview
# ruby is mutable, so if you feel like being naughty, you can modify during the grep:
atgrep -e 'r[0] = r[0].upcase; r[0] == "THIS"' test/fixtures/files/file1.csv
# but that's what atmod is for...
h2. atcut
Filter a source, selecting specific columns.
# pull only the first and 3rd column:
bin/atcut -f 1,3 test/fixtures/files/file1.tab
h1. Suported Drivers
h2. tab
Native Ruby for now.
h3. col_sep
Defaults to a tab character.
h2. csv
Via the "FasterCSV":http://fastercsv.rubyforge.org/ ruby gem.
h3. col_sep
Override the default ',' column seperator.
h3. quote_char
Override the default quote_char (").
h2. dbi
(only tested with PostgreSQL)
h3. @$HOME/.abtab.dbrc@
With the dbi driver, you can create the file @.abtab.dbrc@ in your home directory and include the connection credentails for the databases you commonly use. It must be a YAML file, you can create it from a ruby script or irb with:
require 'yaml'
cfg = {
"localhost" => {
"database_name" => {
"user" => "username",
"pass" => "password"
}
}
}
File.open("#{ENV['HOME']}/.abtab.dbrc","w") do |f|
f.puts cfg.to_yaml
end
h1. License
h1. Authors
Kyle Burton