https://github.com/Horb/aggr

An aggregation pipe.

Host: GitHub
URL: https://github.com/Horb/aggr
Owner: Horb
License: mit
Created: 2015-07-06T22:24:29.000Z (about 10 years ago)
Default Branch: master
Last Pushed: 2015-07-20T12:08:12.000Z (almost 10 years ago)
Last Synced: 2024-10-31T21:35:39.225Z (9 months ago)
Language: Python
Size: 111 KB
Stars: 5
Watchers: 2
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# aggr

An aggregation pipe.

    $ cat data.txt
    Eggs,12
    Chips,13
    Beans,14
    Eggs,21
    Chips,32
    Beans,43

    $ cat data.txt | aggr -p key,sum
    Beans,57
    Chips,45
    Eggs,33

## Why?

Answers to "How do I sum like `uniq -c`?" weren't satisfactory.

## Usage

Feed your data through a pipe or specify a file using the `-i` flag. Provide a pattern with `-p`. The pattern must have the same number of fields as your data and use the same delimiter. Possible fields for the pattern are:

* key - the field should form part of your key
* sum
* max
* min
* len - count the records
* first
* last
* any - returns True if any of the field's values are truthy, False otherwise.
* A lambda function that can be passed to `reduce`. See below.
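
The pattern semantics listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not aggr's actual implementation: `AGGREGATORS`, `to_number`, and `aggregate` are hypothetical names, plain integer parsing stands in for the fuller type inference, custom lambdas are left out, and rows are emitted sorted by key only because that matches the sample output.

```python
# Hypothetical mapping from pattern field names to Python callables.
AGGREGATORS = {
    "sum": sum,
    "max": max,
    "min": min,
    "len": len,
    "first": lambda values: values[0],
    "last": lambda values: values[-1],
    "any": any,
}


def to_number(text):
    """Return an int when the text parses as one, otherwise the text itself."""
    try:
        return int(text)
    except ValueError:
        return text


def aggregate(lines, pattern, delimiter=","):
    """Group rows by their `key` fields and reduce every other field."""
    fields = pattern.split(delimiter)
    groups = {}
    for line in lines:
        row = [to_number(value) for value in line.strip().split(delimiter)]
        if len(row) != len(fields):
            raise ValueError("pattern and data must have the same number of fields")
        # All fields marked `key` together form the (possibly composite) key.
        key = tuple(v for name, v in zip(fields, row) if name == "key")
        groups.setdefault(key, []).append(row)

    output = []
    for key in sorted(groups):
        columns = list(zip(*groups[key]))  # one tuple of values per field
        cells = []
        for name, column in zip(fields, columns):
            # Key fields are constant within a group; others are aggregated.
            cells.append(column[0] if name == "key" else AGGREGATORS[name](column))
        output.append(delimiter.join(str(cell) for cell in cells))
    return output


data = ["Eggs,12", "Chips,13", "Beans,14", "Eggs,21", "Chips,32", "Beans,43"]
for row in aggregate(data, "key,sum"):
    print(row)
```

Run against the `data.txt` rows from the top of this README, the sketch prints the same three lines as `aggr -p key,sum`.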

## Examples

Dates, times, integers and floats are inferred.

    $ cat data2.txt
    2015-05-13,Eggs,1200
    2015-05-13,Chips,1300
    2015-05-13,Chips,1300
    2015-05-13,Bean,1300
    2015-05-15,Eggs,1300
    2015-05-15,Eggs,1300
    2015-05-15,Eggs,1300
    2015-05-15,Chips,1300
    2015-05-15,Beans,1300

    $ cat data2.txt | aggr -p key,len,sum
    2015-05-13,4,5100
    2015-05-15,5,6500
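
One way such inference could work is to try each conversion in turn and fall back to the raw string. The sketch below is an assumption about the approach, not aggr's code: `infer` is a hypothetical helper, and the default format strings are guesses (the real defaults behind `--date-format`, `--time-format`, and `--datetime-format` are not documented here).

```python
from datetime import datetime


def infer(text,
          date_format="%Y-%m-%d",               # assumed default
          time_format="%H:%M:%S",               # assumed default
          datetime_format="%Y-%m-%d %H:%M:%S"):  # assumed default
    """Convert text to an int, float, datetime, date, or time when possible."""
    # Numbers first: int before float so "1200" stays an integer.
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    # Then the three temporal formats, most specific first.
    try:
        return datetime.strptime(text, datetime_format)
    except ValueError:
        pass
    try:
        return datetime.strptime(text, date_format).date()
    except ValueError:
        pass
    try:
        return datetime.strptime(text, time_format).time()
    except ValueError:
        pass
    # Nothing matched: keep the original string.
    return text


for field in "2015-05-13,Eggs,1200".split(","):
    print(repr(infer(field)))
```

With these assumed formats, the three fields of the first `data2.txt` row come back as a date, a string, and an integer respectively.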

Composite keys are supported.

    $ cat data2.txt | aggr -p key,key,sum
    2015-05-13,Bean,1300
    2015-05-13,Chips,2600
    2015-05-13,Eggs,1200
    2015-05-15,Beans,1300
    2015-05-15,Chips,1300
    2015-05-15,Eggs,3900

Custom aggregators are supported. Because lambdas use commas to separate parameters, you must switch your delimiter to something other than a comma.

    $ cat data2.txt | sed 's/,/#/g' |
    aggr -d '#' -p 'key#key#lambda acc,v: int(acc + v / 100)' |
    sed 's/#/,/g'

    2015-05-13,Bean,13
    2015-05-13,Chips,26
    2015-05-13,Eggs,12
    2015-05-15,Beans,13
    2015-05-15,Chips,13
    2015-05-15,Eggs,39
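
To see how the lambda above arrives at these numbers, here is the same fold written directly with `functools.reduce`. The initial accumulator of `0` is an assumption inferred from the output shown (a single `1200` becomes `12`, which only happens if the lambda is applied to the first value too); the variable names are illustrative.

```python
from functools import reduce

# The custom aggregator from the example pattern.
aggregator = lambda acc, v: int(acc + v / 100)

# Values of the third field for each composite key in data2.txt.
eggs_13 = [1200]
chips_13 = [1300, 1300]
eggs_15 = [1300, 1300, 1300]

# Assumed: the fold starts from 0, so every value passes through the lambda.
eggs_13_total = reduce(aggregator, eggs_13, 0)
chips_13_total = reduce(aggregator, chips_13, 0)
eggs_15_total = reduce(aggregator, eggs_15, 0)

# Without an initial value the first element is used as-is, which would not
# reproduce the README output.
chips_13_unseeded = reduce(aggregator, chips_13)

print(eggs_13_total, chips_13_total, eggs_15_total, chips_13_unseeded)
# prints: 12 26 39 1313
```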

## Install

    $ git clone http://github.com/Horb/aggr.git
    $ ln -s ~/aggr/aggr.py /usr/local/bin/aggr
    $ aggr --help
    usage: aggr [-h] -p PATTERN [-i INFILE] [-o OUTFILE] [-d FIELD_DELIMITER] [-v]
                [--datetime-format DATETIME_FORMAT] [--time-format TIME_FORMAT]
                [--date-format DATE_FORMAT]

    optional arguments:
      -h, --help            show this help message and exit
      -p PATTERN, --pattern PATTERN
      -i INFILE, --infile INFILE
      -o OUTFILE, --outfile OUTFILE
      -d FIELD_DELIMITER, --field-delimiter FIELD_DELIMITER
      -v, --verbose
      --datetime-format DATETIME_FORMAT
      --time-format TIME_FORMAT
      --date-format DATE_FORMAT
