https://github.com/castle/avro-filter

Tool for selecting records from Avro files using a filter

## avro-filter

Reads Avro files and writes all records that match a filter expression to a new Avro file.

Filters are specified with the `-f` parameter as comma-separated `key=value` pairs, e.g.:

```bash
java -jar avro-filter.jar -o out.avro -f user_id=1,status=failed transactions.avro
```
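
To make the filter format concrete, here is a minimal sketch of how an expression such as `user_id=1,status=failed` could be interpreted. It is not taken from the avro-filter source: the class `FilterSketch` and its methods `parse` and `matches` are hypothetical, records are modelled as plain maps rather than Avro records, and it assumes that all conditions must hold (logical AND) and that values are compared by their string form.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch only: not the actual avro-filter implementation.
final class FilterSketch {

    // Parse "k1=v1,k2=v2" into an ordered map of field name -> expected value.
    static Map<String, String> parse(String expression) {
        Map<String, String> conditions = new LinkedHashMap<>();
        for (String pair : expression.split(",")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Expected key=value, got: " + pair);
            }
            conditions.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return conditions;
    }

    // Assumption: a record matches only if every condition holds (logical AND),
    // comparing each field's string form with the expected value.
    static boolean matches(Map<String, Object> record, Map<String, String> conditions) {
        for (Map.Entry<String, String> condition : conditions.entrySet()) {
            Object actual = record.get(condition.getKey());
            if (actual == null || !actual.toString().equals(condition.getValue())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> filter = parse("user_id=1,status=failed");

        // A record stand-in; a real implementation would read Avro records instead.
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("user_id", 1);
        record.put("status", "failed");

        System.out.println(matches(record, filter));
    }
}
```

Under these assumptions, a record with `user_id` 1 and `status` `failed` would be written to the output file, while a record that differs in either field would be skipped.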

```bash
$ java -jar avro-filter.jar --help
avro-filter 0.1
Usage: avro-filter [options] ...

  -f, --filter k1=v1,k2=v2...
                     filter expression, eg. user_id=1
  -o, --out          output file
  -s, --schema       optional schema to use when reading
  --help             prints this usage text
  ...                input file(s)
```

## TODO

- [ ] Handle multiple input files
- [ ] Split the output file into configurable chunks (max size)
- [ ] Configurable compression options

## Development

Run with:

```bash
sbt "run -o out.avro -f user_id=1 -s schema.avro input.avro"
```

## Build

Build a JAR containing all dependencies:

```bash
sbt assembly
```