https://github.com/joker1007/embulk-output-cassandra
Apache Cassandra output plugin for Embulk.
- Host: GitHub
- URL: https://github.com/joker1007/embulk-output-cassandra
- Owner: joker1007
- License: mit
- Created: 2018-07-01T14:55:21.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2024-05-13T14:34:35.000Z (6 months ago)
- Last Synced: 2024-10-20T12:20:14.380Z (19 days ago)
- Language: Java
- Homepage:
- Size: 216 KB
- Stars: 1
- Watchers: 3
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
# Cassandra output plugin for Embulk
![Java CI](https://github.com/joker1007/embulk-output-cassandra/workflows/Java%20CI/badge.svg)
Apache Cassandra output plugin for Embulk.
## Compatibility
| embulk-output-cassandra | embulk | datastax-driver-core |
| --------------------- | --------------------------------------- | --------------------- |
| 0.6.x | 0.11.x or later | 4.x |
| 0.5.x | 0.9.x or later (may not work on 0.11.x) | 3.11.x |

## Breaking Changes
### 0.6.0
- `timestamp` columns accept strings in Java's `ISO_INSTANT` format.
- `timestamp` columns accept long and double values as epoch milliseconds (before: epoch seconds).
- `date` columns accept long values as days from the epoch (before: not supported).

## Overview
* **Plugin type**: output
* **Load all or nothing**: no
* **Resume supported**: yes
* **Cleanup supported**: no

## Caution
Currently, the version of the Netty components used by this plugin conflicts with the one used by embulk-core. This problem is very severe.
I tested this plugin on embulk-0.9.7, but future Embulk versions may break it.

## Supported Data Types
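The temporal conversions described in this section (strings in `ISO_INSTANT` format and epoch milliseconds for `timestamp`, days from the epoch for `date`, nanoseconds of day for `time`) map directly onto `java.time`. The following is a minimal sketch of those conversions, not the plugin's actual code: the class name `TemporalConversionSketch` and its method names are hypothetical, and how the plugin treats fractional double values is not covered here.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalTime;

// Hypothetical helper class; illustrates the documented conversions only.
public class TemporalConversionSketch {
    // string -> timestamp: Instant.parse uses DateTimeFormatter.ISO_INSTANT
    static Instant timestampFromString(String value) {
        return Instant.parse(value);
    }

    // long -> timestamp: interpreted as epoch milliseconds (0.6.0 and later)
    static Instant timestampFromLong(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis);
    }

    // long -> date: interpreted as days from 1970-01-01
    static LocalDate dateFromLong(long daysFromEpoch) {
        return LocalDate.ofEpochDay(daysFromEpoch);
    }

    // long -> time: interpreted as nanoseconds since midnight
    static LocalTime timeFromLong(long nanoOfDay) {
        return LocalTime.ofNanoOfDay(nanoOfDay);
    }

    public static void main(String[] args) {
        System.out.println(timestampFromString("2018-07-01T14:55:21Z"));
        System.out.println(timestampFromLong(1530456921000L));
        System.out.println(dateFromLong(17713L));
        System.out.println(timeFromLong(3_600_000_000_000L));
    }
}
```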
| CQL Type | Embulk Type | Description |
| -------- | ----------- | -------------- |
| ascii | string, boolean, long, double, timestamp, json | use `toString` or `toJson` |
| bigint | string, boolean(as 0 or 1), long, double | |
| blob | unsupported | |
| boolean | boolean, long, double | 0 == false, 1 == true |
| counter | unsupported | |
| date | string, long, timestamp | long as days from epoch, timestamp as UTC timestamp |
| decimal | string, boolean(as 0 or 1), long, double | |
| double | string, boolean(as 0 or 1), long, double | |
| float | string, boolean(as 0 or 1), long, double | |
| inet | string | |
| int | string, boolean(as 0 or 1), long, double | overflowed value is reset to 0 |
| list | json | |
| map (support only text key) | json | |
| set | json | |
| smallint | string, boolean(as 0 or 1), long, double | overflowed value is reset to 0 |
| text | string, boolean, long, double, timestamp, json | use `toString` or `toJson` |
| time | string, long, double, timestamp | long and double as nanoseconds of day, timestamp as UTC timestamp |
| timestamp | string, long, double, timestamp | string as Java's ISO_INSTANT format, long and double as epoch millis |
| timeuuid | null | |
| uuid | null | |
| varchar | string, boolean, long, double, timestamp, json | use `toString` or `toJson` |
| varint | string, boolean(as 0 or 1), long, double | |
| UDT | unsupported | |

## Insert Behavior
If an Embulk record does not have a column, that column is treated as `unset`.
If a record with the same key already exists, the unset column is left untouched.

### Counter table
This plugin supports counter tables, but a counter table supports only increment/decrement updates.
Because of this, the plugin uses the input value as the increment value.
For example, if the input data is `{id: 1, count: 5}`, the executed statement is `UPDATE tablename SET count = count + 5 WHERE id = 1`.
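
To make the mapping from an input record to a counter statement concrete, here is a minimal sketch that renders the CQL text for a record. It is an illustration only and not the plugin's implementation: `CounterUpdateSketch` and `buildCounterUpdate` are hypothetical names, values are inlined purely for readability (real code would normally use bound parameters through the driver), and only numeric key values are handled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; renders CQL text for illustration only.
public class CounterUpdateSketch {
    static String buildCounterUpdate(String table,
                                     Map<String, Object> keys,
                                     Map<String, Long> counters) {
        // Each counter column becomes "col = col + n" (or "col - n" for negative input).
        List<String> sets = new ArrayList<>();
        for (Map.Entry<String, Long> e : counters.entrySet()) {
            long v = e.getValue();
            String op = v < 0 ? " - " : " + ";
            sets.add(e.getKey() + " = " + e.getKey() + op + Math.abs(v));
        }
        // Each key column becomes an equality condition in the WHERE clause.
        List<String> wheres = new ArrayList<>();
        for (Map.Entry<String, Object> e : keys.entrySet()) {
            wheres.add(e.getKey() + " = " + e.getValue());
        }
        return "UPDATE " + table
                + " SET " + String.join(", ", sets)
                + " WHERE " + String.join(" AND ", wheres);
    }

    public static void main(String[] args) {
        Map<String, Object> keys = new LinkedHashMap<>();
        keys.put("id", 1);
        Map<String, Long> counters = new LinkedHashMap<>();
        counters.put("count", 5L);
        System.out.println(buildCounterUpdate("tablename", keys, counters));
    }
}
```

With the record `{id: 1, count: 5}`, the sketch produces the same statement as the example above. A negative input value is rendered as a decrement.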
## Configuration
- **hosts**: list of seed hosts (list, required)
- **port**: port number for cassandra cluster (integer, default: `9042`)
- **username**: cluster username (string, default: `null`)
- **password**: cluster password (string, default: `null`)
- **cluster_name**: cluster name (string, default: `null`)
- **keyspace**: target keyspace name (string, required)
- **table**: target table name (string, required)
- **mode**: insert or update or delete (string, default: `"insert"`)
- **if_not_exists**: Add "IF NOT EXISTS" to INSERT query (boolean, default: `false`)
- **if_exists**: Add "IF EXISTS" to UPDATE query (boolean, default: `false`)
- **ttl**: Add "TTL" to INSERT query (integer, default: `null`)
- **idempotent**: Treat INSERT query as idempotent (boolean, default: `false`)
- **connect_timeout**: Connect timeout in milliseconds (integer, default: `5000`)
- **request_timeout**: Timeout for each request in milliseconds (integer, default: `12000`)

## Example
```yaml
out:
  type: cassandra
  hosts:
    - 127.0.0.1
  port: 9042
  keyspace: sample_keyspace
  table: sample_table
  idempotent: true
```

## Build
```
$ ./gradlew gem # -t to watch change of files and rebuild continuously
```