Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/joker1007/embulk-input-cassandra
https://github.com/joker1007/embulk-input-cassandra
Last synced: 2 days ago
JSON representation
- Host: GitHub
- URL: https://github.com/joker1007/embulk-input-cassandra
- Owner: joker1007
- License: mit
- Created: 2020-02-09T04:28:58.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2024-05-13T19:52:13.000Z (6 months ago)
- Last Synced: 2024-10-18T08:36:42.095Z (21 days ago)
- Language: Java
- Size: 211 KB
- Stars: 0
- Watchers: 4
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
Awesome Lists containing this project
README
# Cassandra input plugin for Embulk
![Java CI](https://github.com/joker1007/embulk-input-cassandra/workflows/Java%20CI/badge.svg)## Overview
* **Plugin type**: input
* **Resume supported**: no
* **Cleanup supported**: no
* **Guess supported**: no## Caution
In current, version of netty components conflicts to one that is used by embulk-core.This probrem is very severe.
I tested this plugin on embulk-0.9.22.
But future embulk version may break this plugin.## Support Data types
| CQL Type | Embulk Type | Descritpion |
| -------- | ----------- | -------------- |
| ascii | string | use `toString` or `toJson` |
| bigint | long | |
| blob | unsupported | |
| boolean | boolean | |
| counter | long | |
| date | timestamp | |
| decimal | string | use `toString` |
| double | double | |
| float | double | |
| inet | string | use toHostAddress() |
| int | string, boolean(as 0 or 1), long, double | overflowed value is reset to 0 |
| list | json | |
| map (support only text key) | json | |
| set | json | |
| smallint | long | overflowed value is reset to 0 |
| text | string | use `toString` or `toJson` |
| time | long | use millisecond from beginning of day |
| timestamp | timestamp | long and double as epoch second |
| timeuuid | string |
| uuid | string |
| varchar | string | use `toString` or `toJson` |
| varint | long | |
| UDT, tuple | unsupported | |## Configuration
- **hosts**: list of seed hosts (list, required)
- **port**: port number for cassandra cluster (integer, default: `9042`)
- **username**: cluster username (string, default: `null`)
- **password**: cluster password (string, default: `null`)
- **cluster_name**: cluster name (string, default: `null`)
- **keyspace**: target keyspace name (string, required)
- **table**: target table name (string, required)
- **concurrency**: Thread count of fetcher (integer, default: count of processors)
- **select**: select column names (list, default: "all columns")
- **filter_by_partition_keys**: set where clause (list or list>, default: [])
- **connect_timeout**: Set connect timeout millisecond (integer, default: `5000`)
- **request_timeout**: Set each request timeout millisecond (integer, default: `12000`)Input schema is detected automatically from schema of Cassandra.
## Example
```yaml
in:
type: cassandra
keyspace: embulk_test
table: "test_basic"
concurrency: 8
```## Build
```
$ ./gradlew gem # -t to watch change of files and rebuild continuously
```