https://github.com/enigmacurry/deepstream.io-storage-cassandra
Cassandra storage connector for deepstream.io
- Host: GitHub
- URL: https://github.com/enigmacurry/deepstream.io-storage-cassandra
- Owner: EnigmaCurry
- License: other
- Created: 2017-02-17T19:14:14.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2017-02-19T22:10:16.000Z (over 9 years ago)
- Last Synced: 2025-06-03T14:15:47.296Z (about 1 year ago)
- Topics: cassandra, deepstream
- Language: JavaScript
- Size: 26.4 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# deepstream.io-storage-cassandra
[deepstream](https://deepstream.io) storage connector for [Apache Cassandra](http://cassandra.apache.org)
Code status: trailblazer - I have not used this code in production yet.
## Configuration options
```yaml
plugins:
  storage:
    name: cassandra
    options:
      db_hosts:
        - ${CASSANDRA_HOST}
      keyspace: deepstream # This keyspace has to already exist
      # optional (specify only if you want different defaults)
      defaultTable: global
      defaultPrimaryKey:
        - name: pk
          type: text
        - name: k1
          type: text
        - name: k2
          type: text
        - name: k3
          type: text
```
* db_hosts - The initial list of Cassandra nodes for the driver to
  connect to.
* keyspace - The name of the Cassandra keyspace for deepstream to
  manage. It must be created beforehand.
* defaultTable - **optional** - The table used for records that
  don't specify a table name. Defaults to 'global'.
* defaultPrimaryKey - **optional** - The key columns to create on new
  tables. By default, all key columns are text fields. You can specify
  non-text fields if you wish, but you will then have to add extra
  validation in Valve (deepstream's permission rules) to prevent
  clients from using invalid keys.
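
The defaults described above can be sketched in JavaScript. This is a minimal illustration, not the connector's actual code: `resolveOptions` and `DEFAULT_OPTIONS` are hypothetical names, and treating `db_hosts` and `keyspace` as required is an assumption based on their not being marked optional.

```javascript
// Hypothetical sketch: apply the documented defaults to user-supplied options.
const DEFAULT_OPTIONS = {
  defaultTable: 'global',
  defaultPrimaryKey: [
    { name: 'pk', type: 'text' },
    { name: 'k1', type: 'text' },
    { name: 'k2', type: 'text' },
    { name: 'k3', type: 'text' },
  ],
};

function resolveOptions(options) {
  // Assumption: both settings are mandatory because the README does not
  // mark them as optional.
  if (!Array.isArray(options.db_hosts) || options.db_hosts.length === 0) {
    throw new Error('db_hosts must list at least one Cassandra node');
  }
  if (!options.keyspace) {
    throw new Error('keyspace is required and must already exist');
  }
  // User-supplied values override the defaults.
  return { ...DEFAULT_OPTIONS, ...options };
}

const resolved = resolveOptions({
  db_hosts: ['127.0.0.1'],
  keyspace: 'deepstream',
});
console.log(resolved.defaultTable);
```
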
## How records are mapped to Cassandra rows:
This connector decomposes the deepstream record key into a composite
key with [clustering
columns](http://cassandra.apache.org/doc/latest/cql/ddl.html#clustering-columns). This is the format:
```{table_name}/{partition_key}/{optional_1st_cluster_key}/.../{optional_nth_cluster_key}```
Data associated with the key is reduced to a JSON blob. This allows
you to query on any of the parts of the record key, but not on the
contents.
For example, a deepstream record might look like this:
* key: ```user/ryan/settings```
* data: ```{ defaultView: 'messages', allowMessages: ['admin', 'mod']}```
Cassandra would store such a record this way (assuming defaultPrimaryKey hasn't been modified):
* ```CREATE TABLE IF NOT EXISTS user (pk text, k1 text, k2 text, k3 text, data text, PRIMARY KEY (pk, k1, k2, k3));```
* ```INSERT INTO user JSON '{ "pk":"ryan", "k1":"settings", "k2":"", "k3":"", "data": /*serialized data here*/ }'```
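
To show how the `CREATE TABLE` statement relates to `defaultPrimaryKey`, here is a minimal sketch that builds it from the column list. `buildCreateTable` is a hypothetical helper, not a function from the connector, and it assumes the table and column names have already been validated as safe CQL identifiers.

```javascript
// Hypothetical sketch: derive the CREATE TABLE statement from a primary key
// definition. The first column is the partition key; the remaining ones
// become clustering columns.
function buildCreateTable(table, primaryKey) {
  const columnDefs = primaryKey.map(({ name, type }) => `${name} ${type}`);
  const keyNames = primaryKey.map(({ name }) => name);
  return (
    `CREATE TABLE IF NOT EXISTS ${table} ` +
    `(${columnDefs.join(', ')}, data text, ` +
    `PRIMARY KEY (${keyNames.join(', ')}));`
  );
}

const defaultPrimaryKey = [
  { name: 'pk', type: 'text' },
  { name: 'k1', type: 'text' },
  { name: 'k2', type: 'text' },
  { name: 'k3', type: 'text' },
];
console.log(buildCreateTable('user', defaultPrimaryKey));
```

With the default key columns, this produces the same statement as the one listed above.
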
k1, k2, and k3 are the clustering columns. A deepstream record key
does not need to specify all of the cluster keys; those that it
omits are set to a blank string '' (as is the case here with k2
and k3). If a record key specifies *more* cluster keys than exist on
the table, the extra parts spill over into the last cluster column. For
instance, the key ```user/ryan/one/two/three/four``` would look like this
in Cassandra. Note that k3, the last key column, is allowed to contain '/':

     pk   | k1  | k2  | k3         | data
    ------+-----+-----+------------+------------------
     ryan | one | two | three/four | /* serialized data */
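
The mapping rules above (blank-filling omitted cluster keys and spilling extra parts into the last column) can be sketched as follows. `decomposeKey` is a hypothetical helper written for illustration, not the connector's implementation. Keys without a table segment are deliberately rejected here, because this section does not describe how they map to `defaultTable`.

```javascript
// Hypothetical sketch: split a deepstream record key into a table name and
// the key column values of the row it maps to.
function decomposeKey(recordKey, clusterColumns = ['k1', 'k2', 'k3']) {
  const parts = recordKey.split('/');
  if (parts.length < 2) {
    // Out of scope for this sketch (see the lead-in above).
    throw new Error('expected a key of the form table/partition_key/...');
  }
  const [table, pk, ...rest] = parts;
  const row = { pk };
  clusterColumns.forEach((column, i) => {
    const isLast = i === clusterColumns.length - 1;
    if (isLast) {
      // Extra parts spill over into the last column, '/' included.
      row[column] = rest.slice(i).join('/');
    } else {
      // Omitted cluster keys become a blank string.
      row[column] = rest[i] !== undefined ? rest[i] : '';
    }
  });
  return { table, row };
}

console.log(decomposeKey('user/ryan/settings'));
console.log(decomposeKey('user/ryan/one/two/three/four'));
```
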
The client does not need to worry about these details, but
understanding how the data is stored helps you write
efficient queries. [See more examples in the connector
code](https://github.com/EnigmaCurry/deepstream.io-storage-cassandra/blob/master/src/connector.js).
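
As an illustration of what an efficient query looks like with this layout: Cassandra can serve a lookup cheaply when it restricts the partition key and then the clustering columns in order, with no gaps. The sketch below builds such a query for a prefix of the key columns. `buildPrefixQuery` is a hypothetical helper that is not part of the connector, and it assumes the default key column names and a trusted table name.

```javascript
// Hypothetical sketch: build a parameterized SELECT that restricts a leading
// prefix of the key columns (pk first, then k1, k2, ... in order).
function buildPrefixQuery(table, values, keyColumns = ['pk', 'k1', 'k2', 'k3']) {
  if (values.length === 0 || values.length > keyColumns.length) {
    throw new Error(`expected between 1 and ${keyColumns.length} key values`);
  }
  const where = values.map((_, i) => `${keyColumns[i]} = ?`).join(' AND ');
  return { query: `SELECT * FROM ${table} WHERE ${where}`, params: values };
}

// All rows stored under user/ryan/settings/...
console.log(buildPrefixQuery('user', ['ryan', 'settings']));
```

The resulting `query` and `params` could then be passed to a Cassandra client, for example `client.execute(query, params, { prepare: true })` in the DataStax `cassandra-driver` package.
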