https://github.com/mridang/jacksandra

A Jackson-based automatic schema-generator for Cassandra
Host: GitHub
URL: https://github.com/mridang/jacksandra
Owner: mridang
License: apache-2.0
Created: 2021-04-25T09:19:09.000Z (almost 4 years ago)
Default Branch: main
Last Pushed: 2022-07-28T17:34:15.000Z (over 2 years ago)
Last Synced: 2024-11-04T16:44:28.636Z (3 months ago)
Topics: cassandra, cql, jackson, schema-gen
Language: Scala
Homepage:
Size: 360 KB
Stars: 2
Watchers: 1
Forks: 0
Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE

# Jacksandra

Jacksandra (aptly named) is a Jackson-based module for working with Cassandra rows. At the time of writing, Jacksandra can only be used for schema generation using Jackson, but a future version will add support for serializing and deserializing types to and from the corresponding "row" objects.

## Installation

Unfortunately, Jacksandra is not available in any public Maven repository except the GitHub Package Registry. For more information on how to install packages from the GitHub Package Registry, [see the GitHub docs].

## Usage

As the Datastax Mappers provide no support for schema generation, this module is intended as a drop-in addition that augments the functionality provided by the existing Datastax libraries.

Assume you have the following bean for which you would like to generate the schema:

```java
import java.util.Date;

import com.datastax.oss.driver.api.mapper.annotations.CqlName;
import com.datastax.oss.driver.api.mapper.annotations.Entity;
import com.datastax.oss.driver.api.mapper.annotations.PartitionKey;
import com.mridang.jacksandra.annotations.OrderedClusteringColumn;
import com.mridang.jacksandra.types.FrozenList;

@Entity(defaultKeyspace = "mykeyspace")
@CqlName("brandsimilarities")
public class BrandSimilarities {

    // First (and only) component of the partition key.
    @SuppressWarnings("DefaultAnnotationParam")
    @PartitionKey(0)
    @CqlName("brand")
    public String brand;

    // First clustering column, sorted in descending order.
    @SuppressWarnings("DefaultAnnotationParam")
    @OrderedClusteringColumn(isAscending = false, value = 0)
    @CqlName("createdat")
    public Date createdAt;

    // Second clustering column, sorted in ascending order.
    @OrderedClusteringColumn(isAscending = true, value = 1)
    @CqlName("skuid")
    public String skuId;

    @CqlName("related")
    public FrozenList<Relation> productRelations;

    // Nested bean that is mapped to a user-defined type.
    @CqlName("relation")
    public static class Relation {

        @CqlName("brand")
        public String brand;

        @CqlName("skuid")
        public String skuId;

        @CqlName("score")
        public Float score;
    }
}
```

To generate the schema for the bean described above, you would run:

```scala
val mapper = new CassandraJavaBeanMapper[BrandSimilarities]()
val createSchema: String = mapper.generateMappingProperties
```

to yield the DDL:

```sql
CREATE
  TYPE
IF NOT
EXISTS relation
     ( score FLOAT
     , skuid TEXT
     , brand TEXT
     );

CREATE
 TABLE
IF NOT
EXISTS brandsimilarities
     ( brand TEXT
     , shardkey TINYINT
     , createdat TEXT
     , skuid TEXT
     , related LIST<FROZEN<relation>>
     , PRIMARY KEY
       ( ( brand
         , shardkey
         )
       , createdat
       , skuid
       )
     )
  WITH CLUSTERING
 ORDER
    BY
     ( createdat DESC
     , skuid ASC
     );
```

### Types

The following exhaustive list outlines all the CQL types along with their JVM counterparts.

| CQL         | Java                                        | Scala |
|-------------|---------------------------------------------|-------|
| `BOOLEAN`   | `java.lang.Boolean`                         |       |
| `TEXT`      | `java.lang.String`                          |       |
| `VARCHAR`   | `java.lang.String`                          |       |
| `FLOAT`     | `java.lang.Float`                           |       |
| `DOUBLE`    | `java.lang.Double`                          |       |
| `BIGINT`    | `java.lang.Long`                            |       |
| `INT`       | `java.lang.Integer`                         |       |
| `SMALLINT`  | `java.lang.Short`                           |       |
| `TINYINT`   | `java.lang.Byte`                            |       |
| `VARINT`    | `java.math.BigInteger`                      |       |
| `DECIMAL`   | `java.math.BigDecimal`                      |       |
| `ASCII`     | `com.mridang.jacksandra.types.CqlAscii`     |       |
| `INET`      | `java.net.InetAddress`                      |       |
| `UUID`      | `java.util.UUID`                            |       |
| `TIMEUUID`  | `com.mridang.jacksandra.types.CqlTimeUUID`  |       |
| `DURATION`  | `com.mridang.jacksandra.types.CqlDuration`  |       |
|             | `java.time.Duration`                        |       |
| `BLOB`      | `com.mridang.jacksandra.types.CqlBlob`      |       |
| `DATE`      | `java.time.LocalDate`                       |       |
| `TIME`      | `java.time.LocalTime`                       |       |
| `TIMESTAMP` | `java.sql.Timestamp`                        |       |
|             | `java.time.Instant`                         |       |
|             | `java.time.LocalDateTime`                   |       |
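As a rough illustration of how the table above can be read, the following Java sketch encodes a subset of its JDK-type rows as a lookup from Java class to CQL type name. The `CqlTypeLookup` class and its `cqlTypeOf` method are hypothetical names invented for this example and are not part of Jacksandra's API. The Jacksandra wrapper types and `java.sql.Timestamp` are left out so that the sketch depends only on the core JDK.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.net.InetAddress;
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

// Hypothetical helper for illustration only; not Jacksandra's implementation.
final class CqlTypeLookup {

    private static final Map<Class<?>, String> CQL_TYPES = new HashMap<>();

    static {
        CQL_TYPES.put(Boolean.class, "BOOLEAN");
        // The table lists String for both TEXT and VARCHAR; this sketch picks TEXT.
        CQL_TYPES.put(String.class, "TEXT");
        CQL_TYPES.put(Float.class, "FLOAT");
        CQL_TYPES.put(Double.class, "DOUBLE");
        CQL_TYPES.put(Long.class, "BIGINT");
        CQL_TYPES.put(Integer.class, "INT");
        CQL_TYPES.put(Short.class, "SMALLINT");
        CQL_TYPES.put(Byte.class, "TINYINT");
        CQL_TYPES.put(BigInteger.class, "VARINT");
        CQL_TYPES.put(BigDecimal.class, "DECIMAL");
        CQL_TYPES.put(InetAddress.class, "INET");
        CQL_TYPES.put(UUID.class, "UUID");
        CQL_TYPES.put(Duration.class, "DURATION");
        CQL_TYPES.put(LocalDate.class, "DATE");
        CQL_TYPES.put(LocalTime.class, "TIME");
        CQL_TYPES.put(Instant.class, "TIMESTAMP");
        CQL_TYPES.put(LocalDateTime.class, "TIMESTAMP");
    }

    private CqlTypeLookup() {
    }

    // Returns the CQL type name for an exact Java class, or empty if the class is not listed.
    static Optional<String> cqlTypeOf(Class<?> javaType) {
        return Optional.ofNullable(CQL_TYPES.get(javaType));
    }

    public static void main(String[] args) {
        System.out.println(cqlTypeOf(Long.class).orElse("unsupported"));   // prints BIGINT
        System.out.println(cqlTypeOf(Object.class).orElse("unsupported")); // prints unsupported
    }
}
```

The lookup matches on the exact class, so subclasses of a listed type are not resolved; a real mapper would likely need a more forgiving strategy.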

### Collections

Jacksandra supports all collection types including the "frozen" variants. Any property that derives from `java.util.Collection` will be mapped as a `LIST` data type. If you require a "frozen" representation, use any collection type that implements the `Frozen` interface.

#### Lists

When using Java: any property that derives from `java.util.List` will be mapped as a `LIST` data type. If you require a "frozen" representation, use `FrozenList` when possible.

When using Scala: any property that derives from `scala.collection.mutable.List` will be mapped as a `LIST` data type. If you require a "frozen" representation, use `scala.collection.immutable.List` when possible.

#### Sets

When using Java: any property that derives from `java.util.Set` will be mapped as a `SET` data type. If you require a "frozen" representation, use `FrozenSet` when possible.

When using Scala: any property that derives from `scala.collection.mutable.Set` will be mapped as a `SET` data type. If you require a "frozen" representation, use `scala.collection.immutable.Set` when possible.

#### Maps

When using Java: any property that derives from `java.util.Map` will be mapped as a `MAP` data type.

When using Scala: any property that derives from `scala.collection.mutable.Map` will be mapped as a `MAP` data type.
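The Java rules described for lists, sets and maps can be sketched as a simple assignability check. The `CollectionKindSketch` class and its `cqlCollectionKind` method are hypothetical names used only for illustration and do not reflect how Jacksandra is implemented; the sketch also ignores element types and the "frozen" variants.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical helper for illustration only; not Jacksandra's implementation.
final class CollectionKindSketch {

    private CollectionKindSketch() {
    }

    // Resolves the CQL collection kind for a Java type, or empty if it is not a collection.
    static Optional<String> cqlCollectionKind(Class<?> type) {
        if (Map.class.isAssignableFrom(type)) {
            return Optional.of("MAP");
        }
        // Set must be checked before Collection, since every Set is also a Collection.
        if (Set.class.isAssignableFrom(type)) {
            return Optional.of("SET");
        }
        // Lists and any other Collection fall back to LIST.
        if (Collection.class.isAssignableFrom(type)) {
            return Optional.of("LIST");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(cqlCollectionKind(ArrayList.class).orElse("none"));  // prints LIST
        System.out.println(cqlCollectionKind(HashSet.class).orElse("none"));    // prints SET
        System.out.println(cqlCollectionKind(HashMap.class).orElse("none"));    // prints MAP
        System.out.println(cqlCollectionKind(ArrayDeque.class).orElse("none")); // prints LIST
        System.out.println(cqlCollectionKind(String.class).orElse("none"));     // prints none
    }
}
```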

### Tuples

At the time of writing, there is no support for tuples. See https://github.com/mridang/jacksandra/issues/4

### Partition Keys

Use the `@PartitionKey` annotation to denote the partition keys.

One or more properties of a schema must have a `@PartitionKey` annotation.
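The requirement above can be illustrated with a small reflection-based check. This is only a sketch under stated assumptions: the nested `PartitionKey` annotation is a local stand-in for the Datastax one so that the example compiles on its own, the `PartitionKeySketch` class and `partitionKeyColumns` method are hypothetical, and column names are derived from lower-cased field names instead of `@CqlName`.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not Jacksandra's implementation.
final class PartitionKeySketch {

    // Local stand-in for the Datastax @PartitionKey annotation (assumption).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface PartitionKey {
        int value() default 0;
    }

    // Example bean with a two-part partition key declared out of order.
    static final class Example {
        @PartitionKey(1)
        public byte shardKey;

        @PartitionKey(0)
        public String brand;

        public String skuId;
    }

    private PartitionKeySketch() {
    }

    // Collects the partition key columns ordered by annotation value;
    // fails if the bean declares none.
    static List<String> partitionKeyColumns(Class<?> bean) {
        List<String> columns = Arrays.stream(bean.getDeclaredFields())
                .filter(f -> f.isAnnotationPresent(PartitionKey.class))
                .sorted(Comparator.comparingInt((Field f) -> f.getAnnotation(PartitionKey.class).value()))
                .map(f -> f.getName().toLowerCase(Locale.ROOT))
                .collect(Collectors.toList());
        if (columns.isEmpty()) {
            throw new IllegalStateException(bean.getName() + " has no partition key");
        }
        return columns;
    }

    public static void main(String[] args) {
        System.out.println(partitionKeyColumns(Example.class)); // prints [brand, shardkey]
    }
}
```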

### Clustering Columns

Use the `@ClusteringColumn` annotation or the `@OrderedClusteringColumn` annotation to denote that a column is part of the clustering key. The custom `@OrderedClusteringColumn` annotation has been added because the `com.datastax.oss.driver.api.mapper.annotations.ClusteringColumn` annotation provided by the Datastax libraries does not support specifying the clustering order.

A schema may omit both the `@OrderedClusteringColumn` and the `@ClusteringColumn` annotations entirely, as clustering columns are optional.
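To show how an ordered clustering annotation could translate into a `WITH CLUSTERING ORDER BY` clause, here is a minimal sketch. The nested `OrderedClusteringColumn` annotation is a local stand-in that borrows the `isAscending` and `value` attribute names from the usage example; the `ClusteringOrderSketch` class and `clusteringOrderClause` method are hypothetical, and the column names are taken from lower-cased field names as a simplifying assumption.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Date;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not Jacksandra's implementation.
final class ClusteringOrderSketch {

    // Local stand-in for Jacksandra's @OrderedClusteringColumn annotation (assumption).
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface OrderedClusteringColumn {
        boolean isAscending() default true;

        int value() default 0;
    }

    // Example bean with two clustering columns declared out of order.
    static final class Example {
        public String brand;

        @OrderedClusteringColumn(isAscending = true, value = 1)
        public String skuId;

        @OrderedClusteringColumn(isAscending = false, value = 0)
        public Date createdAt;
    }

    private ClusteringOrderSketch() {
    }

    // Builds the clustering order clause, or an empty string if no field is annotated.
    static String clusteringOrderClause(Class<?> bean) {
        String columns = Arrays.stream(bean.getDeclaredFields())
                .filter(f -> f.isAnnotationPresent(OrderedClusteringColumn.class))
                .sorted(Comparator.comparingInt(
                        (Field f) -> f.getAnnotation(OrderedClusteringColumn.class).value()))
                .map(f -> {
                    OrderedClusteringColumn column = f.getAnnotation(OrderedClusteringColumn.class);
                    String direction = column.isAscending() ? "ASC" : "DESC";
                    return f.getName().toLowerCase(Locale.ROOT) + " " + direction;
                })
                .collect(Collectors.joining(", "));
        return columns.isEmpty() ? "" : "WITH CLUSTERING ORDER BY (" + columns + ")";
    }

    public static void main(String[] args) {
        // prints WITH CLUSTERING ORDER BY (createdat DESC, skuid ASC)
        System.out.println(clusteringOrderClause(Example.class));
    }
}
```

Returning an empty string for beans without annotated fields mirrors the fact that clustering columns are optional.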

### Static Columns

Use the `@StaticColumn` annotation to denote static columns. The custom `@StaticColumn` annotation is provided as there doesn't seem to be a corresponding annotation in the Datastax libraries.

A schema may omit the `@StaticColumn` annotation entirely, as static columns are optional.

### Adding custom types

You can easily add support for custom types.

## Usage (Spark)

```scala
@CqlName("mytable")
class MyBean(@PartitionKey val ssn: Int, val firstName: String, val lastName: String)
  extends Serializable
```

### Writing

In order to persist an RDD into a Cassandra table, you can use the following. The connector must be provided as an implicit variable.

This method does not create the keyspace, the table or any types. The name of the table is read from the `CqlName` annotation on the entity.

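```scala
import com.datastax.spark.connector.cql.CassandraConnector
import com.datastax.spark.connector.plus.toRDDFunctions
import com.datastax.spark.connector.writer.RowWriterFactory
import org.apache.spark.rdd.RDD

// The connector and the row-writer factory are declared as implicit values.
implicit val connector: CassandraConnector = CassandraConnector(sc.getConf)
implicit val rwf: RowWriterFactory[MyBean] = new CassandraJsonRowWriterFactory[MyBean]

// RandomRDD and randomItemsCount are not defined in this snippet.
val inputRDD: RDD[MyBean] = RandomRDD[MyBean](sc).of(randomItemsCount)
inputRDD.saveToCassandra("mykeyspace")
```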

### Reading

In order to read a Cassandra table into an RDD, you can use the following. The connector must be provided as an implicit variable.

This method does not create the keyspace, the table or any types. The name of the table is read from the `CqlName` annotation on the entity.

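```scala
import com.datastax.spark.connector.cql.CassandraConnector
import com.datastax.spark.connector.plus.toSparkContextFunctions
import org.apache.spark.rdd.RDD

// The connector is declared as an implicit value.
implicit val connector: CassandraConnector = CassandraConnector(sc.getConf)
val inputRDD: RDD[MyBean] = sc.cassandraTable[MyBean]("mykeyspace")
```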

## License

Apache-2.0 License

Copyright (c) 2021 Mridang Agarwalla

[see the GitHub docs]: https://docs.github.com/en/packages/guides/configuring-gradle-for-use-with-github-packages#installing-a-package