Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.
Awesome Lists | Featured Topics | Projects
https://github.com/tarantool/tarantool-java

A Java client for Tarantool
https://github.com/tarantool/tarantool-java
Last synced: about 2 months ago
JSON representation
A Java client for Tarantool
Host: GitHub
URL: https://github.com/tarantool/tarantool-java
Owner: tarantool
License: bsd-3-clause
Fork: true (dork/tarantool-java)
Created: 2013-10-30T19:38:35.000Z (about 11 years ago)
Default Branch: master
Last Pushed: 2023-01-12T03:29:40.000Z (almost 2 years ago)
Last Synced: 2024-08-03T22:06:56.223Z (5 months ago)
Language: Java
Homepage: http://tarantool.org
Size: 3.16 MB
Stars: 52
Watchers: 28
Forks: 18
Open Issues: 63
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project

awesome-tarantool - JDBC driver - - offers JDBC interface for a single Tarantool node and supports ancient Tarantool versions (1.7+). (Connectors / Administration)
README

        

   



# Deprecation note

This connector is not in the active development and de facto deprecated.

Consider using the modern alternatives: [cartridge-java][cartridge-java] and

[cartridge-springdata][cartridge-springdata]. It is based on Netty framework

and supports single node/cluster queries.

You may be still interesting in tarantool-java connector if you're using

Tarantool via JDBC.

[cartridge-java]: https://github.com/tarantool/cartridge-java

[cartridge-springdata]: https://github.com/tarantool/cartridge-springdata

# Java connector for Tarantool

[![Maven Central](https://maven-badges.herokuapp.com/maven-central/org.tarantool/connector/badge.svg)](https://maven-badges.herokuapp.com/maven-central/org.tarantool/connector)

[![Build Status](https://travis-ci.org/tarantool/tarantool-java.svg?branch=master)](https://travis-ci.org/tarantool/tarantool-java)

[![Coverage Status](https://coveralls.io/repos/github/tarantool/tarantool-java/badge.svg?branch=master)](https://coveralls.io/github/tarantool/tarantool-java?branch=master)

To get the Java connector for Tarantool 1.6.9, visit

[this GitHub page](https://github.com/tarantool/tarantool-java/tree/connector-1.6.9).

## Table of contents

* [Getting started](#getting-started)

* [Spring NamedParameterJdbcTemplate usage example](#spring-namedparameterjdbctemplate-usage-example)

* [JDBC](#JDBC)

* [Cluster support](#cluster-support)

* [Logging](#logging)

* [Building](#building)

* [Where to get help](#where-to-get-help)

## Getting started

1. Add a dependency to your `pom.xml` file (look for a last version on the

   [releases page](https://github.com/tarantool/tarantool-java/releases)):

```xml

  org.tarantool

  connector

  1.9.4

```

2. Configure `TarantoolClientConfig`:

```java

TarantoolClientConfig config = new TarantoolClientConfig();

config.username = "test";

config.password = "test";

```

3. Create a client:

```java

TarantoolClient client = new TarantoolClientImpl("host:3301", config);

```

using `TarantoolClientImpl(String, TarantoolClientConfig)` is equivalent to:

```java

SocketChannelProvider socketChannelProvider = new SingleSocketChannelProviderImpl("host:3301")

TarantoolClient client = new TarantoolClientImpl(socketChannelProvider, config);

```

You could implement your own `SocketChannelProvider`. It should return 

a connected `SocketChannel`. Feel free to implement `get(int retryNumber, Throwable lastError)`

using your appropriate strategy to obtain the channel. The strategy can take into

account current attempt number (retryNumber) and the last transient error occurred on

the previous attempt.

The `TarantoolClient` will be closed if your implementation of a socket

channel provider raises exceptions. However, throwing a `SocketProviderTransientException`

or returning `null` value are handled by the client as recoverable errors. In these cases,

the client will make next attempt to obtain the socket channel. Otherwise, you will need

a new instance of client to recover. Hence, you should only throw an error different

to `SocketProviderTransientException` in case you have met unrecoverable error.

Below is an example of `SocketChannelProvider` implementation that tries

to connect no more than 3 times, two seconds for each attempt at max.

```java

SocketChannelProvider socketChannelProvider = new SocketChannelProvider() {

    @Override

    public SocketChannel get(int retryNumber, Throwable lastError) {

        if (retryNumber > 3) {

            throw new RuntimeException("Too many attempts");

        }

        SocketChannel channel = null;

        try {

            channel = SocketChannel.open();

            channel.socket().connect(new InetSocketAddress("localhost", 3301), 2000);

            return channel;

        } catch (IOException e) {

            if (channel != null) {

                 try {

                     channel.close();

                 } catch (IOException ignored) { }

            }

            throw new SocketProviderTransientException("Couldn't connect to server", e);

        }

    }

};

```

Same behaviour can be achieved using built-in `SingleSocketChannelProviderImpl`:

```java

TarantoolClientConfig config = new TarantoolClientConfig();

config.connectionTimeout = 2_000; // two seconds timeout per attempt

config.retryCount = 3;            // three attempts at max

SocketChannelProvider socketChannelProvider = new SingleSocketChannelProviderImpl("localhost:3301")

TarantoolClient client = new TarantoolClientImpl(socketChannelProvider, config);

```

`SingleSocketChannelProviderImpl` implements `ConfigurableSocketChannelProvider` that

makes possible for the client to configure a socket provider.

> **Notes:**

> * `TarantoolClient` is thread-safe and asynchronous, so you should use one

>   client inside the whole application.

> * `TarantoolClient` does not support name resolution for fields, indexes,

>   spaces and so on. We highly recommend to use server-side Lua when working

>   with named items. For example, you could create a data access object (DAO)

>   with simple CRUD functions. If, for some reason, you do need client name

>   resolution, you could create a function that returns necessary name-to-ID

>   mappings.

`TarantoolClient` provides four interfaces to execute queries:

* `SyncOps` - returns the operation result

* `AsyncOps` - returns the operation result as a `Future`

* `ComposableAsyncOps` - return the operation result as a `CompletionStage`

* `FireAndForgetOps` - returns the query ID

Feel free to override any method of `TarantoolClientImpl`. For example, to hook

all the results, you could override this:

```java

protected void complete(TarantoolPacket packet, CompletableFuture> future);

```

### Client config options

The client configuration options are represented through the `TarantoolClientConfig` class.

Supported options are follow:

1. `username` is used to authenticate and authorize an user in a Taratool server instance.

   Default value is `null` that means client will attempt to auth as a *guest*.

2. `password` is used to authenticate an user in a Taratool server instance.

   Default value is `null`.

3. `defaultRequestSize` used to be an initial binary buffer size in bytes to send requests.

   Default value is `4096` (4 KB).

4. `predictedFutures` is used to initialize an initial capacity of hash map which stores

   response futures. The client is asynchronous under the hood even though it provides

   a synchronous operations using `java.concurrent.CompletableFuture`.

   Default value is `(1024 * 1024) / 0.75) + 1`.

5. `writerThreadPriority` describes a priority of writer thread.

   Default value is `Thread.NORM_PRIORITY` (5).

6. `readerThreadPriority` describes a priority of reader thread.

   Default value is `Thread.NORM_PRIORITY` (5).

7. `sharedBufferSize` sets a shared buffer size in bytes (place where client collects

   requests when socket is busy on write).

   Default value is `8 * 1024 * 1024` (8 MB).

8. `directWriteFactor` is used as a factor to calculate a threshold whether

   request will be accommodated in the shared buffer. If the request size exceeds

   `directWriteFactor * sharedBufferSize` request is sent directly.

   Default value is `0.5`.

9. `writeTimeoutMillis` sets the max time in ms to perform writing and send the bytes.

    Default value is 60 * 1000 (1 minute).

10. `useNewCall` configures whether client has to use new *CALL* request signature or old

    one used to be active in Tarantool 1.6.

    Default value is `true`.

11. `initTimeoutMillis` sets a max time in ms to establish connection to the server

    Default values is `60 * 1000L` (1 minute).

12. `connectionTimeout` is a hint and can be passed to the socket providers which

    implement `ConfigurableSocketChannelProvider` interface. This hint should be

    interpreter as a connection timeout in ms per attempt where `0` means no limit.

    This options restricts a time budget to perform one connection attempt, while

    `initTimeoutMillis` limits an overall time to obtain a connection.

    Default value is `2 * 1000` (2 seconds).

13. `retryCount` is a hint and can be passed to the socket providers which

    implement `ConfigurableSocketChannelProvider` interface. This hint should be

    interpreter as a maximal number of attempts to connect to Tarantool instance.

    Default value is `3`.  

14. `operationExpiryTimeMillis` is a default request timeout in ms.

    Default value is `1000` (1 second).

## String space/index resolution

Each operation that requires space or index to be executed, can work with

number ID as well as string name of a space or an index.

Assume, we have `my_space` space with space ID `512` and its primary index

`primary` with index ID `0`. Then, for instance, `select` operations can be

performed using their names:

```java

client.syncOps().select(512, 0, Collections.singletonList(1), 0, 1, Iterator.EQ);

// or using more convenient way

client.syncOps().select("my_space", "primary", Collections.singletonList(1), 0, 1, Iterator.EQ);

```

Because _iproto_ has not yet supported string spaces and indexes, a client caches current server

schema in memory. The client relies on protocol SCHEMA_ID and sends each request with respect to

cached schema version. The schema is used primarily to resolve string names of spaces or indexes

against its integer IDs.

### Schema update

1. Just after a (re-)connection to the Tarantool instance.

   The client cannot guarantee that new instance is the same and has same schema,

   thus, the client drops the cached schema and fetches new one.

2. Receiving a schema version error as a response to our request.

   It's possible some request can be rejected by server because of schema

   mismatching between client and server. In this case the schema will be

   reloaded and the refused request will be resent using the updated schema

   version.

3. Sending a DDL request and receiving a new version in a response.

4. Sending a request against a non-existent space/index name.

   The client cannot exactly know whether name was not found because of

   it does not exist or it has not the latest schema version. A ping request

   is sent in the case to check a schema version and then a client will reload

   it if needed. The original request will be retried if a space / an index

   name will be found in a new schema.

### Schema support caveats

1. Each schema reloading requires at least two extra requests to fetch spaces and

   indexes metadata respectively. There is also a ping request followed by reloading

   of the schema to check whether the client has outdated version (see point 4 in

   [Schema update](#schema-update)).

2. In some circumstance, requests can be rejected several times until both client's

   and server's versions matches. It may take significant amount of time or even be

   a cause of request timeout.

3. The client guarantees an order of synchronous requests per thread. Other cases such

   as asynchronous or multi-threaded requests may be out of order before the execution.

## Spring NamedParameterJdbcTemplate usage example

The JDBC driver uses `TarantoolClient` implementation to provide a communication with server.

To configure socket channel provider you should implements SocketChannelProvider and add

`socketChannelProvider=abc.xyz.MySocketChannelProvider` to connect url.

For example:

```

jdbc:tarantool://localhost:3301?user=test&password=test&socketProvider=abc.xyz.MySocketProvider

```

Here is an example how you can use the driver covered by Spring `DriverManagerDataSource`:

```java

NamedParameterJdbcTemplate template = new NamedParameterJdbcTemplate(new DriverManagerDataSource("jdbc:tarantool://localhost:3301?user=test&password=test"));

RowMapper rowMapper = new RowMapper() {

    @Override

    public Object mapRow(ResultSet resultSet, int i) throws SQLException {

        return Arrays.asList(resultSet.getInt(1), resultSet.getString(2));

    }

};

try {

    System.out.println(template.update("drop table hello_world", Collections.emptyMap()));

} catch (Exception ignored) {

}

System.out.println(template.update("create table hello_world(hello int not null PRIMARY KEY, world varchar(255) not null)", Collections.emptyMap()));

Map params = new LinkedHashMap();

params.put("text", "hello world");

params.put("id", 1);

System.out.println(template.update("insert into hello_world(hello, world) values(:id,:text)", params));

System.out.println(template.query("select * from hello_world", rowMapper));

System.out.println(template.query("select * from hello_world where hello=:id", Collections.singletonMap("id", 1), rowMapper));

```

For more implementation details, see [API documentation](http://tarantool.github.io/tarantool-java/apidocs/index.html).

## JDBC

### Batch updates

`Statement` and `PreparedStatement` objects can be used to submit batch

updates.

For instance, using `Statement` object:

```java

Statement statement = connection.createStatement();

statement.addBatch("INSERT INTO student VALUES (30, 'Joe Jones')");

statement.addBatch("INSERT INTO faculty VALUES (2, 'Faculty of Chemistry')");

statement.addBatch("INSERT INTO student_faculty VALUES (30, 2)");

int[] updateCounts = stmt.executeBatch();

```

or using `PreparedStatement`:

```java

PreparedStatement stmt = con.prepareStatement("INSERT INTO student VALUES (?, ?)");

stmt.setInt(1, 30);

stmt.setString(2, "Michael Korj");

stmt.addBatch();

stmt.setInt(1, 40);

stmt.setString(2, "Linda Simpson");

stmt.addBatch();

int[] updateCounts = stmt.executeBatch();

```

The connector uses a pipelining when it performs a batch request. It means

each query is asynchronously sent one-by-one in order they were specified

in the batch.

There are a couple of caveats:

- JDBC spec recommends that *auto-commit* mode should be turned off

to prevent the driver from committing a transaction when a batch request

is called. The connector is not support transactions and *auto-commit* is

always enabled, so each statement from the batch is executed in its own

transaction.

- DDL operations aren't transactional in Tarantool. In this way, a batch

like this can produce an undefined behaviour (i.e. second statement can fail

with an error that `student` table does not exist).

```java

statement.addBatch("CREATE TABLE student (id INT PRIMARY KEY, name VARCHAR(100))");

statement.addBatch("INSERT INTO student VALUES (1, 'Alex Smith')");

```

- If `vinyl` storage engine is used an execution order of batch statements is

not specified. __NOTE:__ This behaviour is incompatible with JDBC spec in the

sentence "Batch commands are executed serially (at least logically) in the

order in which they were added to the batch"

- The driver continues processing the remaining commands in a batch once execution

of a command fails.

## Cluster support

To be more fault-tolerant the connector provides cluster extensions. In

particular `TarantoolClusterClient` and built-in `RoundRobinSocketProviderImpl`

used as a default `SocketProvider` implementation. When currently connected

instance is down then the client will try to reconnect to the first available

instance using strategy defined in a socket provider. You need to supply

a list of nodes which will be used by the cluster client to provide such

ability. Also you can prefer to use a [discovery mechanism](#auto-discovery)

in order to dynamically fetch and apply the node list.

### The RoundRobinSocketProviderImpl class

This cluster-aware provider uses addresses pool to connect to DB server.

The provider picks up next address in order the addresses were passed.

Similar to `SingleSocketChannelProviderImpl` this RR provider also

relies on two options from the config: `TarantoolClientConfig.connectionTimeout`

and `TarantoolClientConfig.retryCount` but in a bit different way.

The latter option says how many times the provider should try to establish a

connection to _one instance_ before failing an attempt. The provider requires

positive retry count to work properly. The socket timeout is used to limit

an interval between connections attempts per instance. In other words, the provider

follows a pattern _connection should succeed after N attempts with M interval between

them at max_. 

### Basic cluster client usage

1. Configure `TarantoolClusterClientConfig`:

```java

TarantoolClusterClientConfig config = new TarantoolClusterClientConfig();

// fill other settings

config.operationExpiryTimeMillis = 2000;

config.executor = Executors.newSingleThreadExecutor();

```

2. Create an instance of `TarantoolClusterClientImpl`. You need to provide

an initial list of nodes:

```java

String[] nodes = new String[] { "myHost1:3301", "myHost2:3302", "myHost3:3301" };

TarantoolClusterClient client = new TarantoolClusterClient(config, nodes);

``` 

3. Work with the client using same API as defined in `TarantoolClient`:

```java

client.syncOps().insert(23, Arrays.asList(1, 1));

```

### Auto-discovery

Auto-discovery feature allows a cluster client to fetch addresses of 

cluster nodes to reflect changes related to the cluster topology. To achieve

this you have to create a Lua function on the server side which returns 

a single array result. Client periodically polls the server to obtain a 

fresh list and apply it if its content changes.

1. On the server side create a function which returns nodes:

```bash

tarantool> function get_cluster_nodes() return { 'host1:3301', 'host2:3302', 'host3:3301' } end

```

You need to pay attention to a function contract we are currently supporting:

* The client never passes any arguments to a discovery function.

* A discovery function _must_ return an array of strings (i.e `return {'host1:3301', 'host2:3301'}`).

* Each string _should_ satisfy the following pattern `host[:port]`

  (or more strictly `/^[^:]+(:\d+)?$/` - a mandatory host containing any string

  and an optional colon followed by digits of the port). Also, the port must be

  in a range between 1 and 65535 if one is presented.

* A discovery function _may_ return multi-results but the client takes

  into account only first of them (i.e. `return {'host:3301'}, discovery_delay`, where 

  the second result is unused). Even more, any extra results __are reserved__ by the client

  in order to extend its contract with a backward compatibility.

* A discovery function _should NOT_ return no results, empty result, wrong type result,

  and Lua errors. The client discards such kinds of results but it does not affect the discovery

  process for next scheduled tasks. 

2. On the client side configure discovery settings in `TarantoolClusterClientConfig`:

```java

TarantoolClusterClientConfig config = new TarantoolClusterClientConfig();

// fill other settings

config.clusterDiscoveryEntryFunction = "get_cluster_nodes"; // discovery function used to fetch nodes 

config.clusterDiscoveryDelayMillis = 60_000;                // how often client polls the discovery server  

```

3. Create a client using the config made above.

```java

TarantoolClusterClient client = new TarantoolClusterClient(config);

client.syncOps().insert(45, Arrays.asList(1, 1));

```

### Auto-discovery caveats

* You need to set _not empty_ value to `clusterDiscoveryEntryFunction` to enable auto-discovery feature.

* There are only two cases when a discovery task runs: just after initialization of the cluster

  client and a periodical scheduler timeout defined in `TarantoolClusterClientConfig.clusterDiscoveryDelayMillis`. 

* A discovery task always uses an active client connection to get the nodes list.

  It's in your responsibility to provide a function availability as well as a consistent

  nodes list on all instances you initially set or obtain from the task.

* Every address which is unmatched with `host[:port]` pattern will be filtered out from

  the target addresses list.

* If some error occurs while a discovery task is running then this task

  will be aborted without any after-effects for next task executions. These cases, for instance, are 

  a wrong function result (see discovery function contract) or a broken connection. 

  There is an exception if the client is closed then discovery process will stop permanently.

* It's possible to obtain a list which does NOT contain the node we are currently

  connected to. It leads the client to try to reconnect to another node from the 

  new list. It may take some time to graceful disconnect from the current node.

  The client does its best to catch the moment when there are no pending responses

  and perform a reconnection.  

### Cluster client config options

In addition to the options for [the standard client](#client-config-options), cluster

config provides some extra options:

1. `executor` defines an executor that will be used as a thread of execution to retry writes.

   Default values is `null` which means the cluster client will use *a single thread executor*.

2. `clusterDiscoveryEntryFunction` is a name of the stored function to be used to fetch list of

   instances.

   Default value is `null` (not set).

3. `clusterDiscoveryDelayMillis` describes how often in ms to poll the server for a new list of

   cluster nodes.

   Default value is `60 * 1000` (1 minute).

## Logging

The connector uses its own logging facade to abstract from any logging libraries

which can be used inside the apps where the connector attached. At the moment,

the facade supports JUL as a default logging system, SLF4J facade, and Logback

directly via SLF4J interface.

### Logging integration

The logging facade offers several ways in integrate its internal logging with foreign one in order:

* Using system property `org.tarantool.logging.provider`. Supported values are *jdk* and *slf4j*

  for the java util logging and SLF4J/Logback respectively. For instance, use 

  `java -Dorg.tarantool.logging.provider=slf4j <...>`.

* Using Java SPI mechanism. Implement your own provider org.tarantool.logging.LoggerProvider

  To register your provider save `META-INF.services/org.tarantool.logging.LoggerProvider` file

  with a single line text contains a fully-qualified class name of the provider implementation.

```bash

cat META-INF/services/org.tarantool.logging.LoggerProvider

org.mydomain.MySimpleLoggerProvider

```

* CLASSPATH exploring. Now, the connector will use SLF4J if Logback is also in use. 

* If nothing is successful JUL will be used by default.

### Supported loggers

| Logger name                                    | Level | Description                                       |

| ---------------------------------------------- | ----- | ------------------------------------------------- |

| o.t.c.TarantoolClusterStoredFunctionDiscoverer | WARN  | prints out invalid discovery addresses            |

| o.t.TarantoolClusterClient                     | TRACE | prints out request retries after transient errors |

| o.t.TarantoolClientImpl                        | WARN  | prints out reconnect issues                       |

## Building

To run unit tests use:

 

```bash

./mvnw clean test

``` 

To run integration tests use:

```bash

./mvnw clean verify

```

## Where to get help

Got problems or questions? Post them on

[Stack Overflow](http://stackoverflow.com/questions/ask/advice) with the

`tarantool` and `java` tags, or use these tags to search the existing knowledge

base for possible answers and solutions.