{"id":13697512,"url":"https://github.com/rayokota/kareldb","last_synced_at":"2025-04-08T11:14:41.709Z","repository":{"id":38050586,"uuid":"206686299","full_name":"rayokota/kareldb","owner":"rayokota","description":"A Relational Database Backed by Apache Kafka","archived":false,"fork":false,"pushed_at":"2025-03-17T21:41:16.000Z","size":851,"stargazers_count":389,"open_issues_count":10,"forks_count":27,"subscribers_count":12,"default_branch":"master","last_synced_at":"2025-04-01T10:09:38.819Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rayokota.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-09-06T01:17:24.000Z","updated_at":"2025-02-23T06:22:05.000Z","dependencies_parsed_at":"2024-01-09T08:31:06.701Z","dependency_job_id":"280a4245-bda7-42f6-9361-0f13ffc1f129","html_url":"https://github.com/rayokota/kareldb","commit_stats":{"total_commits":391,"total_committers":4,"mean_commits":97.75,"dds":0.5319693094629157,"last_synced_commit":"b113cbef30e816b0d60e5f62e642293cdd47faa6"},"previous_names":[],"tags_count":20,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rayokota%2Fkareldb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rayokota%2Fkareldb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rayokota%2Fkareldb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rayokota%2Fkareldb/ma
nifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rayokota","download_url":"https://codeload.github.com/rayokota/kareldb/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247829512,"owners_count":21002997,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T18:00:59.572Z","updated_at":"2025-04-08T11:14:41.686Z","avatar_url":"https://github.com/rayokota.png","language":"Java","funding_links":[],"categories":["Libraries","NewSQL Databases","`NewSQL Databases`","数据库","\u003ca name=\"Java\"\u003e\u003c/a\u003eJava","Java"],"sub_categories":["Kafka"],"readme":"# KarelDB - A Relational Database Backed by Apache Kafka\n\n[![Build Status][github-actions-shield]][github-actions-link]\n[![Maven][maven-shield]][maven-link]\n[![Javadoc][javadoc-shield]][javadoc-link]\n\n[github-actions-shield]: https://github.com/rayokota/kareldb/workflows/build/badge.svg?branch=master\n[github-actions-link]: https://github.com/rayokota/kareldb/actions\n[maven-shield]: https://img.shields.io/maven-central/v/io.kareldb/kareldb-core.svg\n[maven-link]: https://search.maven.org/#search%7Cga%7C1%7Cio.kareldb\n[javadoc-shield]: https://javadoc.io/badge/io.kareldb/kareldb-core.svg?color=blue\n[javadoc-link]: https://javadoc.io/doc/io.kareldb/kareldb-core\n\nKarelDB is a fully-functional relational database backed by Apache Kafka.\n\n## Maven\n\nReleases of KarelDB are deployed to Maven Central.\n\n```xml\n\u003cdependency\u003e\n    \u003cgroupId\u003eio.kareldb\u003c/groupId\u003e\n    
\u003cartifactId\u003ekareldb-core\u003c/artifactId\u003e\n    \u003cversion\u003e1.0.0\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n## Server Mode\n\nTo run KarelDB, download a [release](https://github.com/rayokota/kareldb/releases), unpack it, and then modify `config/kareldb.properties` to point to an existing Kafka broker.  Then run the following:\n\n```bash\n$ bin/kareldb-start config/kareldb.properties\n```\n\nAt a separate terminal, enter the following command to start up `sqlline`, a command-line utility for accessing JDBC databases.\n\n```\n$ bin/sqlline\nsqlline version 1.9.0\n\nsqlline\u003e !connect jdbc:avatica:remote:url=http://localhost:8765 admin admin\n\nsqlline\u003e create table books (id int, name varchar, author varchar);\nNo rows affected (0.114 seconds)\n\nsqlline\u003e insert into books values (1, 'The Trial', 'Franz Kafka');\n1 row affected (0.576 seconds)\n\nsqlline\u003e select * from books;\n+----+-----------+-------------+\n| ID |   NAME    |   AUTHOR    |\n+----+-----------+-------------+\n| 1  | The Trial | Franz Kafka |\n+----+-----------+-------------+\n1 row selected (0.133 seconds)\n```\n\nTo access a KarelDB server from a remote application, use an Avatica JDBC client.  A list of Avatica JDBC clients can be found [here](https://calcite.apache.org/avatica/docs/).\n\nIf multiple KarelDB servers are configured with the same cluster group ID (see [Configuration](#configuration)), then they will form a cluster and one of them will be elected as leader, while the others will become followers (replicas).  If a follower receives a request, it will be forwarded to the leader.  If the leader fails, one of the followers will be elected as the new leader.\n\n## Embedded Mode\n\nKarelDB can also be used in embedded mode.  
Here is an example:\n\n```java\nProperties properties = new Properties();\nproperties.put(\"schemaFactory\", \"io.kareldb.schema.SchemaFactory\");\nproperties.put(\"parserFactory\", \"org.apache.calcite.sql.parser.parserextension.ExtensionSqlParserImpl#FACTORY\");\nproperties.put(\"schema.kind\", \"io.kareldb.kafka.KafkaSchema\");\nproperties.put(\"schema.kafkacache.bootstrap.servers\", bootstrapServers);\nproperties.put(\"schema.kafkacache.data.dir\", \"/tmp\");\n\ntry (Connection conn = DriverManager.getConnection(\"jdbc:kareldb:\", properties);\n     Statement s = conn.createStatement()) {\n        s.execute(\"create table books (id int, name varchar, author varchar)\");\n        s.executeUpdate(\"insert into books values(1, 'The Trial', 'Franz Kafka')\");\n        ResultSet rs = s.executeQuery(\"select * from books\");\n        ...\n}\n```\n\n## ANSI SQL Support\n\nKarelDB supports ANSI SQL, using [Calcite](https://calcite.apache.org/docs/reference.html).  \n\nWhen creating a table, the primary key constraint should be specified after the columns, like so:\n\n```\nCREATE TABLE customers \n    (id int, name varchar, constraint pk primary key (id));\n```\n\nIf no primary key constraint is specified, the first column in the table will be designated as the primary key.\n\nKarelDB extends Calcite's SQL grammar by adding support for ALTER TABLE commands.\n\n```\nalterTableStatement:\n    ALTER TABLE tableName columnAction [ , columnAction ]*\n    \ncolumnAction:\n    ( ADD tableElement ) | ( DROP columnName )\n```\n\nKarelDB supports the following SQL types:\n\n- boolean\n- integer\n- bigint\n- real\n- double\n- varbinary\n- varchar\n- decimal\n- date\n- time\n- timestamp\n\n## Basic Configuration\n\nKarelDB has a number of configuration properties that can be specified.  
When using KarelDB as an embedded database, these properties should be prefixed with `schema.` before passing them to the JDBC driver.\n\n- `listeners` - List of listener URLs that include the scheme, host, and port.  Defaults to `http://0.0.0.0:8765`.  \n- `cluster.group.id` - The group ID to be used for leader election.  Defaults to `kareldb`.\n- `leader.eligibility` - Whether this node can participate in leader election.  Defaults to true.\n- `kafkacache.backing.cache` - The backing cache for KCache, one of `memory` (default), `bdbje`, `lmdb`, `mapdb`, or `rocksdb`.\n- `kafkacache.data.dir` - The root directory for backing cache storage.  Defaults to `/tmp`.\n- `kafkacache.bootstrap.servers` - A list of host and port pairs to use for establishing the initial connection to Kafka.\n- `kafkacache.group.id` - The group ID to use for the internal consumers, which needs to be unique for each node.  Defaults to `kareldb-1`.\n- `kafkacache.topic.replication.factor` - The replication factor for the internal topics created by KarelDB.  Defaults to 3.\n- `kafkacache.init.timeout.ms` - The timeout for initialization of the Kafka cache, including creation of internal topics.  Defaults to 300 seconds.\n- `kafkacache.timeout.ms` - The timeout for an operation on the Kafka cache.  
Defaults to 60 seconds.\n\n## Security\n\n### HTTPS\n\nTo use HTTPS, first configure the `listeners` with an `https` prefix, then specify the following properties with the appropriate values.\n\n```\nssl.keystore.location=/var/private/ssl/custom.keystore\nssl.keystore.password=changeme\nssl.key.password=changeme\n```\n\nWhen using the Avatica JDBC client, the `truststore` and `truststore_password` can be passed in the JDBC URL as specified [here](https://calcite.apache.org/avatica/docs/client_reference.html#truststore).\n\n### HTTP Authentication\n\nKarelDB supports both HTTP Basic Authentication and HTTP Digest Authentication, as shown below:\n\n```\nauthentication.method=BASIC  # or DIGEST\nauthentication.roles=admin,developer,user\nauthentication.realm=KarelDb-Props  # as specified in JAAS file\n```\n\nIn the above example, the JAAS file might look like:\n\n```\nKarelDb-Props {\n  org.eclipse.jetty.jaas.spi.PropertyFileLoginModule required\n  file=\"/path/to/password-file\"\n  debug=\"false\";\n};\n```\n\nThe `PropertyFileLoginModule` can be replaced with other implementations, such as `LdapLoginModule` or `JDBCLoginModule`.\n\nWhen starting KarelDB, the path to the JAAS file must be set as a system property.\n\n```bash\n$ export KARELDB_OPTS=-Djava.security.auth.login.config=/path/to/the/jaas_config.file\n$ bin/kareldb-start config/kareldb-secure.properties\n```\n\nWhen using the Avatica JDBC client, the `avatica_user` and `avatica_password` can be passed in the JDBC URL as specified [here](https://calcite.apache.org/avatica/docs/client_reference.html#avatica-user).\n\n### Kafka Authentication\n\nAuthentication to a secure Kafka cluster is described [here](https://github.com/rayokota/kcache#security).\n \n## Implementation Notes\n\nKarelDB stores table data in topics of the form `{tableName}_{generation}`.  
A different generation ID is used whenever a table is dropped and re-created.\n\nKarelDB uses three topics to hold metadata:\n\n- `_tables` - A topic that holds the schemas for tables.\n- `_commits` - A topic that holds the list of committed transactions.\n- `_timestamps` - A topic that stores the maximum timestamp that the transaction manager is allowed to return to clients.\n\n## Database by Components\n\nKarelDB is an example of a database built mostly by assembling pre-existing components.  In particular, KarelDB uses the following:\n\n- [Apache Kafka](https://kafka.apache.org) - for persistence, using [KCache](https://github.com/rayokota/kcache) as an embedded key-value store\n- [Apache Avro](https://avro.apache.org) - for serialization and schema evolution\n- [Apache Calcite](https://calcite.apache.org) - for SQL parsing, optimization, and execution\n- [Apache Omid](https://omid.incubator.apache.org) - for transaction management and MVCC support\n- [Apache Avatica](https://calcite.apache.org/avatica/) - for JDBC functionality\n\nSee this [blog](https://yokota.blog/2019/09/23/building-a-relational-database-using-kafka) for more on the design of KarelDB.\n\n## Future Enhancements \n\nPossible future enhancements include support for secondary indices.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frayokota%2Fkareldb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frayokota%2Fkareldb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frayokota%2Fkareldb/lists"}