{"id":13514442,"url":"https://github.com/Aiven-Open/karapace","last_synced_at":"2025-03-31T03:30:54.039Z","repository":{"id":37390820,"uuid":"164508278","full_name":"Aiven-Open/karapace","owner":"Aiven-Open","description":"Karapace - Your Apache Kafka® essentials in one tool","archived":false,"fork":false,"pushed_at":"2024-04-12T11:19:48.000Z","size":16516,"stargazers_count":394,"open_issues_count":76,"forks_count":62,"subscribers_count":82,"default_branch":"main","last_synced_at":"2024-04-13T20:50:30.938Z","etag":null,"topics":["kafka","rest-proxy","schema-registry"],"latest_commit_sha":null,"homepage":"https://karapace.io","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Aiven-Open.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2019-01-07T22:39:05.000Z","updated_at":"2024-04-15T08:46:36.613Z","dependencies_parsed_at":"2023-12-20T13:16:19.400Z","dependency_job_id":"447faf7d-7bb8-46b8-a1fd-208df1ee7fa1","html_url":"https://github.com/Aiven-Open/karapace","commit_stats":null,"previous_names":["aiven-open/karapace","aiven/karapace"],"tags_count":55,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Aiven-Open%2Fkarapace","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Aiven-Open%2Fkarapace/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Aiven-Open%2Fkarapace/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Aiven-Open%2Fkarapace/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Aiven-Open","download_url":"https://codeload.github.com/Aiven-Open/karapace/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246413377,"owners_count":20773053,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["kafka","rest-proxy","schema-registry"],"created_at":"2024-08-01T05:00:56.383Z","updated_at":"2025-03-31T03:30:49.028Z","avatar_url":"https://github.com/Aiven-Open.png","language":"HTML","readme":"Karapace\n========\n\n``karapace``. Your Apache Kafka® essentials in one tool.\n\nAn `open-source \u003chttps://github.com/Aiven-Open/karapace/blob/master/LICENSE\u003e`_ implementation\nof `Kafka REST \u003chttps://docs.confluent.io/platform/current/kafka-rest/index.html#features\u003e`_ and\n`Schema Registry \u003chttps://docs.confluent.io/platform/current/schema-registry/index.html\u003e`_.\n\n|Tests| |Contributor Covenant|\n\n.. 
.. |Tests| image:: https://github.com/Aiven-Open/karapace/actions/workflows/tests.yml/badge.svg?branch=main
    :target: https://github.com/Aiven-Open/karapace/actions/workflows/tests.yml?query=branch%3Amain

.. |Contributor Covenant| image:: https://img.shields.io/badge/Contributor%20Covenant-2.1-4baaaa.svg
    :target: CODE_OF_CONDUCT.md

Overview
========

Karapace supports storing schemas in a central repository, which clients can access to
serialize and deserialize messages. The schemas also maintain their own version histories and can be
checked for compatibility between their different respective versions.

Karapace REST provides a RESTful interface to your Apache Kafka cluster, allowing you to perform tasks such
as producing and consuming messages and performing administrative cluster work, all the while using the
language of the web.

Features
========

* Drop-in replacement, on both the client and server side, for pre-existing Schema Registry /
  Kafka REST Proxy setups
* Moderate memory consumption
* Asynchronous architecture based on aiohttp
* Supports Avro, JSON Schema, and Protobuf
* Leader/replica architecture for high availability and load balancing

Compatibility details
---------------------

Karapace is compatible with Schema Registry 6.1.1 on the API level, and supports all operations in the API.
When a new version of Schema Registry is released, the goal is to support it within a reasonable time.

There are some caveats regarding schema normalization and regarding error messages matching those of
Schema Registry: full parity cannot always be guaranteed.

Setup
=====

Using Docker
------------

To get up and running with the latest build of Karapace, a Docker image is available::

  # Fetch the latest build from the main branch
  docker pull ghcr.io/aiven-open/karapace:develop

  # Fetch the latest release
  docker pull ghcr.io/aiven-open/karapace:latest

Versions ``3.7.1`` and earlier are available from the ``ghcr.io/aiven`` registry::

  docker pull ghcr.io/aiven/karapace:3.7.1

An example setup including configuration and a Kafka connection is available as a compose example::

    docker compose -f ./container/compose.yml up -d

Then you should be able to reach two sets of endpoints:

* Karapace schema registry on http://localhost:8081
* Karapace REST on http://localhost:8082

Configuration
^^^^^^^^^^^^^

Each configuration key can be overridden with an environment variable prefixed with ``KARAPACE_``,
the exception being configuration keys that actually start with the ``karapace`` string. For example, to
override the ``bootstrap_uri`` config value, one would use the environment variable
``KARAPACE_BOOTSTRAP_URI``. Here_ you can find an example configuration file to give you an idea
of what you need to change.

.. _`Here`: https://github.com/Aiven-Open/karapace/blob/master/karapace.config.json
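As a minimal sketch, assuming the ``karapace`` daemon is started with the configuration file path
as its argument, overriding keys from the environment could look like this::

  # Environment variables take precedence over the corresponding
  # keys in karapace.config.json.
  export KARAPACE_BOOTSTRAP_URI=kafka-1:9092,kafka-2:9092
  export KARAPACE_PORT=8081
  karapace karapace.config.json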
Source install
--------------

Alternatively you can do a source install using::

  pip install .

Quickstart
==========

To register the first version of a schema under the subject "test-key" using an Avro schema::

  $ curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
    --data '{"schema": "{\"type\": \"record\", \"name\": \"Obj\", \"fields\":[{\"name\": \"age\", \"type\": \"int\"}]}"}' \
    http://localhost:8081/subjects/test-key/versions
  {"id":1}

To register a version of a schema using JSON Schema, one needs to use the ``schemaType`` property::

  $ curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
    --data '{"schemaType": "JSON", "schema": "{\"type\": \"object\",\"properties\":{\"age\":{\"type\": \"number\"}},\"additionalProperties\":true}"}' \
    http://localhost:8081/subjects/test-key-json-schema/versions
  {"id":2}

To list all subjects (including the one created just above)::

  $ curl -X GET http://localhost:8081/subjects
  ["test-key"]

To list all the versions of a given subject (including the one just created above)::

  $ curl -X GET http://localhost:8081/subjects/test-key/versions
  [1]

To fetch back the schema whose global id is 1 (i.e. the one registered above)::

  $ curl -X GET http://localhost:8081/schemas/ids/1
  {"schema":"{\"fields\":[{\"name\":\"age\",\"type\":\"int\"}],\"name\":\"Obj\",\"type\":\"record\"}"}

To get the specific version 1 of the schema just registered, run::

  $ curl -X GET http://localhost:8081/subjects/test-key/versions/1
  {"subject":"test-key","version":1,"id":1,"schema":"{\"fields\":[{\"name\":\"age\",\"type\":\"int\"}],\"name\":\"Obj\",\"type\":\"record\"}"}

To get the latest version of the schema under subject test-key, run::

  $ curl -X GET http://localhost:8081/subjects/test-key/versions/latest
  {"subject":"test-key","version":1,"id":1,"schema":"{\"fields\":[{\"name\":\"age\",\"type\":\"int\"}],\"name\":\"Obj\",\"type\":\"record\"}"}

To delete version 10 of the schema registered under subject "test-key" (if it exists)::

  $ curl -X DELETE http://localhost:8081/subjects/test-key/versions/10
   10

To delete all versions of the schema registered under subject "test-key"::

  $ curl -X DELETE http://localhost:8081/subjects/test-key
  [1]

Test the compatibility of a schema with the latest schema under subject "test-key"::

  $ curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
    --data '{"schema": "{\"type\": \"int\"}"}' \
    http://localhost:8081/compatibility/subjects/test-key/versions/latest
  {"is_compatible":true}

NOTE: if the subject's compatibility mode is transitive (BACKWARD_TRANSITIVE, FORWARD_TRANSITIVE or FULL_TRANSITIVE) then
compatibility is checked not only against the latest schema, but also against all previous schemas, as would be done
when trying to register the new schema through the ``subjects/<subject-key>/versions`` endpoint.

Get the current global backwards compatibility setting value::

  $ curl -X GET http://localhost:8081/config
  {"compatibilityLevel":"BACKWARD"}
Change compatibility requirements for all subjects where it's not
specifically defined otherwise::

  $ curl -X PUT -H "Content-Type: application/vnd.schemaregistry.v1+json" \
    --data '{"compatibility": "NONE"}' http://localhost:8081/config
  {"compatibility":"NONE"}

Change the compatibility requirement to FULL for the test-key subject::

  $ curl -X PUT -H "Content-Type: application/vnd.schemaregistry.v1+json" \
    --data '{"compatibility": "FULL"}' http://localhost:8081/config/test-key
  {"compatibility":"FULL"}

List topics::

  $ curl "http://localhost:8082/topics"

Get info for one particular topic::

  $ curl "http://localhost:8082/topics/my_topic"

Produce a message backed by the schema registry::

  $ curl -H "Content-Type: application/vnd.kafka.avro.v2+json" -X POST -d \
    '{"value_schema": "{\"namespace\": \"example.avro\", \"type\": \"record\", \"name\": \"simple\", \"fields\": \
    [{\"name\": \"name\", \"type\": \"string\"}]}", "records": [{"value": {"name": "name0"}}]}' http://localhost:8082/topics/my_topic

Create a consumer::

  $ curl -X POST -H "Content-Type: application/vnd.kafka.v2+json" -H "Accept: application/vnd.kafka.v2+json" \
    --data '{"name": "my_consumer", "format": "avro", "auto.offset.reset": "earliest"}' \
    http://localhost:8082/consumers/avro_consumers

Subscribe to the topic we previously published to::

  $ curl -X POST -H "Content-Type: application/vnd.kafka.v2+json" --data '{"topics":["my_topic"]}' \
    http://localhost:8082/consumers/avro_consumers/instances/my_consumer/subscription

Consume the previously published message::

  $ curl -X GET -H "Accept: application/vnd.kafka.avro.v2+json" \
    http://localhost:8082/consumers/avro_consumers/instances/my_consumer/records?timeout=1000

Commit offsets for a particular topic partition::

  $ curl -X POST -H "Content-Type: application/vnd.kafka.v2+json" --data '{}' \
    http://localhost:8082/consumers/avro_consumers/instances/my_consumer/offsets

Delete the consumer::

  $ curl -X DELETE -H "Accept: application/vnd.kafka.v2+json" \
    http://localhost:8082/consumers/avro_consumers/instances/my_consumer
Backing up your Karapace
========================

Karapace natively stores its data in a Kafka topic, the name of which you can
configure freely, but which by default is called ``_schemas``.

Karapace includes a tool for backing up and restoring data. To back up, run::

  karapace_schema_backup get --config karapace.config.json --location schemas.log

You can also back up the data by using Kafka's Java console consumer::

  ./kafka-console-consumer.sh --bootstrap-server brokerhostname:9092 --topic _schemas --from-beginning --property print.key=true --timeout-ms 1000 1> schemas.log

Restoring Karapace from backup
==============================

Your backup can be restored with Karapace by running::

  karapace_schema_backup restore --config karapace.config.json --location schemas.log

Alternatively, Kafka's Java console producer can be used to restore the data
to a new Kafka cluster. You can restore the data from the previous step by running::

  ./kafka-console-producer.sh --broker-list brokerhostname:9092 --topic _schemas --property parse.key=true < schemas.log

Performance comparison to Confluent stack
==========================================

Latency
-------

* 50 concurrent connections, 50,000 requests

====== ========== ===========
Format  Karapace   Confluent
====== ========== ===========
Avro    80.95      7.22
Binary  66.32      46.99
JSON    60.36      53.7
====== ========== ===========

* 15 concurrent connections, 50,000 requests

====== =========== ===========
Format   Karapace   Confluent
====== =========== ===========
Avro     25.05      18.14
Binary   21.35      15.85
JSON     21.38      14.83
====== =========== ===========

* 4 concurrent connections, 50,000 requests

====== =========== ===========
Format  Karapace   Confluent
====== =========== ===========
Avro     6.54        5.67
Binary   6.51        4.56
JSON     6.86        5.32
====== =========== ===========

There is quite a bit of variation between subsequent runs, especially for the lower numbers, so once
more exact measurements are required, it is advisable to increase the total request count to something like 500,000.

After this round we will focus on Avro serialization only, as it is the more expensive format and it exercises
the entire stack.

RAM consumption
---------------

A basic push-pull test, with 12 connections on the publisher process and 3 connections on the subscriber process, over a
10 minute duration. The publisher has the 100 ms timeout and 100 ``max_bytes`` parameters set on each request, so that both processes have work to do.
The heap size limit is set to 256 MB on the REST proxy.

RAM consumption (MB), for different consumer counts, over 300 s:

=========== =================== ================
 Consumers   Karapace combined   Confluent rest
=========== =================== ================
    1            47                  200
    10           55                  400
    20           83                  530
=========== =================== ================

Commands
========

Once installed, the ``karapace`` program should be in your path. It is the
main daemon process that should be run under a service manager such as
``systemd`` to serve clients.
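As an illustration only, a minimal ``systemd`` unit for the daemon might look as follows; the
installation path, configuration location, service user, and the assumption that ``karapace``
takes the configuration file as its argument are all hypothetical, not project defaults::

  # /etc/systemd/system/karapace.service (illustrative sketch)
  [Unit]
  Description=Karapace schema registry and REST proxy
  After=network.target

  [Service]
  # Assumes Karapace is installed in this virtualenv and reads the
  # given JSON configuration file.
  ExecStart=/opt/karapace/venv/bin/karapace /etc/karapace/karapace.config.json
  User=karapace
  Restart=on-failure

  [Install]
  WantedBy=multi-user.target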
Configuration keys
==================

Keys to take special care of are the ones needed to configure Kafka and ``advertised_hostname``.

.. list-table::
   :header-rows: 1

   * - Parameter
     - Default Value
     - Description
   * - ``http_request_max_size``
     - ``1048576``
     - The maximum client HTTP request size. This value controls how large (POST) payloads are allowed.
       When ``karapace_rest`` is set to ``true`` and ``http_request_max_size`` is not set, Karapace adapts
       the allowed client maximum size from ``producer_max_request_size``. In cases where the automatically
       selected size is not enough, it can be overridden by setting a value in the configuration. For schema
       registry operation, set the client maximum size according to the expected size of schema payloads if
       the default size is not enough.
   * - ``advertised_protocol``
     - ``http``
     - The protocol being advertised to other instances of Karapace that are attached to the same Kafka group.
   * - ``advertised_hostname``
     - ``socket.gethostname()``
     - The hostname being advertised to other instances of Karapace that are attached to the same Kafka group.
       All nodes within the cluster need to have their ``advertised_hostname`` set so that they can all reach each other.
   * - ``advertised_port``
     - ``None``
     - The port being advertised to other instances of Karapace that are attached to the same Kafka group.
       Falls back to ``port`` if not set.
   * - ``bootstrap_uri``
     - ``localhost:9092``
     - The URI of the Kafka service where the schemas are stored and where
       coordination among the Karapace instances runs.
   * - ``sasl_bootstrap_uri``
     - ``None``
     - The URI of the Kafka service to use with the Kafka REST API when SASL authorization with REST is used.
   * - ``client_id``
     - ``sr-1``
     - The ``client_id`` Karapace will use when coordinating with
       other Karapace instances. The instance with the ID that sorts
       first alphabetically is chosen as master from the services with
       ``master_eligibility`` set to true.
   * - ``consumer_enable_autocommit``
     - ``True``
     - Enable auto commit on REST proxy consumers.
   * - ``consumer_request_timeout_ms``
     - ``11000``
     - REST proxy consumers' timeout for reads that do not limit the max bytes or provide their own timeout.
   * - ``consumer_request_max_bytes``
     - ``67108864``
     - REST proxy consumers' maximum bytes to be fetched per request.
   * - ``consumer_idle_disconnect_timeout``
     - ``0``
     - Disconnect idle consumers after this many seconds of inactivity. Inactivity leads to the consumer
       leaving its consumer group and losing consumer state. 0 (the default) means no auto-disconnect.
   * - ``fetch_min_bytes``
     - ``1``
     - REST proxy consumers' minimum bytes to be fetched per request.
   * - ``group_id``
     - ``schema-registry``
     - The Kafka group name used for selecting a master service to coordinate the storing of schemas.
   * - ``master_eligibility``
     - ``true``
     - Whether the service instance should be considered for promotion to the master service.
       One reason to turn this off would be to have an instance of Karapace
       running somewhere else for HA purposes, which you wouldn't want to be
       automatically promoted to master if the primary instances become
       unavailable.
   * - ``producer_compression_type``
     - ``None``
     - Type of compression to be used by REST proxy producers.
   * - ``producer_acks``
     - ``1``
     - Level of consistency desired by each producer message sent on the REST proxy.
       More on `Kafka Producer <https://kafka.apache.org/10/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html>`_
   * - ``producer_linger_ms``
     - ``0``
     - Time to wait for grouping together requests.
       More on `Kafka Producer <https://kafka.apache.org/10/javadoc/org/apache/kafka/clients/producer/KafkaProducer.html>`_
   * - ``producer_max_request_size``
     - ``1048576``
     - The maximum size of a request in bytes.
       More on `Kafka Producer configs <https://kafka.apache.org/documentation/#producerconfigs_max.request.size>`_
   * - ``security_protocol``
     - ``PLAINTEXT``
     - Default Kafka security protocol needed to communicate with the Kafka
       cluster. Another option is to use SSL for SSL client certificate
       authentication.
   * - ``sentry``
     - ``None``
     - Used to configure parameters for the Sentry integration (dsn, tags, ...). Setting the
       environment variable ``SENTRY_DSN`` will also enable the Sentry integration.
   * - ``ssl_cafile``
     - ``/path/to/cafile``
     - Used when ``security_protocol`` is set to SSL; the path to the SSL CA certificate.
   * - ``ssl_certfile``
     - ``/path/to/certfile``
     - Used when ``security_protocol`` is set to SSL; the path to the SSL certfile.
   * - ``ssl_keyfile``
     - ``/path/to/keyfile``
     - Used when ``security_protocol`` is set to SSL; the path to the SSL keyfile.
   * - ``topic_name``
     - ``_schemas``
     - The name of the Kafka topic where the schemas are stored.
   * - ``replication_factor``
     - ``1``
     - The replication factor to be used with the schema topic.
   * - ``host``
     - ``127.0.0.1``
     - Listening host for the Karapace server.
       Use an empty string to listen on all available networks.
   * - ``port``
     - ``8081``
     - Listening port for the Karapace server.
   * - ``server_tls_certfile``
     - ``/path/to/certfile``
     - Filename of a certificate chain for the Karapace server in HTTPS mode.
   * - ``server_tls_keyfile``
     - ``/path/to/keyfile``
     - Filename of a private key for the Karapace server in HTTPS mode.
   * - ``registry_host``
     - ``127.0.0.1``
     - Schema Registry host, used by Kafka REST for schema-related requests.
       If running both in the same process, it should be left at its default value.
   * - ``registry_port``
     - ``8081``
     - Schema Registry port, used by Kafka REST for schema-related requests.
       If running both in the same process, it should be left at its default value.
   * - ``registry_user``
     - ``None``
     - Schema Registry user for authentication, used by Kafka REST for schema-related requests.
   * - ``registry_password``
     - ``None``
     - Schema Registry password for authentication, used by Kafka REST for schema-related requests.
   * - ``registry_ca``
     - ``/path/to/cafile``
     - Schema Registry CA certificate, used by Kafka REST for Avro-related requests.
       If this is set, Kafka REST will use HTTPS to connect to the registry.
       If running both in the same process, it should be left at its default value.
   * - ``registry_authfile``
     - ``/path/to/authfile.json``
     - Filename specifying users and access control rules for the Karapace Schema Registry.
       If this is set, the Schema Registry requires authentication for most of the endpoints and applies
       per-endpoint authorization rules.
   * - ``rest_authorization``
     - ``false``
     - Use the REST API caller's authorization credentials to invoke Kafka operations over the SASL
       authentication of ``sasl_bootstrap_uri``, delegating REST proxy authorization to Kafka. If false,
       the configured common credentials are used for all Kafka connections of REST proxy operations.
   * - ``rest_base_uri``
     - ``None``
     - Publicly available URI of this instance, advertised to clients using stateful operations such as
       creating consumers. If not set, the URI is constructed from ``advertised_protocol``,
       ``advertised_hostname``, and ``advertised_port``.
   * - ``metadata_max_age_ms``
     - ``60000``
     - Period of time in milliseconds after which Kafka metadata is force-refreshed.
   * - ``karapace_rest``
     - ``true``
     - Whether the REST part of the app should be included in the starting process.
       At least one of this and the ``karapace_registry`` option needs to be enabled
       for the service to start.
   * - ``karapace_registry``
     - ``true``
     - Whether the registry part of the app should be included in the starting process.
       At least one of this and the ``karapace_rest`` option needs to be enabled
       for the service to start.
   * - ``protobuf_runtime_directory``
     - ``runtime``
     - Runtime directory for the ``protoc`` protobuf schema parser and code generator.
   * - ``name_strategy``
     - ``topic_name``
     - Name strategy to use when storing schemas from the Kafka REST proxy service.
       You can opt between ``topic_name``, ``record_name`` and ``topic_record_name``.
   * - ``name_strategy_validation``
     - ``true``
     - If enabled, validate that the given schema is registered under the used name strategy when
       producing messages from Kafka REST.
   * - ``master_election_strategy``
     - ``lowest``
     - Decides on what basis the Karapace cluster master is chosen (only relevant in a multi-node setup).
   * - ``kafka_schema_reader_strict_mode``
     - ``false``
     - If enabled, causes the Karapace schema-registry service to shut down when there are invalid schema
       records in the ``_schemas`` topic.
   * - ``kafka_retriable_errors_silenced``
     - ``true``
     - If enabled, Kafka errors which can be retried, or custom errors specified for the service, will not
       be raised; instead, a warning log is emitted. This will denoise issue tracking systems, e.g. Sentry.
   * - ``use_protobuf_formatter``
     - ``false``
     - Whether the protobuf formatter should be used on protobuf schemas in order to normalize them. The
       formatter is applied on top of, and independently of, regular normalization, and schemas will be
       persisted in a formatted state.
   * - ``log_handler``
     - ``stdout``
     - Select the log handler. Default is standard output. An alternative log handler is ``systemd``.
   * - ``log_level``
     - ``DEBUG``
     - Logging level. Default level is debug.
   * - ``log_format``
     - ``%(name)-20s\t%(threadName)s\t%(levelname)-8s\t%(message)s``
     - Log format.
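Tying a few of these keys together, a minimal sketch of a configuration file for a registry-only
instance could look like the following; the values are illustrative, not recommendations
(``host`` is an empty string to listen on all available networks, as described above)::

  {
      "bootstrap_uri": "kafka-1:9092",
      "host": "",
      "port": 8081,
      "group_id": "schema-registry",
      "topic_name": "_schemas",
      "karapace_registry": true,
      "karapace_rest": false
  }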
Authentication and authorization of Karapace Schema Registry REST API
=====================================================================

To enable HTTP Basic Authentication and user authorization, the authorization configuration file is set in the main configuration key ``registry_authfile`` of Karapace.

The Karapace Schema Registry authorization file is an optional JSON configuration, which contains a list of authorized users in ``users`` and a list of access control rules in ``permissions``.

Each user entry contains the following attributes:

.. list-table::
   :header-rows: 1

   * - Parameter
     - Description
   * - ``username``
     - A string
   * - ``algorithm``
     - One of the supported hashing algorithms: ``scrypt``, ``sha1``, ``sha256``, or ``sha512``
   * - ``salt``
     - Salt used for hashing the password
   * - ``password_hash``
     - Hash string of the password calculated using the given algorithm and salt

Password hashing can be done using the ``karapace_mkpasswd`` tool, if installed, or by invoking it directly with ``python -m karapace.auth``. The tool generates a JSON entry with these fields::

  $ karapace_mkpasswd -u user -a sha512 secret
  {
      "username": "user",
      "algorithm": "sha512",
      "salt": "iuLouaExTeg9ypqTxqP-dw",
      "password_hash": "R6ghYSXdLGsq6hkQcg8wT4xkD4QToxBhlp7NerTnyB077M+mD2qiN7ZxXCDb4aE+5lExu5P11UpMPYAcVYxSQA=="
  }

Each access control rule contains the following attributes:

.. list-table::
   :header-rows: 1

   * - Parameter
     - Description
   * - ``username``
     - A string to match against the authenticated user
   * - ``operation``
     - Exact value of ``Read`` or ``Write``. Write also implies read permissions. Write includes all mutable operations, e.g. deleting schema versions
   * - ``resource``
     - A regular expression used to match against the accessed resource
Supported resource authorization:

.. list-table::
   :header-rows: 1

   * - Resource
     - Description
   * - ``Config:``
     - Controls authorization to the global schema registry configuration.
   * - ``Subject:<subject_name>``
     - Controls authorization to subjects. The ``<subject_name>`` is a regular expression to match against the accessed subject.

Example of a complete authorization file
----------------------------------------

::

    {
        "users": [
            {
                "username": "admin",
                "algorithm": "scrypt",
                "salt": "<put salt for randomized hashing here>",
                "password_hash": "<put hashed password here>"
            },
            {
                "username": "plainuser",
                "algorithm": "sha256",
                "salt": "<put salt for randomized hashing here>",
                "password_hash": "<put hashed password here>"
            }
        ],
        "permissions": [
            {
                "username": "admin",
                "operation": "Write",
                "resource": ".*"
            },
            {
                "username": "plainuser",
                "operation": "Read",
                "resource": "Subject:general.*"
            },
            {
                "username": "plainuser",
                "operation": "Read",
                "resource": "Config:"
            }
        ]
    }

Karapace Schema Registry access to the schemas topic
====================================================

The principal used by the Karapace Schema Registry has to have adequate access to the schemas topic (see the ``topic_name`` configuration option above).
In addition to what is required to access the topic, as described in the Confluent Schema Registry documentation_, the unique, single-member consumer group
used by consumers in the schema registry needs ``Describe`` and ``Read`` permissions_ on the group.
These unique (per instance of the schema registry) consumer group names are prefixed by ``karapace-autogenerated-``, followed by a random string.

.. _`documentation`: https://docs.confluent.io/platform/current/schema-registry/security/index.html#authorizing-access-to-the-schemas-topic
.. _`permissions`: https://docs.confluent.io/platform/current/kafka/authorization.html#group-resource-type-operations

OAuth2 authentication and authorization of Karapace REST proxy
==============================================================

The Karapace REST proxy supports passing OAuth2 credentials to the underlying Kafka service (defined in the ``sasl_bootstrap_uri`` configuration parameter). The JSON Web Token (JWT) is extracted from the ``Authorization`` HTTP header if the authorization scheme is ``Bearer``,
e.g. ``Authorization: Bearer $JWT``. If a ``Bearer`` token is present, the Kafka clients managed by Karapace will be created to use the SASL ``OAUTHBEARER`` mechanism, and the JWT will be passed along.
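For example, a request to the REST proxy carrying a Bearer token (the token value is a placeholder)
would look like this::

  $ curl -H "Authorization: Bearer $JWT" "http://localhost:8082/topics"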
The Karapace REST proxy does not verify the token; that is done by
the underlying Kafka service itself, if it is configured accordingly.

Authorization is also done by Kafka itself, typically using the ``sub`` claim (although this is configurable) from the JWT as the username, checked against the configured ACLs.

OAuth2 and ``Bearer`` token usage depends on the ``rest_authorization`` configuration parameter being ``true``.

Token expiry
------------

The REST proxy process manages a set of producer and consumer clients, which are identified by the OAuth2 JWT token. These are periodically cleaned up if they are idle, as well as *before* the JWT token expires (the cleanup currently runs every 5 minutes).

Before a client refreshes its OAuth2 JWT token, it is expected to remove currently running consumers (e.g. after committing their offsets) and producers using the current token.

Schema Normalization
--------------------

If specified as a REST parameter for the POST ``/subjects/{subject}/versions?normalize=true`` endpoint and the POST ``/subjects/{subject}?normalize=true`` endpoint,
Karapace uses a schema normalization algorithm to ensure that the schema is stored in a canonical form, so that
semantically equivalent schemas are stored in the same way and are considered equal.

Normalization is currently only supported for Protobuf schemas, and Karapace does not support all normalization features implemented by Confluent Schema Registry.
Currently, normalization only covers the ordering of the optional fields in the schema.
Use the feature with the assumption that it will be extended in the future, so two schemas that are semantically
equivalent could still be considered different by the normalization process in future versions of Karapace.
When normalizing, the safe failure mode is to consider two semantically equivalent schemas as different; the real problem arises when two semantically different schemas are considered equivalent.
In that view, future extensions of the normalization process are not considered breaking changes, but rather extensions of the normalization process.
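As an illustrative example (the subject name and Protobuf schema are made up for this sketch),
registering a schema in normalized form could look like::

  $ curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
    --data '{"schemaType": "PROTOBUF", "schema": "syntax = \"proto3\"; message Obj { int32 age = 1; }"}' \
    "http://localhost:8081/subjects/test-proto/versions?normalize=true"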
Uninstall
=========

To uninstall Karapace from the system you can follow the instructions described below. We would love to hear your reasons for uninstalling, though. Please file an issue if you experience any problems, or email us_ with feedback.

.. _`us`: mailto:opensource@aiven.io

Installed via Docker
--------------------

If you installed Karapace via Docker, you first need to stop and remove the containers as described below.

First obtain the container IDs related to Karapace; you should have one for the registry itself and another one for the REST interface::

    docker ps | grep karapace

After this, you can stop each of the containers with::

    docker stop <CONTAINER_ID>

If you don't need or want to have the Karapace containers around, you can now proceed to delete them using::

    docker rm <CONTAINER_ID>

Installed from Sources
----------------------

If Karapace was installed with ``pip install .``, it can be uninstalled with the following ``pip`` command::

    pip uninstall karapace

Development
===========

Execute ``make`` (GNU make, usually ``gmake`` on BSD and Mac) to set up a ``venv``
and install the required software for development. Use ``make unit-tests`` and
``make integration-tests`` to execute the respective test suite, or simply
``make test`` to execute both. You can set ``PYTEST_ARGS`` to customize the
execution (e.g. ``PYTEST_ARGS=--maxfail=1 make test``).

Karapace currently depends on various system software being installed. The
installation of these is automated for some operating systems, but not all. At
the time of writing, Java, the Protobuf compiler, and the Snappy shared library
are required to work with Karapace. You need to install them manually if your
operating system is not supported by the automatic installation scripts. Note
that the scripts will ask before installing any of these on your system.

Note that Karapace requires a Protobuf compiler older than 3.20.0, because
3.20.0 introduces various breaking changes. The tests will fail if the
Protobuf compiler is newer than that. However, you can work around this locally
by running ``pip install --upgrade protobuf`` in your venv. We are going to fix
this soon.

Note that the integration tests currently do not work on Mac. You can use
Docker instead; just be sure to set ``VENV_DIR`` to a directory outside the working
directory so that the container does not overwrite files from the host (e.g.
``docker run --env VENV_DIR=/tmp/venv ...``).

Note that the ``runtime`` directory **MUST** exist, and Karapace will fail
if it does not. The ``runtime`` directory is also not cleaned between test
runs, and leftover data might result in failing tests. The ``make`` test
targets correctly clean the ``runtime`` directory without deleting it, but
keep this in mind whenever you are not using ``make`` (e.g. when running tests
from your IDE).

Use ``pipx`` or ``brew`` to install ``pre-commit``; the project uses the globally
installed tool and does not declare a dependency on it.

License
=======

Karapace is licensed under the Apache license, version 2.0. The full license text is
available in the ``LICENSE`` file.

Please note that the project explicitly does not require a CLA (Contributor
License Agreement) from its contributors.

Contact
=======

Bug reports and patches are very welcome; please post them as GitHub issues
and pull requests at https://github.com/Aiven-Open/karapace. Any possible
vulnerabilities or other serious issues should be reported directly to the
maintainers <opensource@aiven.io>.

Trademark
=========

Apache Kafka is either a registered trademark or a trademark of the Apache Software Foundation in the United States and/or other countries. Kafka REST and Schema Registry are trademarks and property of their respective owners. All product and service names used in this page are for identification purposes only and do not imply endorsement.

Credits
=======

Karapace was created by, and is maintained by, Aiven_ cloud data hub
developers.

The schema-storing part of Karapace borrows heavily from the ideas of the
earlier Schema Registry implementation by Confluent, and thanks are in order
to them for pioneering the concept.
.. _`Aiven`: https://aiven.io/

Recent contributors are listed on the GitHub project page,
https://github.com/Aiven-Open/karapace/graphs/contributors

Copyright ⓒ 2021 Aiven Ltd.