{"id":19126296,"url":"https://github.com/santanusinha/dropwizard-db-sharding-bundle","last_synced_at":"2025-04-06T01:07:34.636Z","repository":{"id":5815263,"uuid":"53728911","full_name":"santanusinha/dropwizard-db-sharding-bundle","owner":"santanusinha","description":"A bundle for implementing application level sharding on traditional databases.","archived":false,"fork":false,"pushed_at":"2024-12-30T19:15:38.000Z","size":1272,"stargazers_count":25,"open_issues_count":14,"forks_count":53,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-03-30T00:06:34.203Z","etag":null,"topics":["dropwizard","dropwizard-bundle","java","mariadb","mysql"],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/santanusinha.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-03-12T11:50:53.000Z","updated_at":"2025-03-07T10:29:55.000Z","dependencies_parsed_at":"2024-09-18T06:43:27.547Z","dependency_job_id":"2e4c2e86-38b0-458c-a747-27e249fd012c","html_url":"https://github.com/santanusinha/dropwizard-db-sharding-bundle","commit_stats":null,"previous_names":[],"tags_count":39,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/santanusinha%2Fdropwizard-db-sharding-bundle","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/santanusinha%2Fdropwizard-db-sharding-bundle/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/santanusinha%2Fdropwizard-db-sharding-bundle/r
eleases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/santanusinha%2Fdropwizard-db-sharding-bundle/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/santanusinha","download_url":"https://codeload.github.com/santanusinha/dropwizard-db-sharding-bundle/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247419860,"owners_count":20936012,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dropwizard","dropwizard-bundle","java","mariadb","mysql"],"created_at":"2024-11-09T05:39:03.391Z","updated_at":"2025-04-06T01:07:34.603Z","avatar_url":"https://github.com/santanusinha.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Dropwizard DB Sharding Bundle\n\nApplication level sharding for traditional relational databases.\nApache Licensed\n\n## Sharding logic\n\n![Sharding logic depiction](resources/ApplicationLevelSharding.png)\n\n## Principle: Evenly distribute data irrespective of actual number of physical shards chosen by application owners.\n\nStrategy:\n\n* Pick a hashing algorithm with good distribution characteristics for most inputs.\n* Pick a large bucket size. Sharding key will be hashed to one of these buckets as a first step.\n* Using a large bucket size is expected to distribute keys fairly evenly. 
Buckets are grouped together to form intervals\n  with each interval mapping to a physical shard.\n  This bucket-to-shard mapping is created at application startup time.\n\n### Shard to bucket mapping\n\n* To distribute data evenly, a hashing algorithm is used to hash a key to one of a large number of\n  buckets.\n  Current bucket size (the number of buckets) is **1024**.\n\n* For bucket count K and physical shard count N, usually K \u003e\u003e N. Buckets are divided into intervals of size K/N and each\n  interval is mapped one-to-one to a physical shard.\n\n### Hashing algorithm for uniform sharding\n\n**Hashing.murmur3_128()** from the Guava library is used, which yields a 128-bit value corresponding to the hashing key.\nThis value is converted to an integer to get the bucket and, in turn, the physical shard on which the value for the key will be saved and\nfrom which it will be retrieved.\n\n### What happens if an application owner decides to change the number of physical shards, for example from 16 to 32?\n\nResharding will be required to move existing data to its new shard.\n\n## DAOs\n\nDAOs are objects used to interact with the datasource.\nThe DAO classes in the db sharding bundle are a layer above Hibernate. DAOs interact with the database via a Hibernate session.\n\n## Types of DAOs supported\n\n### RelationalDao\n\n* A DAO used to work with entities related to a parent shard. The parent may or may not be physically present.\n* A murmur 128 hash of the string parent key is used to route the save and retrieve calls to the proper shard.\n\n### LookupDao\n\n* A DAO to save and retrieve top-level entities in the system.\n* An entity persisted using LookupDao needs to have exactly one field annotated with @LookupKey. 
This field will\n  be used as the sharding key and hashed to the right shard by the same logic explained above.\n\n### CacheableLookupDao\n\n* A read-through/write-through wrapper over LookupDao.\n* It has a simple cache interface ```LookupCache``` with essential methods.\n* Any custom cache implementation can be used to implement this cache and initialize ```CacheableLookupDao```, e.g.\n  a Caffeine or Guava cache.\n\n### CacheableRelationalDao\n\n* A read-through/write-through wrapper over RelationalDao.\n* It has a simple cache interface ```RelationalCache``` with more methods compared to LookupCache.\n* Any custom cache implementation can be used to implement this cache and initialize ```CacheableRelationalDao```, e.g.\n  a Caffeine or Guava cache.\n\n### FAQs about DAOs\n\n#### If both RelationalDao and LookupDao use the same logic of sharding based on a key, what is the difference between these two DAOs?\n\nWhen should each one be used?\nThis can be slightly confusing to a beginner!\nRelationalDao is the right choice for all entities that can be seen as children of the same parent, while LookupDao\nis the right choice for an entity that does not seem to have a parent-child or sibling relationship with any other\nentity.\n\nExample: if a merchant is considered the parent entity, all entities such as merchant payment options,\nmerchant attributes, and merchant preference info can be treated as children of this parent entity and persisted\nusing a RelationalDao. 
Using a RelationalDao indicates that these entities are related and should be colocated\non the same shard to enable join/subselect queries etc.\n\nHowever, another entity such as an Agent table, which keeps info on agents who work on acquiring or helping multiple merchants\nand is not related to any other entity in the merchant database, may use a LookupDao, since it is a TOP-LEVEL entity in the system.\n\n#### What is the concept of parent key in RelationalDao?\n\nOne of the main requirements of sharding relational data is to colocate data for entities that might be\naccessed or retrieved together. For example, a merchant's profile, her store information, and her payment options might be\npulled together in some flows, so they should be located on the same shard for a merchant M.\nFor this purpose, the shard containing merchant M's primary profile info can be considered the parent shard and the merchant's Id\ncan be treated as the parent key.\nThis parent key can be used for persisting related entities for the merchant using RelationalDao.\n\n#### Is it necessary for different entities to use the same parent key?\n\nIt makes sense to pick and use the same parent key only for related entities.\n\n#### Is the sharding key specific to a table or to the entire db?\n\nA sharding key is required to shard an entity's data among different shards; it may have no relation in general\nwith any other entity. But entities are mostly related to each other, so picking one sharding key/parent key\nto persist a number of related entities helps to keep code predictable and maintainable.\n\n#### What is the alternative for persisting an entity that is a root entity in itself and has no clear parent?\n\nLookupDao can be used for this purpose. Unlike RelationalDao, LookupDao has the concept of LookupKey.\nOne of the fields in the entity can be annotated with the @LookupKey annotation to use this field for saving and retrieving\nthe entity. 
This field will be treated as the hashing key by the DAO.\n\n## Shard blacklisting\n\nSometimes one or more shards go bad due to a hardware or connectivity issue.\nIn that situation, the affected shard can be blacklisted to keep the service operational while the shard is fixed\nor the shard endpoint is changed.\n\n```ShardBlacklistingStore``` is an interface, which contains methods to:\n\n* blacklist a shard\n* unblacklist a shard\n* check if a shard is blacklisted\n\nSince this library bundle will be part of an application with multiple boxes, it is\neasier to implement ```ShardBlacklistingStore``` as an integration with a distributed cache or a designated service that\nkeeps track of the currently blacklisted shards for your backend service.\n\n## Features\n\n* Pagination support\n\n  Hibernate has built-in support for pagination.\n  DBShardingBundle has wrapper methods that internally call pagination-supporting Hibernate APIs such as `list(Criteria)`.\n  Example: `public List\u003cT\u003e select(String parentKey, DetachedCriteria criteria, int first, int numResults) throws Exception;`\n\n* Updating multiple rows at once - TBD\n\nPlease refer to the test classes for sample usage of the DAOs.\n\n## Usage\n\nThe project dependencies are:\n\n```\n\u003cdependency\u003e\n    \u003cgroupId\u003eio.appform.dropwizard.sharding\u003c/groupId\u003e\n    \u003cartifactId\u003edb-sharding-bundle\u003c/artifactId\u003e\n    \u003cversion\u003e2.1.10-5\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\nSet the number of shards by passing a Java system property when starting the application.\n\n### Example:\n\n- Initiating Sharding Bundle with namespace\n\n```\nBalancedDBShardingBundle\u003cConfiguration\u003e dbShardingBundle = new BalancedDBShardingBundle\u003cConfiguration\u003e(\n        \"your_namespace\", // This is optional\n        List.of(\"com.example.server.core.entities\")\n      ) {\n            @Override\n            protected 
ShardedHibernateFactory getConfig(Configuration appConfig) {\n                return appConfig.getShards();\n            }\n    };\n```\n\nWhen running your application, make sure to set the `-Dyour_namespace.db.shards=32` property. By default, `db.shards=2`.\n\n# NOTE\n\n- The package and group id changed from `io.dropwizard.sharding` to `io.appform.dropwizard.sharding` as of 1.3.12-3.\n- Static create* methods have been replaced with instance methods as of 1.3.13-4.\n- Java compatibility moved to 11 (-release 11) from 2.1.10-1 onwards.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsantanusinha%2Fdropwizard-db-sharding-bundle","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsantanusinha%2Fdropwizard-db-sharding-bundle","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsantanusinha%2Fdropwizard-db-sharding-bundle/lists"}