Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/alfredbaudisch/ecto_instashard
Dynamic Instagram-like PostgreSQL sharding with Ecto
https://github.com/alfredbaudisch/ecto_instashard
Last synced: 14 days ago
JSON representation
Dynamic Instagram-like PostgreSQL sharding with Ecto
- Host: GitHub
- URL: https://github.com/alfredbaudisch/ecto_instashard
- Owner: alfredbaudisch
- License: apache-2.0
- Created: 2016-10-17T00:09:12.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2024-04-14T15:46:45.000Z (7 months ago)
- Last Synced: 2024-05-21T04:54:10.083Z (6 months ago)
- Language: Elixir
- Homepage:
- Size: 72.3 KB
- Stars: 55
- Watchers: 4
- Forks: 4
- Open Issues: 2
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Ecto.InstaShard
[![Hex.pm](https://img.shields.io/hexpm/v/ecto_instashard.svg?style=flat-square)](https://hex.pm/packages/ecto_instashard)
[![Hex.pm](https://img.shields.io/hexpm/dt/ecto_instashard.svg?style=flat-square)](https://hex.pm/packages/ecto_instashard)> Dynamic Instagram-like PostgreSQL sharding with Ecto
This library provides PostgreSQL physical (database) and logical (PostgreSQL schemas) sharding following [Instagram's pattern](http://media.postgresql.org/sfpug/instagram_sfpug.pdf) ([alternative](https://archive.org/details/postgresql_sharding_instagram)).
Documentation: http://hexdocs.pm/ecto_instashard/
## Main Features
- Unlimited databases and logical shards (PostgreSQL schemas). Ecto Repository modules are generated dynamically – according to sharding configuration.
- Id hashing to choose the related shard.
- Functions to get the dynamic repository related to a hashed id.
- Extract the shard id from a given item id.
- Functions to run queries in the correct physical and logical shard for a given item id.
- Support to multiple sharded PostgreSQL clusters.
- Dynamic creation of sharded schemas, `next_id` functions and tables (by providing the SQL scripts).Your sharded id column must be based on Instagram's `next_id` function. Each logical shard must have its own function. See [Database Scripts and Migrations](#scripts) below to see how the functions and shard can be generated dynamically.
## Repositories
An Ecto repository is created dynamically for each physical database provided as `Ecto.InstaShard.Repositories.[ShardName].N`, where N >= 0, representing the physical database position.## How to Use Ecto.InstaShard in an Application
Along with the following documentation, you can also see example configuration and usage in the tests.
### Installation
Add `ecto_instashard` to your list of dependencies in `mix.exs`:```elixir
def deps do
[{:ecto_instashard, "~> 0.6"}]
end
```### Repository Configuration
Add the Ecto **config data** for the repositories that you want to load and connect to with your application into your project `#{env}.exs` files.You have to specify details about the physical databases in a `List` and then details about the logical shards. The databases list contains individual Ecto config keyword lists.
As an example, consider a Chat Application (`:chat_app`) that will shard chat messages:
```elixir
message_databases = [
[
adapter: Ecto.Adapters.Postgres,
username: "postgres",
password: "pass",
database: "messages_0",
hostname: "localhost",
pool_size: 5
],
[
adapter: Ecto.Adapters.Postgres,
username: "postgres",
password: "pass",
database: "messages_1",
hostname: "localhost",
pool_size: 5
]
]
```Then provide the physical databases config list and logical shard details to the application config key `:message_databases`:
```elixir
config :chat_app, :message_databases, [
databases: message_databases,
count: Enum.count(message_databases),
logical_shards: 2048,
mapping: 1024
]
```Sharded config keys in details:
- **databases**: list of physical databases.
- **count**: amount of physical databases.
- **logical_shards** amount of logical shards distributed across the physical databases (must be an even number).
- **mapping**: amount of logical shards per physical database (must be an even number).If you are only using sharded repositories, at the bottom of your config set:
```elixir
config :chat_app, ecto_repos: []
```If you also have normal Ecto repositories in your application, you can configure and use them as you always do, example:
```elixir
config :chat_app, ecto_repos: [ChatApp.Repositories.Accounts]
```You can also define multiple sharded tables and even separate sharded cluster configuration (i.e. a cluster sharding messages and another cluster sharding upload data).
### Shard Module
Create a module for each of the sharded clusters/table that you need. The module must use `Ecto.InstaShard.Sharding.Setup` passing the necessary configuration.Example:
```elixir
defmodule ChatApp.Shards.Messages do
use Ecto.InstaShard.Sharding.Setup, config: [
config_key: :message_databases,
app_name: Mix.Project.config[:app],
base_module_name: ChatApp.ShardedRepositories,
# [base_module].[name][physical shard], i.e. ChatApp.ShardedRepositories.Messages0
name: "Messages",
# the sharded table
table: "messages",
scripts: [:messages],
worker_name: ChatApp.Repositories.Messages,
supervisor_name: ChatApp.Repositories.MessagesSupervisor
]
end
```### Add the Shard dynamic Ecto Repository to your Supervisor tree
Include the dynamic supervisor related to your Shards into your Supervisor tree using `[ShardModule].include_repository_supervisor`.
```elixir
defmodule ChatApp do
use Applicationdef start(_type, _args) do
import Supervisor.Spec, warn: falsechildren = []
|> ChatApp.Shards.Messages.include_repository_supervisor
# Your other childrenopts = [strategy: :one_for_one, name: ChatApp.Supervisor]
Supervisor.start_link(children, opts)
end
end
```### Database Scripts and Migrations
InstaShard can create your sharded tables and dynamic PostgreSQL schemas if you provide the related SQL scripts. Each DDL command must be separated by two blank lines.Replacement strings:
- `$shard_pos$`: logical shard position.
- `$shard_name$`: logical shard name, example: `shard0`.
- `$shard$`: logical shard name followed by dot `.`, example: `shard0.`. Usage: `$shard$some_table_name`.Save the scripts in the folder `scripts` in your application and fill the list of SQL scripts as atoms in the Shard configuration, in the `scripts` key (as per previous example).
Examples of scripts (these scripts are also inside `test/scripts`):
```sql
CREATE SCHEMA $shard_name$;CREATE SEQUENCE $shard_name$.message_seq;
CREATE OR REPLACE FUNCTION $shard_name$.next_id(OUT result bigint) AS $$
DECLARE
our_epoch bigint := 1314220021721;
seq_id bigint;
now_millis bigint;
shard_id int := $shard_pos$;
max_shard_id bigint := 2048;
BEGIN
SELECT nextval('$shard_name$.message_seq') % max_shard_id INTO seq_id;
SELECT FLOOR(EXTRACT(EPOCH FROM clock_timestamp()) * 1000) INTO now_millis;
result := (now_millis - our_epoch) << 23;
result := result | (shard_id << 10);
result := result | (seq_id);
END;
$$ LANGUAGE PLPGSQL;CREATE TABLE $shard_name$.messages (
id bigint not null default $shard_name$.next_id(),
user_id int not null,
message text NOT NULL,
inserted_at timestamp with time zone default now() not null,
PRIMARY KEY(id)
);
```To run the scripts and create the shards, call `[ShardModule].check_tables_exists`.
Example:
```elixir
ChatApp.Shards.Messages.check_tables_exists
```You can create a Mix task to run the command as a migration. Example:
```elixir
# lib/mix/tasks/chat_app.message_tables.ex
defmodule Mix.Tasks.ChatApp.MessageTables do
use Mix.Taskdef run(_) do
Mix.Task.run "app.start", []
ChatApp.Shards.Messages.check_tables_exists
end
end
```Run as `mix chat_app.message_tables`.
### Modules Usage
Helper functions are provided to allow consistent access across the physical and logical shards, by hashing the `user_id` owner of the operation/owner of the data. You must not manually call the sharded repository modules, i.e. `Ecto.InstaShard.Repositories.Messages1.insert(...)`, instead, the correct repository and shard is decided by helper functions that hash the `user_id` related to the operation.#### Get the Ecto Repository Module for a User Id
Call `[ShardModule].repository(user_id)` . Example:```elixir
repository = ChatApp.Shards.Messages.repository(message.user_id)
```#### Run include, update and delete for a User Id
Use the helper functions included in the modules `Ecto.InstaShard.Shards.[Shard]` to perform operations in the correct shard for a user id.- add_query_prefix/2 (to add the related user_id sharded PostgreSQL schema name to the table name)
- sharded_insert/3, sharded_select_query/3, sharded_query/2, sharded_query_all/2, sharded_query_one/2
- insert_all/3, insert_all/4 (supports schemaless changesets: `embedded_schema`)
- update_all/4, update_all/5, update_with_query/4
- delete_all/3, delete_all/4, delete_with_query/4
- get_all/4Examples:
```elixir
defmodule Message do
use Ecto.Schema
import Ecto.Changesetembedded_schema do
field :user_id, :integer
field :message, :map
end
end{_, [%{id: id}]} = ChatApp.Shards.Messages.sharded_insert(user.id, changeset.changes, returning: [:id])
ChatApp.Shards.Messages.update_all(user.id, [id: id],
[message: %{"content" => "foo"}]
)
ChatApp.Shards.Messages.delete_all(user.id, [id: id])
```#### Run raw sql queries for a User Id
Call `[ShardModule].run_sharded(user_id, sql)`.#### Get the shard name, shard number and table name for a User Id
Use the helper functions included in the shard module:- shard/1
- shard_name/1
- table/2, table/3## TODO
- [ ] Example project
- [ ] Helper functions documentation
- [ ] Helper functions in a separate macro
- [X] Support DDL in a single file instead of multiple files## License
Copyright 2016 Alfred Reinold Baudisch
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License athttp://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.