{"id":13448903,"url":"https://github.com/jervisfm/resqlite","last_synced_at":"2025-03-22T18:32:10.293Z","repository":{"id":72037985,"uuid":"110390545","full_name":"jervisfm/resqlite","owner":"jervisfm","description":"Replicated Sqlite Database built upon of the RAFT distributed consensus protocol","archived":false,"fork":false,"pushed_at":"2017-12-13T19:55:11.000Z","size":8921,"stargazers_count":10,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-14T21:55:00.609Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jervisfm.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2017-11-12T00:03:31.000Z","updated_at":"2025-01-04T22:05:27.000Z","dependencies_parsed_at":null,"dependency_job_id":"1aa0081c-ac8d-4d98-ad1c-a620bb8bd558","html_url":"https://github.com/jervisfm/resqlite","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jervisfm%2Fresqlite","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jervisfm%2Fresqlite/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jervisfm%2Fresqlite/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jervisfm%2Fresqlite/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jervisfm","download_url":"https://codeload.github.com/jervisfm/resqlite/tar.gz/refs/heads/master","host":{"name":"
GitHub","url":"https://github.com","kind":"github","repositories_count":245002885,"owners_count":20545511,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T06:00:24.141Z","updated_at":"2025-03-22T18:32:05.270Z","avatar_url":"https://github.com/jervisfm.png","language":"Go","readme":"# ReSqlite: Replicated Sqlite\n\n## Introduction\nReSqlite is an extension of Sqlite that aims to add basic replication functionality to Sqlite databases.\n\n\n## Collaborators\n* Henri\n* Jervis\n\n## Overview\nThe goal of this project is to add additional resiliency to Sqlite by supporting database replication\non multiple nodes in a consistent manner. \n\nAt a high level, the project aims to implement a basic version of the RAFT protocol and leverage that \nprotocol to add replication to Sqlite. \n\n## Raft Implementation\n\nFor educational purposes, we will implement our own RAFT protocol. However, given the time constraints, \nthe RAFT implementation will be simple in nature. The key functionality that we want is reliable log replication. \nNotably, we do not intend to support view changes in v0, and the clusters we work with are \nassumed to be fixed in size. \n\n## Sqlite Replication\nReplication of Sqlite will be done at the statement level. That is, we will be replicating \nraw SQL statements and using them to ensure all the nodes arrive at the same end state. 
Note that this means\nSQL statements need to be deterministic and cannot make use of non-deterministic functions such as random().\n\n## Project Setup / Dependencies\nFor implementation, we use the Go programming language. Please visit https://golang.org/dl/ to get the latest\nversion of Go. This project was created with Go 1.9.\n\nAlso install the dep tool for Go dependency management.\n\n```\n$ go get -u github.com/golang/dep/cmd/dep\n```\n\nFor RPC setup, we make use of the gRPC library from Google.\n\nA prerequisite is that you need to have the Protocol Buffers compiler set up. Grab one for your platform from https://github.com/google/protobuf/releases\nFor OS X, we used https://github.com/google/protobuf/releases/download/v3.4.0/protoc-3.4.0-osx-x86_64.zip. Add protoc to your\nenvironment $PATH.\n\nInstall the Protocol Buffer compiler plugin for Go:\n```\n$ go get -u github.com/golang/protobuf/protoc-gen-go\n```\n\nThis puts the tool under $GOPATH/bin, so make sure that's part of your $PATH.\n```\n$ export PATH=$PATH:$GOPATH/bin\n```\n\nYou also need a Go sqlite3 driver. 
The one we used is go-sqlite3 - https://github.com/mattn/go-sqlite3\n\nInstall it like so on OS X:\n```\n$ brew install sqlite3\n$ go get github.com/mattn/go-sqlite3\n$ go install github.com/mattn/go-sqlite3\n```\n\n### Raft Service\nTo simplify testing, our cluster will consist of 3 nodes so that we can tolerate any single node failure.\n\nFrom the Raft paper, we need to implement two RPCs:\n* RequestVotes\n* AppendEntries\n\nWe refer the reader to published page 308 of the RAFT paper (http://www.scs.stanford.edu/17au-cs244b/sched/readings/raft.pdf),\nwhich summarizes the semantics of these RPCs as well as what state (both persistent and volatile) needs to be maintained on each node.\n\nWe will implement these two RPCs using gRPC under a RaftService declaration.\n\n### ReSqlite Service\nWe will also have another service for interacting with Sqlite. The idea of this service is to take a SQL command as a request,\nexecute it on a sqlite instance, and then return the result of that execution. \n\nThe motivation for decoupling the Sqlite service from the Raft service is to have a clean separation of responsibilities between the\nservices. The Raft protocol does not care per se what the contents of the replicated log are.\n\n### Implementation Plan\nThe following is a proposed implementation structure for the project:\n\n* raft/raft.go: Implements the main raft replication business logic as a library\n    - To keep the logic simple, we would use an event-based approach for the main loop.\n* raft-cli/main.go (server implementation binary that can act as a basic raft node)\n    - To test the raft implementation we can start from scratch and use a constant set of data to replicate.\n\n* server/resqlite.go (Library that leverages Raft to replicate sqlite statements and implement the replicated sqlite service)\n* server/main.go (server binary that nodes run). 
This involves:\n    - Replicate the new entry to other nodes.\n    - Add it locally on success and execute it.\n    - Reply to the client with the result.\n\n* client/resqlite.go (library)\n* client/main.go (client binary) \u003c- Perhaps this can be replaced with polyglot (https://github.com/grpc-ecosystem/polyglot#server-reflection)\n\n### Pending Work Items\nThe main item left is implementing cluster replication. To get there we need to:\n\n* Add methods for accessing raft persistent state. \n    - Back it with in-memory state to start: DONE\n    - Migrate it to real durable storage later: DONE\n* Add a method to raft service to receive client command: DONE\n* Add sqlite dependency: DONE\n* Implement AppendEntries receiver.\n* Restore replicated state machine upon server startup.\n* Apply sql command.\n\n\nPerformance work:\n* Benchmark single node performance (w/o replication overhead)\n* Benchmark replicated cluster performance (3 nodes).\n\n### Testing\n\n#### Polyglot RPC Testing\n\nWe use polyglot for RPC testing: https://github.com/grpc-ecosystem/polyglot/releases/tag/v1.5.0\nCreate a test students table and add a value to it:\n```\n$ echo \"{ command: 'create table if not exists students(id integer primary key not null, name text); insert or replace into students(id, name) values(1, \\\"John\\\")' }\" | java -jar ~/bin/polyglot.jar  --command=call --endpoint=localhost:50050 --full_method=proto_raft.Raft/ClientCommand\n```\n\nRPC command to query it:\n```\n$ echo \"{ query: 'select * from students' }\" | java -jar ~/bin/polyglot.jar  --command=call --endpoint=localhost:50050 --full_method=proto_raft.Raft/ClientCommand\n```\n\n#### REPL based testing\n\nWhile our repl is still very much a work in progress, it can be used to test manual commands\nand responds to a subset of sqlite3 CLI syntax, with all non-deterministic operations removed\n(Transactions, Random, Now, etc).\n\nLaunch the repl from the resqlite folder with:\n```\ngo run main.go\n```\n\n\nThe repl can 
be used in batch mode to pre-load a number of commands from a file before\nawaiting user input:\n```\ngo run main.go -batch='../data/chinook.txt' -interactive=true\n```\n\n### Benchmarking\n\nWe use our repl to benchmark the performance of our resqlite implementation against sqlite3's CLI\n(we make the large assumption that CLI overhead is the same and that the results will be\nrepresentative of underlying db performance).\n\nNote: this is not quite apples to apples yet given that we are using an in-memory db for\nReSqlite; however, there are disk IOs since the logs are stored on disk.\n\nWe do so by launching the repl in non-interactive mode:\n```\ngo run main.go -batch='../data/chinook.txt'\n```\n\nWe chose four tests for our benchmarking:\n\n         File                   |        Description        \n    data/benchmarks/wOnly.db       ~15,000 consecutive writes\n    data/benchmarks/rOnly.db       ~15,000 consecutive reads\n    data/benchmarks/wHeavy.db      ~1,500 writes, 200 reads, repeated 3 times.\n    data/benchmarks/rHeavy.db      200 writes, ~1,500 reads, repeated 3 times.\n\nRunning these tests on a single-node cluster (taking the median of 3 runs), we find:\n\n    Test         |      DB      |                    Time\n    wOnly            ReSqlite          2.52s user 1.58s system 51% cpu 8.019 total\n    wOnly            Sqlite3           0.21s user 0.01s system 95% cpu 0.227 total\n    rOnly            ReSqlite          3.82s user 3.19s system 24% cpu 28.076 total\n    rOnly            Sqlite3           15.92s user 5.88s system 23% cpu 1:34.64 total (of which 1:22:25 from cat)\n    wHeavy           ReSqlite          1.45s user 0.85s system 55% cpu 4.136 total\n    wHeavy           Sqlite3           0.51s user 0.22s system 19% cpu 3.675 total\n    rHeavy           ReSqlite          1.41s user 0.81s system 51% cpu 4.341 total\n    rHeavy           Sqlite3           0.79s user 0.33s system 19% cpu 5.792 total\n\nNote: wOnly must be run before rOnly.\n\n\n### 
Debugging Issues\n\n* Leader election\n    - Yay, seems to be working now.\n    - Want to fix the follower loop burning CPU: FIXED\n    - Need to look at a goroutine leak. We die when running with the -race detector: FIXED\n\n* Databases\n    - File paths do not seem to be created properly: FIXED\n         - Just execute a db statement to get the file to be created.\n\n* Client Command not being processed: FIXED\nThe issue was that we were not handling the case of an unexpected request format.\n```\necho \"{  }\" | java -jar ~/bin/polyglot.jar  --command=call --endpoint=localhost:50050 --full_method=proto_raft.Raft/ClientCommand\n```","funding_links":[],"categories":["Relational Databases","distribute"],"sub_categories":["Raft"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjervisfm%2Fresqlite","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjervisfm%2Fresqlite","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjervisfm%2Fresqlite/lists"}