{"id":25616647,"url":"https://github.com/twitter/flockdb","last_synced_at":"2025-02-22T04:01:20.804Z","repository":{"id":867588,"uuid":"605999","full_name":"twitter-archive/flockdb","owner":"twitter-archive","description":"A distributed, fault-tolerant graph database","archived":true,"fork":false,"pushed_at":"2017-03-16T23:11:18.000Z","size":5392,"stargazers_count":3337,"open_issues_count":27,"forks_count":257,"subscribers_count":277,"default_branch":"master","last_synced_at":"2025-02-09T19:01:46.779Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Scala","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/twitter-archive.png","metadata":{"files":{"readme":"README.markdown","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2010-04-12T03:53:45.000Z","updated_at":"2025-02-08T04:49:34.000Z","dependencies_parsed_at":"2022-08-16T11:15:17.802Z","dependency_job_id":null,"html_url":"https://github.com/twitter-archive/flockdb","commit_stats":null,"previous_names":["twitter/flockdb"],"tags_count":21,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/twitter-archive%2Fflockdb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/twitter-archive%2Fflockdb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/twitter-archive%2Fflockdb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/twitter-archive%2Fflockdb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/twitter-archive","download_url":"https://codeload.github.com/twitter-archive/flockdb/tar.gz/refs/heads/master","host":{"name":"
GitHub","url":"https://github.com","kind":"github","repositories_count":240122591,"owners_count":19751143,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-02-22T04:01:13.223Z","updated_at":"2025-02-22T04:01:20.799Z","avatar_url":"https://github.com/twitter-archive.png","language":"Scala","readme":"# STATUS\n\nTwitter is no longer maintaining this project or responding to issues or PRs.\n\n# FlockDB\n\nFlockDB is a distributed graph database for storing adjacency lists, with\ngoals of supporting:\n\n- a high rate of add/update/remove operations\n- potentially complex set arithmetic queries\n- paging through query result sets containing millions of entries\n- ability to \"archive\" and later restore archived edges\n- horizontal scaling including replication\n- online data migration\n\nNon-goals include:\n\n- multi-hop queries (or graph-walking queries)\n- automatic shard migrations\n\nFlockDB is much simpler than other graph databases such as neo4j because it\ntries to solve fewer problems. It scales horizontally and is designed for\non-line, low-latency, high throughput environments such as web-sites.\n\nTwitter uses FlockDB to store social graphs (who follows whom, who blocks\nwhom) and secondary indices. 
As of April 2010, the Twitter FlockDB cluster\nstores 13+ billion edges and sustains peak traffic of 20k writes/second and\n100k reads/second.\n\n\n# It does what?\n\nIf, for example, you're storing a social graph (user A follows user B), and\nit's not necessarily symmetrical (A can follow B without B following A), then\nFlockDB can store that relationship as an edge: node A points to node B. It\nstores this edge with a sort position, and in both directions, so that it can\nanswer the question \"Who follows A?\" as well as \"Whom is A following?\"\n\nThis is called a directed graph. (Technically, FlockDB stores the adjacency\nlists of a directed graph.) Each edge has a 64-bit source ID, a 64-bit\ndestination ID, a state (normal, removed, archived), and a 32-bit position\nused for sorting. The edges are stored in both a forward and backward\ndirection, meaning that an edge can be queried based on either the source or\ndestination ID.\n\nFor example, if node 134 points to node 90, and its sort position is 5, then\nthere are two rows written into the backing store:\n\n    forward: 134 -\u003e 90 at position 5\n    backward: 90 \u003c- 134 at position 5\n\nIf you're storing a social graph, the graph might be called \"following\", and\nyou might use the current time as the position, so that a listing of followers\nis in recency order. In that case, if user 134 is Nick, and user 90 is Robey,\nthen FlockDB can store:\n\n    forward: Nick follows Robey at 9:54 today\n    backward: Robey is followed by Nick at 9:54 today\n\nThe (source, destination) must be unique: only one edge can point from node A\nto node B, but the position and state may be modified at any time. Position is\nused only for sorting the results of queries, and state is used to mark edges\nthat have been removed or archived (placed into cold sleep).\n\n\n# Building\n\nIn theory, building is as simple as\n\n    $ sbt clean update package-dist\n\nbut there are some pre-requisites. 
You need:\n\n- java 1.6\n- sbt 0.7.4\n- thrift 0.5.0\n\nIf you haven't used sbt before, this page has a quick setup:\n[http://code.google.com/p/simple-build-tool/wiki/Setup](http://code.google.com/p/simple-build-tool/wiki/Setup).\nMy `~/bin/sbt` looks like this:\n\n    #!/bin/bash\n    java -server -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=256m -Xmx1024m -jar `dirname $0`/sbt-launch-0.7.4.jar \"$@\"\n\nApache Thrift 0.5.0 is a prerequisite for building the Java stubs of the thrift\nIDL. It can't be installed via jar, so you'll need to install it separately\nbefore you build. It can be found on the Apache Thrift site:\n[http://thrift.apache.org/](http://thrift.apache.org/).\nYou can find the download for 0.5.0 here:\n[http://archive.apache.org/dist/incubator/thrift/0.5.0-incubating/](http://archive.apache.org/dist/incubator/thrift/0.5.0-incubating/).\n\nIn addition, the tests require a local mysql instance to be running, and for the\n`DB_USERNAME` and `DB_PASSWORD` env vars to contain login info for it. 
You can\nskip the tests if you want (but you should feel a pang of guilt):\n\n    $ NO_TESTS=1 sbt package-dist\n\n\n# Running\n\nCheck out\n[the demo](http://github.com/twitter/flockdb/blob/master/doc/demo.markdown)\nfor instructions on how to start up a local development instance of FlockDB.\nIt also shows how to add edges, query them, etc.\n\n\n# Community\n\n- Twitter: #flockdb\n- IRC: #twinfra on freenode (irc.freenode.net)\n- Mailing list: \u003cflockdb@googlegroups.com\u003e [subscribe](http://groups.google.com/group/flockdb)\n\n\n# Contributors\n\n- Nick Kallen @nk\n- Robey Pointer @robey\n- John Kalucki @jkalucki\n- Ed Ceaser @asdf\n","funding_links":[],"categories":["NoSQL","NOSQL","Scala","Graph Data Model"],"sub_categories":["Error Monitoring"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftwitter%2Fflockdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftwitter%2Fflockdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftwitter%2Fflockdb/lists"}