{"id":13454178,"url":"https://github.com/steemit/hivemind","last_synced_at":"2025-04-05T20:08:43.128Z","repository":{"id":47959284,"uuid":"84218934","full_name":"steemit/hivemind","owner":"steemit","description":"Developer-friendly microservice powering social networks on the Steem blockchain.","archived":false,"fork":false,"pushed_at":"2025-02-20T01:24:54.000Z","size":1608,"stargazers_count":73,"open_issues_count":46,"forks_count":66,"subscribers_count":30,"default_branch":"master","last_synced_at":"2025-03-29T19:07:34.488Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/steemit.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-03-07T16:02:59.000Z","updated_at":"2025-02-20T01:14:10.000Z","dependencies_parsed_at":"2023-01-21T05:45:22.574Z","dependency_job_id":"2171ad01-dc6c-4b23-8929-8bc5113487a6","html_url":"https://github.com/steemit/hivemind","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/steemit%2Fhivemind","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/steemit%2Fhivemind/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/steemit%2Fhivemind/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/steemit%2Fhivemind/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/steemit","download_url":"https://codeload.git
hub.com/steemit/hivemind/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247393570,"owners_count":20931813,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T08:00:51.630Z","updated_at":"2025-04-05T20:08:43.083Z","avatar_url":"https://github.com/steemit.png","language":"Python","funding_links":[],"categories":["Infrastructure"],"sub_categories":["Vote or Proxy a Witness"],"readme":"# Hivemind [BETA]\n\n#### Developer-friendly microservice powering social networks on the Steem blockchain.\n\nHive is a \"consensus interpretation\" layer for the Steem blockchain, maintaining the state of social features such as post feeds, follows, and communities. 
Written in Python, it synchronizes an SQL database with chain state, providing developers with a more flexible/extensible alternative to the raw steemd API.\n\n## Development Environment\n\n - Python 3.6 required\n - Postgres 10+ recommended\n\nDependencies:\n\n - OSX: `$ brew install python3 postgresql`\n - Ubuntu: `$ sudo apt-get install python3 python3-pip postgresql`\n\nInstallation:\n\nNote that you need to create a user, set a password, and assign a role in Postgres before continuing.\n\n```bash\n$ createdb hive\n$ export DATABASE_URL=postgresql://user:pass@localhost:5432/hive\n```\n\n```bash\n$ git clone https://github.com/steemit/hivemind.git\n$ cd hivemind\n$ pip3 install -e .[test]\n```\n\nStart the indexer:\n\n```bash\n$ hive sync\n```\n\n```bash\n$ hive status\n{'db_head_block': 19930833, 'db_head_time': '2018-02-16 21:37:36', 'db_head_age': 10}\n```\n\nStart the server:\n\n```bash\n$ hive server\n```\n\n```bash\n$ curl --data '{\"jsonrpc\":\"2.0\",\"id\":0,\"method\":\"hive.db_head_state\",\"params\":{}}' http://localhost:8080\n{\"jsonrpc\": \"2.0\", \"result\": {\"db_head_block\": 19930795, \"db_head_time\": \"2018-02-16 21:35:42\", \"db_head_age\": 10}, \"id\": 0}\n```\n\nRun tests:\n\n```bash\n$ make test\n```\n\n\n## Production Environment\n\nHivemind is deployed as a Docker container.\n\nHere is an example command that will initialize the DB schema and start the syncing process:\n\n```\ndocker run -d --name hivemind --env DATABASE_URL=postgresql://user:pass@hostname:5432/databasename --env STEEMD_URL=https://yoursteemnode --env SYNC_SERVICE=1 -p 8080:8080 steemit/hivemind:latest\n```\n\nBe sure to set `DATABASE_URL` to point to your Postgres database and `STEEMD_URL` to point to the steemd node to sync from.\n\nOnce the database is synced, Hivemind will be available for serving requests.\n\nTo follow the logs, use:\n\n```\ndocker logs -f hivemind\n```\n\n\n## Configuration\n\n| Environment              | CLI argument         | 
Default |\n| ------------------------ | -------------------- | ------- |\n| `LOG_LEVEL`              | `--log-level`        | INFO    |\n| `HTTP_SERVER_PORT`       | `--http-server-port` | 8080    |\n| `DATABASE_URL`           | `--database-url`     | postgresql://user:pass@localhost:5432/hive |\n| `STEEMD_URL`             | `--steemd-url`       | https://api.steemit.com |\n| `REDIS_URL`              | `--redis-url`        | redis://localhost:6379/ |\n| `MAX_BATCH`              | `--max-batch`        | 50      |\n| `MAX_WORKERS`            | `--max-workers`      | 4       |\n| `TRAIL_BLOCKS`           | `--trail-blocks`     | 2       |\n| `RECOMMEND_COMMUNITIES`  | `--recommend-communities` | hive-108451,hive-172186,hive-187187   |\n\nPrecedence: CLI over ENV over hive.conf. Check `hive --help` for details.\n\n\n## Requirements\n\n\n\n### Hardware\n\n - Focus on Postgres performance\n - 2.5GB of memory for `hive sync` process\n - 250GB storage for database\n\n\n### Steem config\n\nBuild flags\n\n - `LOW_MEMORY_NODE=OFF` - need post content\n - `CLEAR_VOTES=OFF` - need all vote data\n - `SKIP_BY_TX=ON` - tx lookup not used\n\nPlugins\n\n - Required: `reputation reputation_api database_api condenser_api block_api`\n - Not required: `follow*`, `tags*`, `market_history`, `account_history`, `witness`\n\n\n### Postgres Performance\n\nFor a system with 16G of memory, here's a good start:\n\n```\neffective_cache_size = 12GB # 50-75% of avail memory\nmaintenance_work_mem = 2GB\nrandom_page_cost = 1.0      # assuming SSD storage\nshared_buffers = 4GB        # 25% of memory\nwork_mem = 512MB\nsynchronous_commit = off\ncheckpoint_completion_target = 0.9\ncheckpoint_timeout = 30min\nmax_wal_size = 4GB\n```\n\n## JSON-RPC API\n\nThe minimum viable API is to remove the requirement for the `follow` and `tags` plugins (now rolled into [`condenser_api`](https://github.com/steemit/steem/blob/master/libraries/plugins/apis/condenser_api/condenser_api.cpp)) from the backend node while 
still being able to power condenser's non-wallet features. Thus, this is the core API set:\n\n```\ncondenser_api.get_followers\ncondenser_api.get_following\ncondenser_api.get_followers_by_page\ncondenser_api.get_following_by_page\ncondenser_api.get_follow_count\n\ncondenser_api.get_content\ncondenser_api.get_content_replies\n\ncondenser_api.get_state\n\ncondenser_api.get_trending_tags\n\ncondenser_api.get_discussions_by_trending\ncondenser_api.get_discussions_by_hot\ncondenser_api.get_discussions_by_promoted\ncondenser_api.get_discussions_by_created\n\ncondenser_api.get_discussions_by_blog\ncondenser_api.get_discussions_by_feed\ncondenser_api.get_discussions_by_comments\ncondenser_api.get_replies_by_last_update\n\ncondenser_api.get_blog\ncondenser_api.get_blog_entries\ncondenser_api.get_discussions_by_author_before_date\n```\n\n\n## Overview\n\n\n#### History\n\nInitially, the [steemit.com](https://steemit.com) app was powered exclusively by `steemd` nodes. It was purely a client-side app without *any* backend other than a public and permissionless API node. As powerful as this model is, there are two issues: (a) maintaining UI-specific indices/APIs becomes expensive when tightly coupled to critical consensus nodes; and (b) frontend developers must be able to iterate quickly and access data in flexible and creative ways without writing C++.\n\nTo relieve backend and frontend pressure, non-consensus and frontend-oriented concerns can be decoupled from `steemd` itself. This (a) allows the consensus node to focus on scalability and reliability, and (b) allows the frontend to maintain its own state layer, allowing for flexibility not feasible otherwise.\n\nSpecifically, the goal is to completely remove the `follow` and `tags` plugins, as well as `get_state` from the backend node itself, and re-implement them in `hive`. 
In doing so, we form the foundational infrastructure on which to implement communities and more.\n\n#### Purpose\n\n##### Hive tracks posts, relationships, social actions, custom operations, and derived states.\n\n - *discussions:* by blog, trending, hot, created, etc\n - *communities:* mod roles/actions, members, feeds (in 1.5; [spec](https://github.com/steemit/hivemind/blob/master/docs/communities.md))\n - *accounts:* normalized profile data, reputation\n - *feeds:* un/follows and un/reblogs\n\n##### Hive does not track most blockchain operations.\n\nFor anything to do with wallets, orders, escrow, keys, recovery, or account history, query SBDS or steemd.\n\n##### Hive can be extended or leveraged to create:\n\n - reactions, bookmarks\n - comments on reblogs\n - indexing custom profile data\n - reorganize old posts (categorize, filter, hide/show)\n - voting/polls (democratic or burn/send to vote)\n - modlists: (e.g. spammy, abuse, badtaste)\n - crowdsourced metadata\n - mentions indexing\n - full-text search\n - follow lists\n - bot tracking\n - mini-games\n - community bots\n\n#### Core indexer\n\nIngests blocks sequentially, processing operations relevant to accounts, post creations/edits/deletes, and custom_json ops for follows, reblogs, and communities. From these we build account and post lookup tables, follow/reblog state, and communities/members data. Built exclusively from raw blocks, it becomes the ground truth for internal state. Hive does not reimplement the logic for deriving payout values, reputation, and other statistics; the cache layer obtains these far more easily from steemd itself.\n\n#### Cache layer\n\nSynchronizes the latest state of posts and users, allowing us to serve discussions and lists of posts with all expected information (title, preview, image, payout, votes, etc) without needing `steemd`. This layer is first built once the initial core indexing is complete. 
Incoming blocks trigger cache updates (including recalculation of trending score) for any posts referenced in `comment` or `vote` operations. A sweep over paid-out posts ensures they are updated in full with their final state.\n\n#### API layer\n\nPerforms queries against the core and cache tables, merging them into a response in such a way that the frontend will not need to perform any additional calls to `steemd` itself. The initial API simply mimics steemd's `condenser_api` for backwards compatibility, but will be extended to leverage new opportunities and simplify application development.\n\n\n#### Fork Resolution\n\n**Latency vs. consistency vs. complexity**\n\nThe easiest way to avoid forks is to only index up to the last irreversible block, but the delay is too long where users expect quick feedback, e.g. votes and live discussions. We can apply the following approach:\n\n1. Follow the chain as closely to `head_block` as possible\n2. Indexer trails a few blocks behind, by no more than 6s - 9s\n3. If missed blocks are detected, back off from `head_block`\n4. Database constraints on block linking detect failure as soon as possible\n5. If a fork is encountered between `hive_head` and `steem_head`, recovery is trivial\n6. Otherwise, pop blocks until in sync. Inconsistent state is possible but rare for `TRAIL_BLOCKS \u003e 1`.\n7. A separate service with a greater follow distance creates periodic snapshots\n\n\n## Documentation\n\n```bash\n$ make docs \u0026\u0026 open docs/hive/index.html\n```\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsteemit%2Fhivemind","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsteemit%2Fhivemind","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsteemit%2Fhivemind/lists"}