{"id":19448364,"url":"https://github.com/carv-ics-forth/tebis","last_synced_at":"2025-06-14T23:38:38.757Z","repository":{"id":74033001,"uuid":"454406360","full_name":"CARV-ICS-FORTH/tebis","owner":"CARV-ICS-FORTH","description":"An efficient distributed key value store for fast storage devices and RDMA networks","archived":false,"fork":false,"pushed_at":"2024-07-25T13:10:06.000Z","size":18634,"stargazers_count":15,"open_issues_count":0,"forks_count":0,"subscribers_count":8,"default_branch":"master","last_synced_at":"2024-07-26T11:43:39.781Z","etag":null,"topics":["distributed","network","storage"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/CARV-ICS-FORTH.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-02-01T13:52:12.000Z","updated_at":"2024-07-25T13:10:14.000Z","dependencies_parsed_at":"2024-06-26T16:01:05.514Z","dependency_job_id":"92a20376-e678-462d-909e-ad8f1d2b957e","html_url":"https://github.com/CARV-ICS-FORTH/tebis","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CARV-ICS-FORTH%2Ftebis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CARV-ICS-FORTH%2Ftebis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CARV-ICS-FORTH%2Ftebis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/CARV-ICS-FORTH%2Ftebis/manifests","owner_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/owners/CARV-ICS-FORTH","download_url":"https://codeload.github.com/CARV-ICS-FORTH/tebis/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223978381,"owners_count":17235163,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["distributed","network","storage"],"created_at":"2024-11-10T16:26:25.678Z","updated_at":"2024-11-10T16:26:26.290Z","avatar_url":"https://github.com/CARV-ICS-FORTH.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Tebis\nTebis is a persistent LSM key value (KV) store designed for fast storage devices and RDMA networks. Tebis uses two main technologies:\n1. Hybrid KV placement via its [Parallax](https://github.com/CARV-ICS-FORTH/parallax) LSM KV store to reduce I/O amplification and increase CPU efficiency.\n2. Index shipping via CPU-efficient RDMA bulk network transfers to reduce the compaction overhead in replicas. 
Instead of repeating the compaction in replicas, the primary ships the index, which replicas rewrite to be valid for their own storage address space.\n\nMore details can be found in the EuroSys '22 paper [Tebis: Index Shipping for Efficient Replication in LSM Key-Value Stores](https://dl.acm.org/doi/abs/10.1145/3492321.3519572).\n\n# Project structure\n## The following folders contain\n- YCSB-CXX contains the C++ version of the YCSB benchmark along with a Tebis driver\n- tebis_rdma contains the RDMA utilities used in the project\n- tebis_server contains all the server-related code\n- tebis_rdma_client contains the client-side code of Tebis\n- The file `tebis_rdma_client/tebis_rdma_client.h` contains the public API of the client\n- tcp_server contains the code for a standalone tcp_server over Parallax; it is a separate cmake project. Detailed information is in the README.md of the tcp_server folder\n- tcp_client contains the code for the TCP client\n\n# Building Tebis\n**Note: It has been tested with gcc 10.1.0**\n\n## Build Dependencies\n\nTo build Tebis, the following libraries have to be installed on your system:\n* `libnuma` - Allocations with NUMA policy\n* `libibverbs` - InfiniBand verbs\n* `librdmacm` - RDMA Connection Manager\n* `libzookeeper_mt` - Zookeeper client bindings for C\n\nFor Mellanox cards, the InfiniBand and RDMA libraries are included in the software package provided by the vendor.\nAdditionally, Tebis uses cmake for its build system and the gcc and g++ compilers for its compilation.\n\n### Installing Dependencies on Ubuntu 18.04 LTS\n\nTebis requires CMake version \u003e= 3.11.0. 
On Ubuntu, you need to add the\nfollowing repository to get the latest stable version of CMake:\n\n\twget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2\u003e/dev/null | sudo apt-key add -\n\tsudo apt-add-repository 'deb https://apt.kitware.com/ubuntu/ bionic main'\n\tsudo apt update\n\nRun the following command with superuser privileges:\n\n\tsudo apt install libnuma-dev libibverbs-dev librdmacm-dev libzookeeper-mt-dev\n\nFor the build tools and compiler:\n\n\tsudo apt install cmake build-essential\n\n### Installing Dependencies on Centos/RHEL 7\n\nTebis requires CMake version \u003e= 3.11.0. On Centos/RHEL this is supplied from the\nEPEL repository and can be installed with:\n\n\tsudo yum install cmake3 kernel-devel gcc-c++\n\n\n#### Dependencies for Single Node Tebis\n\n\tsudo yum install numactl-devel boost-devel\n\n\nFor RDMA:\n\n\tsudo yum install libibverbs-devel librdmacm-devel\n\nYou also need to install ZooKeeper. Ready-made packages are available from\nCloudera.\n1. Add Cloudera's Centos 7 repository as described\n[here](https://docs.cloudera.com/documentation/enterprise/5-14-x/topics/cdh_ig_cdh5_install.html)\n2. Install the Zookeeper C binding for clients:\n\n\t    yum install zookeeper-native\n\n## Build Configuration\n\nCompilation is done using the clang compiler, provided by the clang package in\nmost Linux distributions. To configure the build system of Tebis and build it run\nthe commands:\n\n\tmkdir build\n\tcd build\n\tcmake .. 
-DTEBIS_FORMAT=ON\n\tmake\n\nOn Centos/RHEL 7, replace the `cmake` command with the `cmake3` command supplied\nfrom the EPEL package of the same name.\n\nIn the case of the standalone tcp-server you need to allocate a file for Parallax via the command\n\n\n`fallocate -l \u003csize in GB\u003eG \u003cfile name\u003e`\n\nThen, you need to initialize it with the kv_format tool of Parallax\n\n`\u003cBUILD_FOLDER\u003e_deps/parallax-build/lib/kv_format.parallax --device \u003cfile name\u003e --max_regions_num \u003cnumber of regions\u003e`\n\nFinally, execute\n\n`\u003cBUILD_FOLDER\u003e/tcp_server/tcp-server -t \u003cthreads num\u003e -b \u003cIP address\u003e -p \u003cport number\u003e -f \u003cfile name\u003e -L0 \u003cL0 size in MB\u003e -GF \u003cgrowth factor\u003e`\n\n\n\n## Build Configuration Parameters\n\nThe CMake scripts provided support two build configurations; \"Release\" and\n\"Debug\". The Debug configuration enables the \"-g\" option during compilation to\nallow debugging. The build configuration can be defined as a parameter to the\ncmake call as follows:\n\n\tcmake3 .. -DCMAKE_BUILD_TYPE=\"Debug|Release\"\n\nThe default build configuration is \"Debug\".\n\nThe \"Release\" build disables warnings and enables optimizations.\n\n## Build Targets\n* build/tebis_server/libtebis_client.a - Client library for applications to interact with Tebis\n* build/tebis_server/tebis_server - The executable of Tebis server\n* build/YCSB-CXXX/ycsb-async-tebis - YCSB that uses Tebis as its storage\n\n\n\n# Static Analyzer\n\nInstall the clang static analyzer with the command:\n\n\tsudo pip install scan-build\n\nBefore running the analyzer, make sure to delete any object files and\nexecutables from previous build by running in the root of the repository:\n\n\trm -r build\n\nThen generate a report using:\n\n\tscan-build --intercept-first make\n\nThe last line of the above command's output will mention the folder where the\nnewly created report resides in. 
For example:\n\n\t\"scan-build: Run 'scan-view /tmp/scan-build-2018-09-05-16-21-31-978968-9HK0UO'\n\tto examine bug reports.\"\n\nTo view the report you can run the above command, assuming you have a graphical\nenvironment or just copy the folder mentioned to a computer that does and open\nthe index.html file in that folder.\n\n\u003c!-- Development in cluster\n\nFor development in the cluster append the paths below in your PATH environment variable:\n\n\texport PATH=/archive/users/gxanth/llvm-project/build/bin:$PATH\n\texport PATH=/archive/users/gxanth/git/bin:$PATH\n\texport PATH=/archive/users/gxanth/gcc-9.1/bin:$PATH\n\texport LD_LIBRARY_PATH=/archive/users/gxanth/gcc-9.1/lib64:$LD_LIBRARY_PATH\n\texport PATH=$PATH:/archive/users/gxanth/go/bin\n\texport PATH=/archive/users/gxanth/shellcheck-stable:$PATH\n\texport PATH=$PATH:/archive/users/gxanth/go/bin\n\texport PATH=$PATH:$HOME/go/bin\n\texport CC=gcc-9.1\n\texport CXX=g++-9.1--\u003e\n\n\u003c!--# Install shfmt\n\nTo install shfmt run the command below in your shell:\n\n\tGO111MODULE=on go get mvdan.cc/sh/v3/cmd/shfmt\n\n\n# Generating compile_commands.json for Tebis\n\nInstall compdb for header awareness in compile_commands.json:\n\n\tpip install --user git+https://github.com/Sarcasm/compdb.git#egg=compdb\n\nAfter running cmake .. 
in the build directory run:\n\n\tcd ..\n\tcompdb -p build/ list \u003e compile_commands.json\n\tmv compile_commands.json build\n\tcd tebis\n\tln -sf ../build/compile_commands.json--\u003e\n\n# Pre commit hooks using pre-commit\n\n\u003c!--Add the path below to your PATH environment variable:\n\n    \tPATH=/archive/users/gxanth/git/bin:$PATH--\u003e\n\nTo install pre-commit:\n\n\tpip install pre-commit --user\n\tpre-commit --version\n\t2.2.0\n\nIf the above command does not print 2.2.0 you need to update python using:\n\n\tsudo yum update python3\n\nThen try upgrading pre-commit:\n\n\tpip install -U pre-commit --user\n\nTo install pre-commit hooks:\n\n\tcd tebis\n\tpre-commit install\n    pre-commit install --hook-type commit-msg\n\nIf everything worked as it should then the following message should be printed:\n\tpre-commit installed at .git/hooks/pre-commit\n\nIf you want to run a specific hook with a specific file run:\n\n\tpre-commit run hook-id --files filename\n\tpre-commit run cmake-format --files CMakeLists.txt\n\n## Running Tebis\nTebis uses RDMA for all network communication, which requires support from the\nnetwork interface to run. A software implementation (soft-RoCE) exists and can\nrun on all network interfaces.\n\n### Enabling soft-RoCE\nsoft-RoCE is part of the mainline Linux kernel versions since version 4.9\nthrough the `rdma_rxe` kernel module. To enable it for a network adapter the\nfollowing steps are required:\n\n#### 1. 
Install dependencies\nThe `ibverbs-utils` and `rdma-core` packages are required to enable soft-RoCE.\nThese packages should be in most distributions' repositories.\n\n##### Installing Dependencies on Ubuntu 18.04 LTS\nRun the following command from a terminal with root access (or use sudo):\n```\n# apt install ibverbs-utils rdma-core perftest\n```\n\n##### Enable soft-RoCE\nTo enable soft-RoCE on a network adapter, run the following commands with\nsuperuser privileges:\n```\nrxe_cfg start\nrxe_cfg add eth_interface\n```\nwhere `eth_interface` is the name of an ethernet network adapter interface. To\nview available network adapters run `ip a`.\n\nThe command `rxe_cfg start` has to be run at every boot to use RDMA features.\n\n##### Verify soft-RoCE is working\nTo verify that soft-RoCE is working, we can run a simple RDMA Write throughput\nbenchmark.\n\nFirst, open two shells, one to act as the server and one to act as the client.\nThen run the following commands:\n* On the server: `ib_write_bw`\n* On the client: `ib_write_bw eth_interface_ip`, where `eth_interface_ip` is\nthe IP address of a soft-RoCE enabled ethernet interface.\n\nExample output:\n* Server process:\n```\n************************************\n* Waiting for client to connect... *\n************************************\n---------------------------------------------------------------------------------------\n                    RDMA_Write BW Test\n Dual-port       : OFF\t\tDevice         : rxe0\n Number of qps   : 1\t\tTransport type : IB\n Connection type : RC\t\tUsing SRQ      : OFF\n CQ Moderation   : 100\n Mtu             : 1024[B]\n Link type       : Ethernet\n GID index       : 1\n Max inline data : 0[B]\n rdma_cm QPs\t : OFF\n Data ex. 
method : Ethernet\n---------------------------------------------------------------------------------------\n local address: LID 0000 QPN 0x0011 PSN 0x3341fd RKey 0x000204 VAddr 0x007f7e1b8fa000\n GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:122:205\n remote address: LID 0000 QPN 0x0012 PSN 0xbfbac5 RKey 0x000308 VAddr 0x007f70f5843000\n GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:122:205\n---------------------------------------------------------------------------------------\n #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]\n 65536      5000             847.44             827.84 \t\t   0.013245\n---------------------------------------------------------------------------------------\n```\n\n* Client process:\n```\n---------------------------------------------------------------------------------------\n                    RDMA_Write BW Test\n Dual-port       : OFF\t\tDevice         : rxe0\n Number of qps   : 1\t\tTransport type : IB\n Connection type : RC\t\tUsing SRQ      : OFF\n TX depth        : 128\n CQ Moderation   : 100\n Mtu             : 1024[B]\n Link type       : Ethernet\n GID index       : 1\n Max inline data : 0[B]\n rdma_cm QPs\t : OFF\n Data ex. 
method : Ethernet\n---------------------------------------------------------------------------------------\n local address: LID 0000 QPN 0x0012 PSN 0xbfbac5 RKey 0x000308 VAddr 0x007f70f5843000\n GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:122:205\n remote address: LID 0000 QPN 0x0011 PSN 0x3341fd RKey 0x000204 VAddr 0x007f7e1b8fa000\n GID: 00:00:00:00:00:00:00:00:00:00:255:255:192:168:122:205\n---------------------------------------------------------------------------------------\n #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]\n 65536      5000             847.44             827.84 \t\t   0.013245\n---------------------------------------------------------------------------------------\n```\n\nGit commit message template\n--------------------------------------------------------------------------------\n\n\tgit config commit.template .git-commit-template\n\n\n\n# Using cgroups to Limit Available Memory\nYou can use `cgroups` to limit the memory available to a process by running the\nprocess using `systemd-run`. In addition to memory allocations, pages in the Linux\nkernel's buffer cache count towards the `cgroups` memory limit. Example usage:\n```\n# systemd-run --unit=unit0 --scope --slice=slice0 --property MemoryLimit=16G \u003ccommand\u003e\n```\nStarting a `systemd` unit requires root privileges. The above example will limit\nthe memory available to a command (including pages in the buffer cache) to 16GB.\n\n# Running Tebis on a two server machine configuration\nFirst we need a Zookeeper server. For simplicity we assume that the Zookeeper service runs at zoo:2181. Then we\nneed to initialize Tebis metadata. 
This can be done through the command\n\u003ctebis_root_folder\u003e/scripts/tebis/tebis_zk_init.py \u003chosts_file\u003e \u003cregions_file\u003e \u003czookeeper_host\u003e\n\n**Hosts file:** Contains the servers of the cluster in the form \u003chost1:port_for_incoming_rdma_connections:0\u003e \u003crole leader or empty\u003e\n\nExample:\u003cbr\u003e\nsith2.cluster.ics.forth.gr:8080:0 leader #(so sith2.cluster.ics.forth.gr:8080 will be the initial leader of the system)\u003cbr\u003e\nsith3.cluster.ics.forth.gr:8080:0\u003cbr\u003e\nsith6.cluster.ics.forth.gr:8080:0\n\n\n**Regions file:** Contains the regions into which we split the key space \u003cbr\u003e\n\u003cregion_id\u003e \u003cmin_key_range\u003e \u003cmax_key_range\u003e \u003cserver1:port:1 (primary)\u003e \u003cserver2:port:1 (backup)\u003e\u003cbr\u003e\n\nExample:\u003cbr\u003e\n0 -oo MM sith2.cluster.ics.forth.gr:8080:1 sith3.cluster.ics.forth.gr:8080:1\u003cbr\u003e\n1 MM  ZZ sith3.cluster.ics.forth.gr:8080:1 sith6.cluster.ics.forth.gr:8080:1\u003cbr\u003e\n2 ZZ +oo sith6.cluster.ics.forth.gr:8080:1 sith2.cluster.ics.forth.gr:8080:1\u003cbr\u003e\n\nOn each Tebis server we need an allocated file where Tebis will store its data. Each server's storage capacity will be equal\nto the size of the file provided. 
The server will create its own file (or use dd or fallocate).\u003cbr\u003e\n*Example: `fallocate -l 100G /path/to/file`*\n\nThen we need to boot the leader of the Tebis rack first \u003cbr\u003e\n```\n\u003ctebis_build_root folder\u003e/tebis_server/tebis_server -d \u003cpath to tebis file\u003e -z \u003czk_host:zk_port\u003e -r \u003cRDMA IP subnet\u003e -p \u003cserver port\u003e -c \u003cnum of threads\u003e [-t \u003cLSM L0 size in keys\u003e] [-g \u003cgrowth factor\u003e] [-i \u003c\"send_index\" | \"build_index\"\u003e] [-s \u003cdevice size in GB\u003e]\n```\n\nexample: build/tebis_server/tebis_server -d /nvme/par1.dat -z sith2:2181 -r 192.168.4 -p 8080 -c 3 \u003cbr\u003e\nexample: build/tebis_server/tebis_server -d /nvme/par1.dat -z sith2:2181 -r 192.168.4 -p 8080 -c 3 -t 16 -g 10 -i send_index -s 100\n\n# Tests\ncd into folder `\u003cBUILD_ROOT_FOLDER\u003e/tests/` and type\ntest_krc_api zk_host:zk_port\n\n## Acknowledgements\nWe thankfully acknowledge the support of the European Commission under the Horizon 2020 Framework Programme for Research and Innovation through the project EVOLVE (Grant Agreement ID: 825061). This work is (also) partly supported by project EUPEX, which has received funding from the European High-Performance Computing Joint Undertaking (JU) under grant agreement No 101033975. The JU receives support from the European Union's Horizon 2020 research and innovation programme and France, Germany, Italy, Greece, United Kingdom, Czech Republic, Croatia.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcarv-ics-forth%2Ftebis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcarv-ics-forth%2Ftebis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcarv-ics-forth%2Ftebis/lists"}