{"id":16689006,"url":"https://github.com/roll-w/cloudhub","last_synced_at":"2025-10-01T00:31:42.617Z","repository":{"id":148381437,"uuid":"608645621","full_name":"roll-w/cloudhub","owner":"roll-w","description":"A scalable distributed file system. ","archived":false,"fork":false,"pushed_at":"2024-03-15T13:21:04.000Z","size":10749,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2024-11-19T12:05:55.714Z","etag":null,"topics":["distributed-storage","distributed-systems","grpc","java","java-17","protobuf","spring-boot"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/roll-w.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-03-02T13:00:06.000Z","updated_at":"2024-03-16T08:17:57.000Z","dependencies_parsed_at":"2023-11-25T11:24:43.825Z","dependency_job_id":"a59dbbd8-3d50-4a0b-8d0c-83f8c01586e9","html_url":"https://github.com/roll-w/cloudhub","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roll-w%2Fcloudhub","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roll-w%2Fcloudhub/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roll-w%2Fcloudhub/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roll-w%2Fcloudhub/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/roll-w
","download_url":"https://codeload.github.com/roll-w/cloudhub/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":234802521,"owners_count":18889073,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["distributed-storage","distributed-systems","grpc","java","java-17","protobuf","spring-boot"],"created_at":"2024-10-12T15:46:03.526Z","updated_at":"2025-10-01T00:31:37.202Z","avatar_url":"https://github.com/roll-w.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Cloudhub\n\nA high available, scalable distributed file system.\n\n## Requirements\n\n- Java 17\n\nYou can clone this project and build it with Maven: `mvn clean package`.\n\n## Getting Started\n\nAfter you have built the project or already have packaged files, \nyou can run the `meta-server` and `file-server` with the following steps:\n\n1. First you should unpack the packaged file with like `tar -zxvf starter.tar.gz`;\n2. Start `meta-server` with `sbin/start-meta-server.sh`;\n3. Start `file-server` with `sbin/start-file-server.sh`. 
(The servers can be started in any order.)\n\n\u003e Note: Before you start the servers, set the environment variables \n\u003e in the scripts if the `JAVA_HOME` environment variable is not already set.\n\nYou can run `sbin/start-meta-server.sh -h` or `sbin/start-file-server.sh -h`\nto get more information about the arguments.\n\n### Modify Configuration\n\nYou can modify the configuration in `resource/cloudhub.conf` before you pack the project or\nstart it locally for testing.\n\nAfter you have packed the project, you can modify the configuration in `conf/cloudhub.conf`.\n\n\u003e Note: The configuration file loading order is: \n\u003e \n\u003e Path specified by the `--config` argument \u003e `conf/cloudhub.conf` \u003e \n\u003e `cloudhub.conf` file in the current directory \u003e\n\u003e `resource/cloudhub.conf`.\n\n## API Usage\n\nRun the `mvn clean install` command to install the project to your local Maven repository.\n\nThen you can add the dependency to your project:\n\n```xml\n\u003cdependencies\u003e\n    \u003cdependency\u003e\n        \u003cgroupId\u003etech.rollw.cloudhub\u003c/groupId\u003e\n        \u003cartifactId\u003ecloudhub-client\u003c/artifactId\u003e\n        \u003cversion\u003e0.1.3\u003c/version\u003e\n    \u003c/dependency\u003e\n\u003c/dependencies\u003e\n```\n\n## Architecture\n\nCloudhub File System (CFS) is designed as a scalable distributed file system\nwith read-only and limited write operations.\n\nCFS aims to achieve BASE, sacrificing some consistency to ensure high availability of the system.\n\nFile operations after upload are limited to read and delete,\nso CFS is well suited to storing static data that rarely changes.\n\n### Server Architecture\n\nCFS is designed as a master-slave architecture. 
\nA complete CFS deployment includes a metadata server (`meta-server`) cluster\nand a file server (`file-server`) cluster.\n\nThe metadata server manages the file server cluster,\nmanages the replicas, and allocates requests.\nBackup metadata servers can be set up to achieve a certain degree of high availability.\n\nThe file server implements the storage of files in blocks.\n\n### File Storage Architecture\n\nFor file storage, CFS uses a key-value mapping method. \nThis makes it possible to quickly map files to specific file servers.\n\nCFS is designed for file storage, and all files are stored as they are,\nso any compression of files needs to be completed before uploading.\n\nFiles are divided into a large number of blocks when stored,\nand the blocks are stored in different containers.\nBlocks are the basic unit of communication and synchronization between file servers.\n\nContainers are designed as data structures that can scale in size according \nto the number of blocks (currently not implemented, but the corresponding \ninterface is reserved).\nEach container has a corresponding index, through which a file's location \ncan be quickly found.\n\nFiles are distributed by hash value, which spreads them evenly.\nUsually, there is no performance degradation from files clustering in\nthe same hot spot.\n\n### Data Structure\n\nCFS is designed to support the storage of large files.\n\nBy organizing many `block`s into `container`s,\nand then organizing them into `container group`s,\na `container group` can accommodate a large number of files.\n\nSupport for large files is limited by the disk size of the current server;\nsupport for larger files has not been completed.\n\n\n### Availability\n\nCFS is designed to remain partially usable in \nthe event of a failure, and to maintain the basic data storage and retrieval 
functions.\n\nWhen it is confirmed that one of the `file-server`s has failed (usually data corruption),\nthe `meta-server` will try to use the data of other replicas for repair.\n\n#### Heartbeat\n\nEvery `file-server` periodically sends heartbeat information to the `meta-server`.\nWhen no heartbeat is received from a `file-server` for a period of time,\nthe server is considered down and marked as dead,\nand no requests will be sent or forwarded to it.\n\nA `file-server` crash may reduce the number of replicas of some files,\nso the `meta-server` needs to confirm which file replicas are lost and start \nthe synchronization process if necessary.\n\nAfter a `file-server` is marked as dead, the synchronization process starts only after a \nlong delay (usually more than 10 minutes).\nThis prevents temporary losses caused by server or network\nfluctuations from triggering a large number of replication requests \nand causing a network storm.\n\n#### Data Integrity\n\nData can be corrupted during storage or transmission, for example by a disk failure or network failure.\n\nTo ensure that data is intact when it is retrieved,\nthe `meta-server` saves the hash value of each file after upload for later verification.\n\n### Consistency\n\nCFS always maintains a soft state: there is a data delay in the \nsynchronization of replicas between different file servers.\n\nThis is because, in the CFS design, files are effectively \nread-only in the system, and the container is the basic unit of \nsynchronization between replicas.\n\nOnce a container is synchronized, the file remains available \nin that replica regardless of whether the container changes later,\nso replicas do not need to be consistent at all times.\n\n## Contributing\n\nYou can contribute to this project by submitting issues or pull requests.\n\nFor major changes, please open an 
issue first to\ndiscuss what you would like to change.\n\n## License\n\n```text\nCloudhub - A high available, scalable distributed file system.\nCopyright (C) 2022 Cloudhub\n\nThis program is free software; you can redistribute it and/or modify\nit under the terms of the GNU General Public License as published by\nthe Free Software Foundation; either version 2 of the License, or\n(at your option) any later version.\n\nThis program is distributed in the hope that it will be useful,\nbut WITHOUT ANY WARRANTY; without even the implied warranty of\nMERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the\nGNU General Public License for more details.\n\nYou should have received a copy of the GNU General Public License along\nwith this program; if not, write to the Free Software Foundation, Inc.,\n51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Froll-w%2Fcloudhub","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Froll-w%2Fcloudhub","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Froll-w%2Fcloudhub/lists"}