{"id":19167573,"url":"https://github.com/thomasweise/simplecluster","last_synced_at":"2026-05-18T14:02:53.400Z","repository":{"id":90948480,"uuid":"182232393","full_name":"thomasWeise/simpleCluster","owner":"thomasWeise","description":"A very simple implementation of a cluster scheduler.","archived":false,"fork":false,"pushed_at":"2019-05-09T02:13:23.000Z","size":86,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-01-03T23:12:06.090Z","etag":null,"topics":["clusters","distributed-computing","jobscheduler","linux","parallel-computing","scheduling","ubuntu"],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/thomasWeise.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-04-19T08:40:08.000Z","updated_at":"2019-05-09T02:13:25.000Z","dependencies_parsed_at":null,"dependency_job_id":"66922bc7-2c43-45f7-9b9d-5bac70824e61","html_url":"https://github.com/thomasWeise/simpleCluster","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thomasWeise%2FsimpleCluster","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thomasWeise%2FsimpleCluster/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thomasWeise%2FsimpleCluster/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thomasWeise%2FsimpleCluster/man
ifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/thomasWeise","download_url":"https://codeload.github.com/thomasWeise/simpleCluster/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240247619,"owners_count":19771342,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clusters","distributed-computing","jobscheduler","linux","parallel-computing","scheduling","ubuntu"],"created_at":"2024-11-09T09:38:20.807Z","updated_at":"2026-05-18T14:02:48.365Z","avatar_url":"https://github.com/thomasWeise.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# simpleCluster\n\n[\u003cimg alt=\"Travis CI Build Status\" src=\"http://img.shields.io/travis/thomasWeise/simpleCluster/master.svg\" height=\"20\"/\u003e](http://travis-ci.org/thomasWeise/simpleCluster/)\n\n## 1. 
Introduction\n\nThis is a very trivial job scheduling engine for small clusters.\nHere, using some [sophisticated systems](https://en.wikipedia.org/wiki/Comparison_of_cluster_software) like [TORQUE](https://en.wikipedia.org/wiki/TORQUE_Resource_Manager) or [SLURM](https://en.wikipedia.org/wiki/Slurm_Workload_Manager) is just too complicated, as they require a lot of installation and configuration effort.\nInstead, we aim to develop a very simple system for distributing jobs that does not require any installation (except for Java) and has one single entry point for job submission and execution.\nWe also do not care about rights management or systems security, as this system is strictly for personal use.\n\nThe idea is that all worker computers mount a shared directory under exactly the same path.\nYou should also mount this directory on your own PC.\nInside this shared directory, we put the binaries and data of all jobs to be executed.\nFor each job, you could create a directory, put the binary and data and a shell script for executing the job inside.\nThere also is a job queue file to which jobs can get appended and these are then executed by the workers.\nNo rights management, no ssh, no nothing spectacular.\nJust a simple jar archive for both submitting jobs to the queue and for launching jobs in worker threads.\nEach job is processed by a single worker thread.\nNo management of parallelism except for that is done, i.e., jobs may be programs that can spawn arbitrarily many own threads.\nHowever, you can tag a job as `blocksMachine` meaning that no other job can be executed in parallel on the same machine.\n\nThere is no central scheduling.\nInstead, the workers will query the job queue file for new tasks.\nVia a lock file, it is ensured that only one worker can read the queue file at once.\nFor each worker PC, at most one idle thread will be querying the central job queue file.\nSince there is no central scheduling, we do not need any cluster management software 
or resource management software.\n\nThe whole cluster management software (if we want to call it that) is super-small, just 22 KiB in size.\nDon't expect any high performance or great scalability.\nI don't care about that.\nI just want a simple distributed job executor that only needs a minimum software installation (i.e., Java).\nIt uses a simple text file and a lock file for queue management, so if you have more than 10 or so worker PCs and more than 1000 or so jobs at once, expect a significant performance decrease.\n\nFor any more sophisticated cluster usage, e.g., for one that will work with more nodes or has user management or other fancy features, please check [this list](https://en.wikipedia.org/wiki/Comparison_of_cluster_software).\n\nThis system has only been tested under [Ubuntu](https://en.wikipedia.org/wiki/Ubuntu) [Linux](https://en.wikipedia.org/wiki/Linux).\n\n## 2. Usage\n\nThe concept of our simple scheduler builds on shared directories.\nAll worker computers as well as the job-submitting computers must mount the shared directory of the cluster at exactly the same path.\nThe `simpleCluster.jar` must be executed in this shared directory.\n\n### 2.1. 
Job Submission\n\n`java -jar simpleCluster.jar submit cmd=COMMAND dir=DIRECTORY [times=TIMES] [blocksMachine]`\n\nEnter the command `COMMAND` to be executed in directory `DIRECTORY` into the job queue.\nIf `times` is specified, the command is added `TIMES` times.\nGenerally, the directory `DIRECTORY` must exist on all worker computers in the same path.\nUsually, this would be a shared, mounted directory.\nThe job executors will then execute the shell in that directory and write `COMMAND` to its stdin.\nThe parameter `times` allows you to submit the same command a number of times.\nA job tagged with `blocksMachine` will block all job execution on one machine.\nIt will only begin executing once all the threads on the machine are idle and no thread will begin or query for a new job until the blocking job is completed.\nThis is intended for jobs that either spawn their own threads or that require lots of memory or do heavy I/O and thus might disturb other jobs running in parallel.\n\n### 2.2. Job Execution\n\n`java -jar simpleCluster.jar run [cores=nCORES] [sh=/path/to/shell]`\n\nOn each worker PC, you should launch one instance of the job executor.\nIt will start the worker threads that pick up the jobs and execute them one after the other.\nVia `cores`, you can define the number of workers to launch.\nIf `cores` is not specified, the number of workers will be equal to the number of processor cores.\nIf `sh` is specified, it must be the path to the shell receiving the commands.\nIf `sh` is not specified, we will use the default shell.\nFor every command received, a new instance of the shell is launched and the command is piped to it.\nOnce the shell has terminated, the worker thread will query for the next command.\n\n### 2.3. 
Creating Shared Directory under Ubuntu\n\nOur scheduler is based on the use of shared directories.\nOn a server computer, a directory exists and is shared.\nThe client computers \"mount\" this shared directory under the same path.\nAll files copied to the shared directory by any actor (clients, workers, servers) who can access it will then be visible to all of them.\n\nFirst, we create a permanently shared directory on the server, which is nicely discussed [on this website](http://websiteforstudents.com/samba-setup-on-ubuntu-16-04-17-10-18-04-with-windows-systems/) and which I summarize here for Ubuntu.\n\n1. Install samba by doing `sudo apt-get install samba samba-common python-glade2 system-config-samba`.\n2. Add the directory to the samba configuration by editing the file `/etc/samba/smb.conf` as follows:\n3. do `sudo nano /etc/samba/smb.conf`\n4. add text like this to the bottom of the file, where `/cluster` is the path to the directory to share, `cluster` is the name under which it will be shared, and `USER` is the user name that you will use.\n```\n[cluster]\n   path = /cluster\n   writable = yes\n   guest ok = yes\n   guest only = yes\n   read only = no\n   create mode = 0777\n   directory mode = 0777\n   force user = USER\n```\n5. if the directory does not yet exist, create it via `sudo mkdir -p /cluster`\n6. do `sudo chown -R USER /cluster`\n7. do `sudo chmod -R 0775 /cluster`\n8. close nano and store the changes\n9. do `sudo service smbd restart`\n\n\n### 2.4. Mounting Shared Directories under Ubuntu\n\nTo permanently mount a shared directory under Linux, proceed as follows.\n\n1. Create the shared directory on the main file server computer as discussed in the previous section; let's call the share `cluster`.\n2. On each of the worker computers and on the computer from which you want to submit jobs, proceed as follows:\n   1. `sudo mkdir -p /cluster` (create the local cluster directory)\n   2. 
`sudo chown USER /cluster`, where `USER` is the user under which the cluster engine is executed\n   3. `sudo nano /etc/fstab` to edit the file system list\n   4. add the line `//SERVER_IP/cluster /cluster/ cifs guest,username=USER,password=PASSWORD,iocharset=utf8,file_mode=0777,dir_mode=0777,noperm 0 0`, where `SERVER_IP` is the IP address of the file server, and `USER` and `PASSWORD` are the user name and password.\n   5. save and exit `nano`\n   6. do `sudo mount -a`\n\nYou can now access the same shared directory, `/cluster`, from your job submission PC and from all worker PCs.\nNow you should copy the `simpleCluster.jar` there.\nStart it in Job Execution mode on each worker.\nYou can now create a sub-directory for each job under `/cluster` and put, say, shell scripts and data in there.\nFrom the job submission PC, you can then enqueue these scripts using the Job Submission mode. \n\n## 3. Licensing\n\nThis software is licensed under the GNU General Public License 3.0.\n\n## 4. Contact\n\nIf you have any questions or suggestions, please contact\n[Prof. Dr. Thomas Weise](http://iao.hfuu.edu.cn/team/director) of the\n[Institute of Applied Optimization](http://iao.hfuu.edu.cn/) at\n[Hefei University](http://www.hfuu.edu.cn) in\nHefei, Anhui, China via\nemail to [tweise@hfuu.edu.cn](mailto:tweise@hfuu.edu.cn) with CC to [tweise@ustc.edu.cn](mailto:tweise@ustc.edu.cn).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthomasweise%2Fsimplecluster","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthomasweise%2Fsimplecluster","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthomasweise%2Fsimplecluster/lists"}