{"id":18825504,"url":"https://github.com/grycap/ec4docker","last_synced_at":"2025-08-12T14:46:11.117Z","repository":{"id":118338081,"uuid":"51529045","full_name":"grycap/ec4docker","owner":"grycap","description":"EC4Docker - Elastic Cluster for Docker","archived":false,"fork":false,"pushed_at":"2016-09-23T08:50:40.000Z","size":65,"stargazers_count":8,"open_issues_count":1,"forks_count":4,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-04-14T01:44:47.676Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/grycap.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2016-02-11T16:43:03.000Z","updated_at":"2024-02-27T09:36:41.000Z","dependencies_parsed_at":"2023-07-10T14:16:12.156Z","dependency_job_id":null,"html_url":"https://github.com/grycap/ec4docker","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/grycap/ec4docker","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/grycap%2Fec4docker","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/grycap%2Fec4docker/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/grycap%2Fec4docker/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/grycap%2Fec4docker/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/grycap","download_url":"https://codeload.github.com/grycap/ec4docker/tar
.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/grycap%2Fec4docker/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":270080188,"owners_count":24523673,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-12T02:00:09.011Z","response_time":80,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-08T00:59:45.179Z","updated_at":"2025-08-12T14:46:11.100Z","avatar_url":"https://github.com/grycap.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# EC4Docker (Elastic Cluster for Docker)\n\n__EC4Docker__ is a simple Elastic Cluster whose nodes are containers. 
The cluster has a front-end that can be accessed over SSH, and the internal _working nodes_ are powered on or off on demand (nodes that are not used for a while are powered off, and they are powered on again when they are needed).\n\nFeatures of the cluster:\n- Front-end with SSH access.\n- Passwordless SSH access from the frontend to the working nodes.\n- Customizable number of working nodes.\n- Self-managed elasticity by using [CLUES](https://github.com/grycap/clues).\n- Shared filesystem from the frontend to the working nodes by using NFS.\n\nEC4Docker may seem of limited use because it is currently deployed on a single Docker host, but consider its integration with [Docker Swarm](https://www.docker.com/products/docker-swarm) and you'll have an Elastic Cluster that is deployed over a multi-node infrastructure.\n\n## How to use it\n1. Create your front-end and working node docker images.\n2. Edit the _ec4docker.config_ file to configure the cluster.\n3. Use the _setup-cluster_ script to start the cluster.\n4. Enter the cluster.\n\n## Building the docker images\nFirst, you need to choose the cluster manager middleware. 
Torque and SLURM are currently available, but you can create your own Dockerfiles according to your specific middleware.\n\nOnce selected, you need to build the _front-end_ and _working node_ base images by issuing the following commands:\n\n```bash\ndocker build -f frontend/Dockerfile.clues -t ec4docker:frontend ./frontend/\ndocker build -f wn/Dockerfile -t ec4docker:wn wn/\n```\n\nThen you need to create the images that correspond to the middleware:\n* For Torque, you can use the following commands:\n```bash\ndocker build -f frontend/Dockerfile.torque -t ec4dtorque:frontend ./frontend/\ndocker build -f wn/Dockerfile.torque -t ec4dtorque:wn wn/\n```\n\n* For SLURM, you can use the following commands:\n```bash\ndocker build -f frontend/Dockerfile.slurm -t ec4dslurm:frontend ./frontend/\ndocker build -f wn/Dockerfile.slurm -t ec4dslurm:wn wn/\n```\n\nThe images will be built and registered in your local registry.\n\nAlternatively, you can build the non-elastic version by not installing CLUES in the frontend. To do so, create the base images by issuing the following commands:\n\n```bash\ndocker build -f frontend/Dockerfile.static -t ec4docker:frontend ./frontend/\ndocker build -f wn/Dockerfile -t ec4docker:wn wn/\n```\n\nIn this case you need to power the nodes on or off by hand (using the scripts provided in the folder _/opt/ec4docker_).\n\n__NOTE__: you are advised to modify the Dockerfiles to include your libraries, applications, etc. and so customize your cluster. Another option is to build the provided Dockerfiles and create your own that start from the resulting images (you can check the _FROM_ clause in the Dockerfile).\n\n## Configure the cluster\nYou should create a config file (_ec4docker.config_) to set the name of your cluster (this name will be set for the front-end node in docker), the base name for the working nodes (they should be named as _basename_1, _basename_2, etc.) 
and the maximum number of working nodes. You must also set the names of the docker images according to the previous step.\n\nTwo examples are provided:\n* The file _ec4docker-torque.config_ for the case of Torque:\n```bash\nEC4DOCK_SERVERNAME=ec4docker\nEC4DOCK_MAXNODES=4\nEC4DOCK_FRONTEND_IMAGENAME=ec4dtorque:frontend\nEC4DOCK_WN_IMAGENAME=ec4dtorque:wn\nEC4DOCK_NODEBASENAME=ec4dockernode\n```\n\n* And the file _ec4docker-slurm.config_ for the case of SLURM:\n```bash\nEC4DOCK_SERVERNAME=ec4docker\nEC4DOCK_MAXNODES=4\nEC4DOCK_FRONTEND_IMAGENAME=ec4dslurm:frontend\nEC4DOCK_WN_IMAGENAME=ec4dslurm:wn\nEC4DOCK_NODEBASENAME=ec4dockernode\n```\n\n__NOTE__: In these files the cluster is named _ec4docker_ and the maximum number of working nodes is set to 4. You are advised to change the name of your frontend and the number of working nodes that will be available.\n\n## Create the cluster\nYou can use the script _setup-cluster_ to create the front-end of the cluster from the corresponding docker image. If the cluster already exists, this script will ask you to kill it.\n\n__IMPORTANT__: In order to be able to use the NFS shared filesystem, you __MUST__ enable the nfsd module in the kernel of the Docker hosts that run the containers.\n```bash\n$ modprobe nfsd\n```\n\nTo create your cluster, as defined in the _ec4docker-torque.config_ file, you can issue the following command:\n```bash\n$ ./ec4docker -ct -f ec4docker-torque.config\n```\n\n__NOTE__: The settings of the cluster are those set in the _ec4docker-torque.config_ file. Take note of those settings because you will need them in order to access the cluster. In particular, note the name of the cluster, which is set in _EC4DOCK_SERVERNAME_.\n\n__WARNING__: The cluster is created using a _Docker aside Docker_ approach. That means that the front-end will issue docker calls to create and to destroy the docker containers that will serve as working nodes of the cluster. 
However, these docker containers will be created on the docker host that started the front-end. In order to use this approach, the docker communication socket and the docker binary from the host are shared with the container.\n\n## Enter the cluster\nOnce the front-end has been created you can enter the front-end container and _su_ as the __ubuntu__ user (which is the only user created in the cluster). An example of the command line is shown below (the name of the container depends on your configuration; i.e. the _ec4docker.config_ file):\n\n```bash\n$ docker exec -it ec4docker /bin/bash\nroot@ec4docker:/$ su - ubuntu\n```\n\nAlternatively, you can SSH into the front-end. The SSH port is exposed when the frontend is created, so you can find the port on which the front-end listens by using the _docker port_ command:\n\n```bash\n$ docker port ec4docker\n22/tcp -\u003e 0.0.0.0:32770\n```\n\nIn this example, you can ssh to _ubuntu@localhost_ at port _32770_ with a command like the following (the default password is \"ubuntu\", and it is set in the Dockerfile):\n\n```bash\n$ ssh -p 32770 ubuntu@localhost\n```\n\nNow you can issue commands to the queue, and CLUES will intercept the call and will power on some working nodes in the cluster.\n\nFor example:\n```bash\n$ echo \"hostname \u0026\u0026 sleep 10\" | qsub\n1.ec4docker\n$ qstat                             \nJob id                    Name             User            Time Use S Queue\n------------------------- ---------------- --------------- -------- - -----\n1.ec4docker               STDIN            ubuntu                 0 R batch \n$ ls -l\ntotal 4\n-rw------- 1 ubuntu ubuntu  0 Feb 12 11:15 STDIN.e1\n-rw------- 1 ubuntu ubuntu 13 Feb 12 11:15 STDIN.o1\n```\n\n__NOTE__: For the non-elastic version, you can power on some nodes by hand from inside the front-end, by issuing commands like the following:\n```bash\n$ /opt/ec4docker/poweron ec4dockernode1\n$ /opt/ec4docker/poweron ec4dockernode2\n```\n\n## 
Troubleshooting\n\nIf any of the docker containers fails (for any reason), please check the output of the command ```docker logs \u003ccontainer\u003e```.\n\nSome common issues are:\n- __Docker fails at removing a container__ (i.e. the docker rm command fails) because it is in use. In this case you __need to__ try to remove the container by hand or (under some circumstances) restart the docker daemon.\n- __The nfsd module is not enabled__, so the mount point for the working nodes is not available. Torque cannot write to the shared folder and the execution of commands fails. In this case you should try to enable the nfsd module and restart the cluster in order to execute the bootstrapping process again.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgrycap%2Fec4docker","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgrycap%2Fec4docker","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgrycap%2Fec4docker/lists"}