{"id":13421800,"url":"https://github.com/ml-tooling/ml-workspace","last_synced_at":"2025-05-14T18:05:19.834Z","repository":{"id":37925559,"uuid":"188880197","full_name":"ml-tooling/ml-workspace","owner":"ml-tooling","description":"🛠 All-in-one web-based IDE specialized for machine learning and data science.","archived":false,"fork":false,"pushed_at":"2024-07-26T07:39:39.000Z","size":13331,"stargazers_count":3484,"open_issues_count":1,"forks_count":453,"subscribers_count":72,"default_branch":"main","last_synced_at":"2025-04-10T04:53:46.267Z","etag":null,"topics":["anaconda","data-analysis","data-science","data-visualization","deep-learning","docker","gpu","jupyter","jupyter-lab","jupyter-notebook","kubernetes","machine-learning","neural-networks","nlp","python","pytorch","r","scikit-learn","tensorflow","vscode"],"latest_commit_sha":null,"homepage":"https://mltooling.org/ml-workspace","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ml-tooling.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":".github/CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":"SECURITY.md","support":".github/SUPPORT.md","governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-05-27T16:55:15.000Z","updated_at":"2025-04-09T08:31:29.000Z","dependencies_parsed_at":"2024-06-21T17:31:10.493Z","dependency_job_id":"8048de83-11fa-4c7e-b925-e1790446546e","html_url":"https://github.com/ml-tooling/ml-workspace","commit_stats":{"total_commits":762,"total_committers":13,"mean_commits":58.61538461538461,"dds":"0.17847769028871396","last_synced_commit":"024c4053d39241906b1becdb49b34146ac66463a"},"previous_na
mes":[],"tags_count":34,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ml-tooling%2Fml-workspace","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ml-tooling%2Fml-workspace/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ml-tooling%2Fml-workspace/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ml-tooling%2Fml-workspace/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ml-tooling","download_url":"https://codeload.github.com/ml-tooling/ml-workspace/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254198514,"owners_count":22030965,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anaconda","data-analysis","data-science","data-visualization","deep-learning","docker","gpu","jupyter","jupyter-lab","jupyter-notebook","kubernetes","machine-learning","neural-networks","nlp","python","pytorch","r","scikit-learn","tensorflow","vscode"],"created_at":"2024-07-30T23:00:30.208Z","updated_at":"2025-05-14T18:05:14.826Z","avatar_url":"https://github.com/ml-tooling.png","language":"Jupyter Notebook","readme":"\u003ch1 align=\"center\"\u003e\n    \u003ca href=\"https://github.com/ml-tooling/ml-workspace\" title=\"ML Workspace Home\"\u003e\n    \u003cimg width=50% alt=\"\" src=\"https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/ml-workspace-logo.png\"\u003e \u003c/a\u003e\n    \u003cbr\u003e\n\u003c/h1\u003e\n\n\u003cp 
align=\"center\"\u003e\n    \u003cstrong\u003eAll-in-one web-based development environment for machine learning\u003c/strong\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n    \u003ca href=\"https://hub.docker.com/r/mltooling/ml-workspace\" title=\"Docker Image Version\"\u003e\u003cimg src=\"https://img.shields.io/docker/v/mltooling/ml-workspace?color=blue\u0026sort=semver\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://hub.docker.com/r/mltooling/ml-workspace\" title=\"Docker Pulls\"\u003e\u003cimg src=\"https://img.shields.io/docker/pulls/mltooling/ml-workspace.svg?color=blue\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://hub.docker.com/r/mltooling/ml-workspace\" title=\"Docker Image Size\"\u003e\u003cimg src=\"https://img.shields.io/docker/image-size/mltooling/ml-workspace?color=blue\u0026sort=semver\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://gitter.im/ml-tooling/ml-workspace\" title=\"Chat on Gitter\"\u003e\u003cimg src=\"https://badges.gitter.im/ml-tooling/ml-workspace.svg\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://mltooling.substack.com/subscribe\" title=\"Subscribe to newsletter\"\u003e\u003cimg src=\"http://bit.ly/2Md9rxM\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://twitter.com/mltooling\" title=\"Follow on Twitter\"\u003e\u003cimg src=\"https://img.shields.io/twitter/follow/mltooling.svg?style=social\u0026label=Follow\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"#getting-started\"\u003eGetting Started\u003c/a\u003e •\n  \u003ca href=\"#features\"\u003eFeatures \u0026 Screenshots\u003c/a\u003e •\n  \u003ca href=\"#support\"\u003eSupport\u003c/a\u003e •\n  \u003ca href=\"https://github.com/ml-tooling/ml-workspace/issues/new?labels=bug\u0026template=01_bug-report.md\"\u003eReport a Bug\u003c/a\u003e •\n  \u003ca href=\"#faq\"\u003eFAQ\u003c/a\u003e •\n  \u003ca href=\"#known-issues\"\u003eKnown Issues\u003c/a\u003e •\n  \u003ca 
href=\"#contribution\"\u003eContribution\u003c/a\u003e\n\u003c/p\u003e\n\nThe ML workspace is an all-in-one web-based IDE specialized for machine learning and data science. It is simple to deploy and gets you started within minutes to productively build ML solutions on your own machines. This workspace is the ultimate tool for developers, preloaded with a variety of popular data science libraries (e.g., Tensorflow, PyTorch, Keras, Sklearn) and dev tools (e.g., Jupyter, VS Code, Tensorboard) perfectly configured, optimized, and integrated.\n\n## Highlights\n\n- 💫\u0026nbsp; Jupyter, JupyterLab, and Visual Studio Code web-based IDEs.\n- 🗃\u0026nbsp; Pre-installed with many popular data science libraries \u0026 tools.\n- 🖥\u0026nbsp; Full Linux desktop GUI accessible via web browser.\n- 🔀\u0026nbsp; Seamless Git integration optimized for notebooks.\n- 📈\u0026nbsp; Integrated hardware \u0026 training monitoring via Tensorboard \u0026 Netdata.\n- 🚪\u0026nbsp; Access from anywhere via Web, SSH, or VNC under a single port.\n- 🎛\u0026nbsp; Usable as remote kernel (Jupyter) or remote machine (VS Code) via SSH.\n- 🐳\u0026nbsp; Easy to deploy on Mac, Linux, and Windows via Docker.\n\n\u003cbr\u003e\n\n## Getting Started\n\n\u003cp\u003e\n\u003ca href=\"https://labs.play-with-docker.com/?stack=https://raw.githubusercontent.com/ml-tooling/ml-workspace/main/deployment/play-with-docker/docker-compose.yml\" title=\"Docker Image Metadata\" target=\"_blank\"\u003e\u003cimg src=\"https://cdn.rawgit.com/play-with-docker/stacks/cff22438/assets/images/button.png\" alt=\"Try in PWD\" width=\"100px\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n### Prerequisites\n\nThe workspace requires **Docker** to be installed on your machine ([📖 Installation Guide](https://docs.docker.com/install/#supported-platforms)).\n\n### Start single instance\n\nDeploying a single workspace instance is as simple as:\n\n```bash\ndocker run -p 8080:8080 mltooling/ml-workspace:0.13.2\n```\n\nVoilà, that was easy! 
Now, Docker will pull the latest workspace image to your machine. This may take a few minutes, depending on your internet speed. Once the workspace is started, you can access it via http://localhost:8080.\n\n\u003e _If started on another machine or with a different port, make sure to use the machine's IP/DNS and/or the exposed port._\n\nTo deploy a single instance for productive usage, we recommend applying at least the following options:\n\n```bash\ndocker run -d \\\n    -p 8080:8080 \\\n    --name \"ml-workspace\" \\\n    -v \"${PWD}:/workspace\" \\\n    --env AUTHENTICATE_VIA_JUPYTER=\"mytoken\" \\\n    --shm-size 512m \\\n    --restart always \\\n    mltooling/ml-workspace:0.13.2\n```\n\nThis command runs the container in the background (`-d`), mounts your current working directory into the `/workspace` folder (`-v`), secures the workspace via a provided token (`--env AUTHENTICATE_VIA_JUPYTER`), provides 512MB of shared memory (`--shm-size`) to prevent unexpected crashes (see [known issues section](#known-issues)), and keeps the container running even on system restarts (`--restart always`). 
You can find additional options for docker run [here](https://docs.docker.com/engine/reference/commandline/run/) and workspace configuration options in [the section below](#Configuration).\n\n### Configuration Options\n\nThe workspace provides a variety of configuration options that can be used by setting environment variables (via docker run option: `--env`).\n\n\u003cdetails\u003e\n\u003csummary\u003eConfiguration options (click to expand...)\u003c/summary\u003e\n\n\u003ctable\u003e\n    \u003ctr\u003e\n        \u003cth\u003eVariable\u003c/th\u003e\n        \u003cth\u003eDescription\u003c/th\u003e\n        \u003cth\u003eDefault\u003c/th\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003eWORKSPACE_BASE_URL\u003c/td\u003e\n        \u003ctd\u003eThe base URL under which Jupyter and all other tools will be reachable.\u003c/td\u003e\n        \u003ctd\u003e/\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003eWORKSPACE_SSL_ENABLED\u003c/td\u003e\n        \u003ctd\u003eEnable or disable SSL. When set to true, either a certificate (cert.crt) must be mounted to \u003ccode\u003e/resources/ssl\u003c/code\u003e or, if not, the container generates a self-signed certificate.\u003c/td\u003e\n        \u003ctd\u003efalse\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003eWORKSPACE_AUTH_USER\u003c/td\u003e\n        \u003ctd\u003eBasic auth user name. To enable basic auth, both the user and password need to be set. We recommend using \u003ccode\u003eAUTHENTICATE_VIA_JUPYTER\u003c/code\u003e for securing the workspace.\u003c/td\u003e\n        \u003ctd\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003eWORKSPACE_AUTH_PASSWORD\u003c/td\u003e\n        \u003ctd\u003eBasic auth user password. To enable basic auth, both the user and password need to be set. 
We recommend using \u003ccode\u003eAUTHENTICATE_VIA_JUPYTER\u003c/code\u003e for securing the workspace.\u003c/td\u003e\n        \u003ctd\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003eWORKSPACE_PORT\u003c/td\u003e\n        \u003ctd\u003eConfigures the main container-internal port of the workspace proxy. For most scenarios, this configuration should not be changed, and the port configuration via Docker should be used instead if the workspace should be accessible from a different port.\u003c/td\u003e\n        \u003ctd\u003e8080\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003eCONFIG_BACKUP_ENABLED\u003c/td\u003e\n        \u003ctd\u003eAutomatically back up and restore user configuration to the persisted \u003ccode\u003e/workspace\u003c/code\u003e folder, such as the .ssh, .jupyter, or .gitconfig from the user's home directory.\u003c/td\u003e\n        \u003ctd\u003etrue\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003eSHARED_LINKS_ENABLED\u003c/td\u003e\n        \u003ctd\u003eEnable or disable the capability to share resources via external links. This is used to enable file sharing, access to workspace-internal ports, and easy command-based SSH setup. All shared links are protected via a token. 
However, there are certain risks since the token cannot be easily invalidated after sharing and does not expire.\u003c/td\u003e\n        \u003ctd\u003etrue\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003eINCLUDE_TUTORIALS\u003c/td\u003e\n        \u003ctd\u003eIf \u003ccode\u003etrue\u003c/code\u003e, a selection of tutorial and introduction notebooks is added to the \u003ccode\u003e/workspace\u003c/code\u003e folder at container startup, but only if the folder is empty.\u003c/td\u003e\n        \u003ctd\u003etrue\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003eMAX_NUM_THREADS\u003c/td\u003e\n        \u003ctd\u003eThe number of threads used for computations when using various common libraries (MKL, OPENBLAS, OMP, NUMBA, ...). You can also use \u003ccode\u003eauto\u003c/code\u003e to let the workspace dynamically determine the number of threads based on available CPU resources. This configuration can be overwritten by the user from within the workspace. Generally, it is good to set it at or below the number of CPUs available to the workspace.\u003c/td\u003e\n        \u003ctd\u003eauto\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd colspan=\"3\"\u003e\u003cb\u003eJupyter Configuration:\u003c/b\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003eSHUTDOWN_INACTIVE_KERNELS\u003c/td\u003e\n        \u003ctd\u003eAutomatically shut down inactive kernels after a given timeout (to clean up memory or GPU resources). 
Value can be either a timeout in seconds or set to \u003ccode\u003etrue\u003c/code\u003e with a default value of 48h.\u003c/td\u003e\n        \u003ctd\u003efalse\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003eAUTHENTICATE_VIA_JUPYTER\u003c/td\u003e\n        \u003ctd\u003eIf \u003ccode\u003etrue\u003c/code\u003e, all HTTP requests will be authenticated against the Jupyter server, meaning that the authentication method configured with Jupyter will be used for all other tools as well. This can be deactivated with \u003ccode\u003efalse\u003c/code\u003e. Any other value will activate this authentication and will be applied as the token via the NotebookApp.token configuration of Jupyter.\u003c/td\u003e\n        \u003ctd\u003efalse\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003eNOTEBOOK_ARGS\u003c/td\u003e\n        \u003ctd\u003eAdd and overwrite Jupyter configuration options via command line args. Refer to \u003ca href=\"https://jupyter-notebook.readthedocs.io/en/stable/config.html\"\u003ethis overview\u003c/a\u003e for all options.\u003c/td\u003e\n        \u003ctd\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n\u003c/table\u003e\n\n\u003c/details\u003e\n\n### Persist Data\n\nTo persist the data, you need to mount a volume into `/workspace` (via docker run option: `-v`).\n\n\u003cdetails\u003e\n\u003csummary\u003eDetails (click to expand...)\u003c/summary\u003e\n\nThe default work directory within the container is `/workspace`, which is also the root directory of the Jupyter instance. The `/workspace` directory is intended to be used for all the important work artifacts. Data within other directories of the server (e.g., `/root`) might get lost at container restarts.\n\u003c/details\u003e\n\n### Enable Authentication\n\nWe strongly recommend enabling authentication via one of the following two options. 
For both options, the user will be required to authenticate to access any of the pre-installed tools.\n\n\u003e _The authentication only works for all tools accessed through the main workspace port (default: `8080`). This works for all preinstalled tools and the [Access Ports](#access-ports) feature. If you expose another port of the container, please make sure to secure it with authentication as well!_\n\n\u003cdetails\u003e\n\u003csummary\u003eDetails (click to expand...)\u003c/summary\u003e\n\n#### Token-based Authentication via Jupyter (recommended)\n\nActivate the token-based authentication based on the authentication implementation of Jupyter via the `AUTHENTICATE_VIA_JUPYTER` variable:\n\n```bash\ndocker run -p 8080:8080 --env AUTHENTICATE_VIA_JUPYTER=\"mytoken\" mltooling/ml-workspace:0.13.2\n```\n\nYou can also use `\u003cgenerated\u003e` to let Jupyter generate a random token that is printed out on the container logs. A value of `true` will not set any token; instead, every request to any tool in the workspace will be checked against the Jupyter instance to verify that the user is authenticated. This is used for tools like JupyterHub, which configures its own way of authentication.\n\n#### Basic Authentication via Nginx\n\nActivate the basic authentication via the `WORKSPACE_AUTH_USER` and `WORKSPACE_AUTH_PASSWORD` variables:\n\n```bash\ndocker run -p 8080:8080 --env WORKSPACE_AUTH_USER=\"user\" --env WORKSPACE_AUTH_PASSWORD=\"pwd\" mltooling/ml-workspace:0.13.2\n```\n\nThe basic authentication is configured via the nginx proxy and might be more performant than the other option, since with `AUTHENTICATE_VIA_JUPYTER` every request to any tool in the workspace is checked via the Jupyter instance to see whether the user (based on the request cookies) is authenticated.\n\n\u003c/details\u003e\n\n### Enable SSL/HTTPS\n\nWe recommend enabling SSL so that the workspace is accessible via HTTPS (encrypted communication). 
SSL encryption can be activated via the `WORKSPACE_SSL_ENABLED` variable. \n\n\u003cdetails\u003e\n\u003csummary\u003eDetails (click to expand...)\u003c/summary\u003e\n\nWhen set to `true`, either the `cert.crt` and `cert.key` file must be mounted to `/resources/ssl` or, if the certificate files do not exist, the container generates self-signed certificates. For example, if the `/path/with/certificate/files` on the local system contains a valid certificate for the host domain (`cert.crt` and `cert.key` file), it can be used from the workspace as shown below:\n\n```bash\ndocker run \\\n    -p 8080:8080 \\\n    --env WORKSPACE_SSL_ENABLED=\"true\" \\\n    -v /path/with/certificate/files:/resources/ssl:ro \\\n    mltooling/ml-workspace:0.13.2\n```\n\nIf you want to host the workspace on a public domain, we recommend using [Let's Encrypt](https://letsencrypt.org/getting-started/) to get a trusted certificate for your domain. To use a certificate generated via the [certbot](https://certbot.eff.org/) tool for the workspace, the `privkey.pem` corresponds to the `cert.key` file and the `fullchain.pem` to the `cert.crt` file.\n\n\u003e _When you enable SSL support, you must access the workspace over `https://`, not over plain `http://`._\n\n\u003c/details\u003e\n\n### Limit Memory \u0026 CPU\n\nBy default, the workspace container has no resource constraints and can use as much of a given resource as the host’s kernel scheduler allows. 
Docker provides ways to control how much memory or CPU a container can use by setting runtime configuration flags of the docker run command.\n\n\u003e _The workspace requires at least 2 CPUs and 500 MB of memory to run stably and be usable._\n\n\u003cdetails\u003e\n\u003csummary\u003eDetails (click to expand...)\u003c/summary\u003e\n\nFor example, the following command restricts the workspace to only use a maximum of 8 CPUs, 16 GB of memory, and 1 GB of shared memory (see [Known Issues](#known-issues)):\n\n```bash\ndocker run -p 8080:8080 --cpus=8 --memory=16g --shm-size=1G mltooling/ml-workspace:0.13.2\n```\n\n\u003e 📖 _For more options and documentation on resource constraints, please refer to the [official docker guide](https://docs.docker.com/config/containers/resource_constraints/)._\n\n\u003c/details\u003e\n\n### Enable Proxy\n\nIf a proxy is required, you can pass the proxy configuration via the `HTTP_PROXY`, `HTTPS_PROXY`, and `NO_PROXY` environment variables.\n\n### Workspace Flavors\n\nIn addition to the main workspace image (`mltooling/ml-workspace`), we provide other image flavors that extend the features or minimize the image size to support a variety of use cases.\n\n#### Minimal Flavor\n\n\u003cp\u003e\n\u003ca href=\"https://hub.docker.com/r/mltooling/ml-workspace\" title=\"Docker Image Version\"\u003e\u003cimg src=\"https://img.shields.io/docker/v/mltooling/ml-workspace?color=blue\u0026sort=semver\"\u003e\u003c/a\u003e\n\u003ca href=\"https://hub.docker.com/r/mltooling/ml-workspace-minimal\" title=\"Docker Image Size\"\u003e\u003cimg src=\"https://img.shields.io/docker/image-size/mltooling/ml-workspace-minimal?color=blue\u0026sort=semver\"\u003e\u003c/a\u003e\n\u003ca href=\"https://hub.docker.com/r/mltooling/ml-workspace-minimal\" title=\"Docker Pulls\"\u003e\u003cimg src=\"https://img.shields.io/docker/pulls/mltooling/ml-workspace-minimal.svg\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eDetails (click to 
expand...)\u003c/summary\u003e\n\nThe minimal flavor (`mltooling/ml-workspace-minimal`) is our smallest image that contains most of the tools and features described in the [features section](#features) without most of the Python libraries that are pre-installed in our main image. Any Python library or excluded tool can be installed manually at runtime by the user.\n\n```bash\ndocker run -p 8080:8080 mltooling/ml-workspace-minimal:0.13.2\n```\n\u003c/details\u003e\n\n#### R Flavor\n\n\u003cp\u003e\n\u003ca href=\"https://hub.docker.com/r/mltooling/ml-workspace-r\" title=\"Docker Image Version\"\u003e\u003cimg src=\"https://img.shields.io/docker/v/mltooling/ml-workspace-r?color=blue\u0026sort=semver\"\u003e\u003c/a\u003e\n\u003ca href=\"https://hub.docker.com/r/mltooling/ml-workspace-r\" title=\"Docker Image Size\"\u003e\u003cimg src=\"https://img.shields.io/docker/image-size/mltooling/ml-workspace-r?color=blue\u0026sort=semver\"\u003e\u003c/a\u003e\n\u003ca href=\"https://hub.docker.com/r/mltooling/ml-workspace-r\" title=\"Docker Pulls\"\u003e\u003cimg src=\"https://img.shields.io/docker/pulls/mltooling/ml-workspace-r.svg\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eDetails (click to expand...)\u003c/summary\u003e\n\nThe R flavor (`mltooling/ml-workspace-r`) is based on our default workspace image and extends it with the R interpreter, R Jupyter kernel, RStudio server (access via `Open Tool -\u003e RStudio`), and a variety of popular packages from the R ecosystem.\n\n```bash\ndocker run -p 8080:8080 mltooling/ml-workspace-r:0.12.1\n```\n\u003c/details\u003e\n\n#### Spark Flavor\n\n\u003cp\u003e\n\u003ca href=\"https://hub.docker.com/r/mltooling/ml-workspace-spark\" title=\"Docker Image Version\"\u003e\u003cimg src=\"https://img.shields.io/docker/v/mltooling/ml-workspace-spark?color=blue\u0026sort=semver\"\u003e\u003c/a\u003e\n\u003ca href=\"https://hub.docker.com/r/mltooling/ml-workspace-spark\" title=\"Docker Image 
Size\"\u003e\u003cimg src=\"https://img.shields.io/docker/image-size/mltooling/ml-workspace-spark?color=blue\u0026sort=semver\"\u003e\u003c/a\u003e\n\u003ca href=\"https://hub.docker.com/r/mltooling/ml-workspace-spark\" title=\"Docker Pulls\"\u003e\u003cimg src=\"https://img.shields.io/docker/pulls/mltooling/ml-workspace-spark.svg\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eDetails (click to expand...)\u003c/summary\u003e\n\nThe Spark flavor (`mltooling/ml-workspace-spark`) is based on our R-flavor workspace image and extends it with the Spark runtime, Spark-Jupyter kernel, Zeppelin Notebook (access via `Open Tool -\u003e Zeppelin`), PySpark, Hadoop, Java Kernel, and a few additional libraries \u0026 Jupyter extensions.\n\n```bash\ndocker run -p 8080:8080 mltooling/ml-workspace-spark:0.12.1\n```\n\n\u003c/details\u003e\n\n#### GPU Flavor\n\n\u003cp\u003e\n\u003ca href=\"https://hub.docker.com/r/mltooling/ml-workspace-gpu\" title=\"Docker Image Version\"\u003e\u003cimg src=\"https://img.shields.io/docker/v/mltooling/ml-workspace-gpu?color=blue\u0026sort=semver\"\u003e\u003c/a\u003e\n\u003ca href=\"https://hub.docker.com/r/mltooling/ml-workspace-gpu\" title=\"Docker Image Size\"\u003e\u003cimg src=\"https://img.shields.io/docker/image-size/mltooling/ml-workspace-gpu?color=blue\u0026sort=semver\"\u003e\u003c/a\u003e\n\u003ca href=\"https://hub.docker.com/r/mltooling/ml-workspace-gpu\" title=\"Docker Pulls\"\u003e\u003cimg src=\"https://img.shields.io/docker/pulls/mltooling/ml-workspace-gpu.svg\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eDetails (click to expand...)\u003c/summary\u003e\n\n\u003e _Currently, the GPU-flavor only supports CUDA 11.2. 
Support for other CUDA versions might be added in the future._\n\nThe GPU flavor (`mltooling/ml-workspace-gpu`) is based on our default workspace image and extends it with CUDA 11.2 and GPU-ready versions of various machine learning libraries (e.g., tensorflow, pytorch, cntk, jax). This GPU image has the following additional requirements for the system:\n\n- Nvidia Drivers for the GPUs. Drivers need to be CUDA 11.2 compatible, version `\u003e=460.32.03` ([📖 Instructions](https://github.com/NVIDIA/nvidia-docker/wiki/Frequently-Asked-Questions#how-do-i-install-the-nvidia-driver)).\n- (Docker \u003e= 19.03) Nvidia Container Toolkit ([📖 Instructions](https://github.com/NVIDIA/nvidia-docker/wiki/Installation-(Native-GPU-Support))).\n\n```bash\ndocker run -p 8080:8080 --gpus all mltooling/ml-workspace-gpu:0.13.2\n```\n\n- (Docker \u003c 19.03) Nvidia Docker 2.0 ([📖 Instructions](https://github.com/NVIDIA/nvidia-docker/wiki/Installation-(version-2.0))).\n\n```bash\ndocker run -p 8080:8080 --runtime nvidia --env NVIDIA_VISIBLE_DEVICES=\"all\" mltooling/ml-workspace-gpu:0.13.2\n```\n\nThe GPU flavor also comes with a few additional configuration options, as explained below:\n\n\u003ctable\u003e\n    \u003ctr\u003e\n        \u003cth\u003eVariable\u003c/th\u003e\n        \u003cth\u003eDescription\u003c/th\u003e\n        \u003cth\u003eDefault\u003c/th\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003eNVIDIA_VISIBLE_DEVICES\u003c/td\u003e\n        \u003ctd\u003eControls which GPUs will be accessible inside the workspace. By default, all GPUs from the host are accessible within the workspace. You can either use \u003ccode\u003eall\u003c/code\u003e, \u003ccode\u003enone\u003c/code\u003e, or specify a comma-separated list of device IDs (e.g., \u003ccode\u003e0,1\u003c/code\u003e). 
You can find out the list of available device IDs by running \u003ccode\u003envidia-smi\u003c/code\u003e on the host machine.\u003c/td\u003e\n        \u003ctd\u003eall\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003eCUDA_VISIBLE_DEVICES\u003c/td\u003e\n        \u003ctd\u003eControls which GPUs CUDA applications running inside the workspace will see. By default, all GPUs that the workspace has access to will be visible. To restrict applications, provide a comma-separated list of internal device IDs (e.g., \u003ccode\u003e0,2\u003c/code\u003e) based on the available devices within the workspace (run \u003ccode\u003envidia-smi\u003c/code\u003e). In comparison to \u003ccode\u003eNVIDIA_VISIBLE_DEVICES\u003c/code\u003e, the workspace user will still be able to access other GPUs by overwriting this configuration from within the workspace.\u003c/td\u003e\n        \u003ctd\u003e\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003eTF_FORCE_GPU_ALLOW_GROWTH\u003c/td\u003e\n        \u003ctd\u003eBy default, the majority of GPU memory will be allocated by the first execution of a TensorFlow graph. While this behavior can be desirable for production pipelines, it is less desirable for interactive use. Use \u003ccode\u003etrue\u003c/code\u003e to enable dynamic GPU memory allocation or \u003ccode\u003efalse\u003c/code\u003e to instruct TensorFlow to allocate all memory at execution.\u003c/td\u003e\n        \u003ctd\u003etrue\u003c/td\u003e\n    \u003c/tr\u003e\n\u003c/table\u003e\n\u003c/details\u003e\n\n### Multi-user setup\n\nThe workspace is designed as a single-user development environment. For a multi-user setup, we recommend deploying [🧰 ML Hub](https://github.com/ml-tooling/ml-hub). 
ML Hub is based on JupyterHub with the task of spawning, managing, and proxying workspace instances for multiple users.\n\n\u003cdetails\u003e\n\u003csummary\u003eDeployment (click to expand...)\u003c/summary\u003e\n\nML Hub makes it easy to set up a multi-user environment on a single server (via Docker) or a cluster (via Kubernetes) and supports a variety of usage scenarios \u0026 authentication providers. You can try out ML Hub via:\n\n```bash\ndocker run -p 8080:8080 -v /var/run/docker.sock:/var/run/docker.sock mltooling/ml-hub:latest\n```\n\nFor more information and documentation about ML Hub, please take a look at the [GitHub site](https://github.com/ml-tooling/ml-hub).\n\n\u003c/details\u003e\n\n---\n\n\u003cbr\u003e\n\n## Support\n\nThis project is maintained by [Benjamin Räthlein](https://twitter.com/raethlein), [Lukas Masuch](https://twitter.com/LukasMasuch), and [Jan Kalkan](https://www.linkedin.com/in/jan-kalkan-b5390284/). Please understand that we won't be able to provide individual support via email. 
We also believe that help is much more valuable if it's shared publicly so that more people can benefit from it.\n\n| Type                     | Channel                                              |\n| ------------------------ | ------------------------------------------------------ |\n| 🚨\u0026nbsp; **Bug Reports**       | \u003ca href=\"https://github.com/ml-tooling/ml-workspace/issues?utf8=%E2%9C%93\u0026q=is%3Aopen+is%3Aissue+label%3Abug+sort%3Areactions-%2B1-desc+\" title=\"Open Bug Report\"\u003e\u003cimg src=\"https://img.shields.io/github/issues/ml-tooling/ml-workspace/bug.svg\"\u003e\u003c/a\u003e                                  |\n| 🎁\u0026nbsp; **Feature Requests**  | \u003ca href=\"https://github.com/ml-tooling/ml-workspace/issues?q=is%3Aopen+is%3Aissue+label%3Afeature+sort%3Areactions-%2B1-desc\" title=\"Open Feature Request\"\u003e\u003cimg src=\"https://img.shields.io/github/issues/ml-tooling/ml-workspace/feature.svg?label=feature%20request\"\u003e\u003c/a\u003e                                 |\n| 👩‍💻\u0026nbsp; **Usage Questions**   |  \u003ca href=\"https://github.com/ml-tooling/ml-workspace/issues?q=is%3Aopen+is%3Aissue+label%3Asupport+sort%3Areactions-%2B1-desc\" title=\"Open Support Request\"\u003e \u003cimg src=\"https://img.shields.io/github/issues/ml-tooling/ml-workspace/support.svg?label=support%20request\"\u003e\u003c/a\u003e \u003ca href=\"https://stackoverflow.com/questions/tagged/ml-tooling\" title=\"Open Question on Stackoverflow\"\u003e \u003cimg src=\"https://img.shields.io/badge/stackoverflow-ml--tooling-orange.svg\"\u003e\u003c/a\u003e \u003ca href=\"https://gitter.im/ml-tooling/ml-workspace\" title=\"Chat on Gitter\"\u003e\u003cimg src=\"https://badges.gitter.im/ml-tooling/ml-workspace.svg\"\u003e\u003c/a\u003e |\n| 📢\u0026nbsp; **Announcements** | \u003ca href=\"https://gitter.im/ml-tooling/ml-workspace\" title=\"Chat on Gitter\"\u003e\u003cimg src=\"https://badges.gitter.im/ml-tooling/ml-workspace.svg\"\u003e\u003c/a\u003e 
\u003ca href=\"https://mltooling.substack.com/subscribe\" title=\"Subscribe for updates\"\u003e\u003cimg src=\"http://bit.ly/2Md9rxM\"\u003e\u003c/a\u003e \u003ca href=\"https://twitter.com/mltooling\" title=\"ML Tooling on Twitter\"\u003e\u003cimg src=\"https://img.shields.io/twitter/follow/mltooling.svg?style=social\u0026label=Follow\"\u003e |\n| ❓\u0026nbsp; **Other Requests** | \u003ca href=\"mailto:team@mltooling.org\" title=\"Email ML Tooling Team\"\u003e\u003cimg src=\"https://img.shields.io/badge/email-ML Tooling-green?logo=mail.ru\u0026logoColor=white\"\u003e\u003c/a\u003e |\n\n---\n\n\u003cbr\u003e\n\n## Features\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"#jupyter\"\u003eJupyter\u003c/a\u003e •\n  \u003ca href=\"#desktop-gui\"\u003eDesktop GUI\u003c/a\u003e •\n  \u003ca href=\"#visual-studio-code\"\u003eVS Code\u003c/a\u003e •\n  \u003ca href=\"#jupyterlab\"\u003eJupyterLab\u003c/a\u003e •\n  \u003ca href=\"#git-integration\"\u003eGit Integration\u003c/a\u003e •\n  \u003ca href=\"#file-sharing\"\u003eFile Sharing\u003c/a\u003e •\n  \u003ca href=\"#access-ports\"\u003eAccess Ports\u003c/a\u003e •\n  \u003ca href=\"#tensorboard\"\u003eTensorboard\u003c/a\u003e •\n  \u003ca href=\"#extensibility\"\u003eExtensibility\u003c/a\u003e •\n  \u003ca href=\"#hardware-monitoring\"\u003eHardware Monitoring\u003c/a\u003e •\n  \u003ca href=\"#ssh-access\"\u003eSSH Access\u003c/a\u003e •\n  \u003ca href=\"#remote-development\"\u003eRemote Development\u003c/a\u003e •\n  \u003ca href=\"#run-as-a-job\"\u003eJob Execution\u003c/a\u003e\n\u003c/p\u003e\n\nThe workspace is equipped with a selection of best-in-class open-source development tools to help with the machine learning workflow. 
Many of these tools can be started via the `Open Tool` menu in Jupyter (the main application of the workspace):

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/open-tools.png"/>

> _Within your workspace you have **full root & sudo privileges** to install any library or tool you need via terminal (e.g., `pip`, `apt-get`, `conda`, or `npm`). You can find more ways to extend the workspace in the [Extensibility](#extensibility) section._

### Jupyter

[Jupyter Notebook](https://jupyter.org/) is a web-based interactive environment for writing and running code. The main building blocks of Jupyter are the file-browser, the notebook editor, and kernels. The file-browser provides an interactive file manager for all notebooks, files, and folders in the `/workspace` directory.

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/jupyter-tree.png"/>

A new notebook can be created by clicking on the `New` drop-down button at the top of the list and selecting the desired language kernel.

> _You can spawn interactive **terminal** instances as well by selecting `New -> Terminal` in the file-browser._

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/jupyter-notebook.png"/>

The notebook editor enables users to author documents that include live code, markdown text, shell commands, LaTeX equations, interactive widgets, plots, and images. These notebook documents provide a complete and self-contained record of a computation that can be converted to various formats and shared with others.

> _This workspace has a variety of **third-party Jupyter extensions** activated.
You can configure these extensions in the nbextensions configurator: the `nbextensions` tab in the file-browser._

The Notebook allows code to be run in a range of different programming languages. For each notebook document that a user opens, the web application starts a **kernel** that runs the code for that notebook and returns output. This workspace has a Python 3 kernel pre-installed. Additional kernels can be installed to get access to other languages (e.g., R, Scala, Go) or additional computing resources (e.g., GPUs, CPUs, memory).

> _**Python 2** is deprecated, and we do not recommend using it. However, you can still install a Python 2.7 kernel via this command: `/bin/bash /resources/tools/python-27.sh`_

### Desktop GUI

This workspace provides HTTP-based VNC access via [noVNC](https://github.com/novnc/noVNC). This way, you can access and work within the workspace with a fully-featured desktop GUI. To access this desktop GUI, go to `Open Tool`, select `VNC`, and click the `Connect` button. If you are asked for a password, use `vncpassword`.

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/desktop-vnc.png"/>

Once you are connected, you will see a desktop GUI that allows you to install and use full-fledged web browsers or any other tool that is available for Ubuntu.
Within the `Tools` folder on the desktop, you will find a collection of install scripts that make it straightforward to install some of the most commonly used development tools, such as Atom, PyCharm, R-Runtime, R-Studio, or Postman (just double-click on the script).

**Clipboard:** If you want to share the clipboard between your machine and the workspace, you can use the copy-paste functionality as described below:

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/desktop-vnc-clipboard.png"/>

> 💡 _**Long-running tasks:** Use the desktop GUI for long-running Jupyter executions. By running notebooks from the browser of your workspace desktop GUI, all output will be synchronized to the notebook even if you have disconnected your browser from the notebook._

### Visual Studio Code

[Visual Studio Code](https://github.com/microsoft/vscode) (`Open Tool -> VS Code`) is an open-source, lightweight, but powerful code editor with built-in support for a variety of languages and a rich ecosystem of extensions. It combines the simplicity of a source code editor with powerful developer tooling, like IntelliSense code completion and debugging. The workspace integrates VS Code as a web-based application accessible through the browser, based on the awesome [code-server](https://github.com/cdr/code-server) project.
It allows you to customize every feature to your liking and install any number of third-party extensions.

<p align="center"><img src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/vs-code.png"/></p>

The workspace also provides a VS Code integration into Jupyter, allowing you to open a VS Code instance for any selected folder, as shown below:

<p align="center"><img src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/vs-code-open.png"/></p>

### JupyterLab

[JupyterLab](https://github.com/jupyterlab/jupyterlab) (`Open Tool -> JupyterLab`) is the next-generation user interface for Project Jupyter. It offers all the familiar building blocks of the classic Jupyter Notebook (notebook, terminal, text editor, file browser, rich outputs, etc.) in a flexible and powerful user interface. This JupyterLab instance comes pre-installed with a few helpful extensions, such as [jupyterlab-toc](https://github.com/jupyterlab/jupyterlab-toc), [jupyterlab-git](https://github.com/jupyterlab/jupyterlab-git), and [jupyterlab-tensorboard](https://github.com/chaoleili/jupyterlab_tensorboard).

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/jupyterlab.png"/>

### Git Integration

Version control is a crucial aspect of productive collaboration. To make this process as smooth as possible, we have integrated a custom-made Jupyter extension specialized in pushing single notebooks, a full-fledged web-based Git client ([ungit](https://github.com/FredrikNoren/ungit)), a tool to open and edit plain text documents (e.g., `.py`, `.md`) as notebooks ([jupytext](https://github.com/mwouts/jupytext)), as well as a notebook merging tool ([nbdime](https://github.com/jupyter/nbdime)).
Additionally, JupyterLab and VS Code also provide GUI-based Git clients.

#### Clone Repository

For cloning repositories via `https`, we recommend navigating to the desired root folder and clicking on the `git` button as shown below:

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/git-open.png"/>

This may ask for some required settings and subsequently opens [ungit](https://github.com/FredrikNoren/ungit), a web-based Git client with a clean and intuitive UI that makes it convenient to sync your code artifacts. Within ungit, you can clone any repository. If authentication is required, you will be asked for your credentials.

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/git-ungit-credentials.png"/>

#### Push, Pull, Merge, and Other Git Actions

To commit and push a single notebook to a remote Git repository, we recommend using the Git plugin integrated into Jupyter, as shown below:

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/git-push-notebook.png"/>

For more advanced Git operations, we recommend using [ungit](https://github.com/FredrikNoren/ungit). With ungit, you can perform most of the common Git actions, such as push, pull, merge, branch, tag, checkout, and many more.

#### Diffing and Merging Notebooks

Jupyter notebooks are great, but they are often huge files with a very specific JSON file format. To enable seamless diffing and merging via Git, this workspace comes pre-installed with [nbdime](https://github.com/jupyter/nbdime). Nbdime understands the structure of notebook documents and, therefore, automatically makes intelligent decisions when diffing and merging notebooks.
If you run into merge conflicts, nbdime will make sure that the notebook is still readable by Jupyter, as shown below:

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/git-nbdime-merging.png"/>

Furthermore, the workspace comes pre-installed with [jupytext](https://github.com/mwouts/jupytext), a Jupyter plugin that reads and writes notebooks as plain text files. This allows you to open, edit, and run scripts or markdown files (e.g., `.py`, `.md`) as notebooks within Jupyter. In the following screenshot, we have opened a markdown file via Jupyter:

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/git-jupytext.png"/>

In combination with Git, jupytext enables a clear diff history and easy merging of version conflicts. With both of these tools, collaborating on Jupyter notebooks with Git becomes straightforward.

### File Sharing

The workspace has a feature to share any file or folder with anyone via a token-protected link.
To share data via a link, select any file or folder from the Jupyter directory tree and click on the share button as shown in the following screenshot:

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/file-sharing-open.png"/>

This will generate a unique link protected via a token that gives anyone with the link access to view and download the selected data via the [Filebrowser](https://github.com/filebrowser/filebrowser) UI:

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/file-sharing-filebrowser.png"/>

To deactivate or manage (e.g., provide edit permissions) shared links, open the Filebrowser via `Open Tool -> Filebrowser` and select `Settings -> User Management`.

### Access Ports

It is possible to securely access any workspace-internal port by selecting `Open Tool -> Access Port`. With this feature, you are able to access a REST API or web application running inside the workspace directly with your browser. The feature enables developers to build, run, test, and debug REST APIs or web applications directly from the workspace.

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/access-port.png"/>

If you want to use an HTTP client or share access to a given port, you can select the `Get shareable link` option. This generates a token-secured link that anyone with access to the link can use to access the specified port.

> _The HTTP application needs to be servable from a relative URL path or allow configuring a base path (`/tools/PORT/`). Tools made accessible this way are secured by the workspace's authentication system!
If you decide to publish any other port of the container yourself instead of using this feature to make a tool accessible, please make sure to secure it via an authentication mechanism!_

<details>

<summary>Example (click to expand...)</summary>

1. Start an HTTP server on port `1234` by running this command in a terminal within the workspace: `python -m http.server 1234`
2. Select `Open Tool -> Access Port`, input port `1234`, and select the `Get shareable link` option.
3. Click `Access`, and you will see the content provided by Python's `http.server`.
4. The opened link can also be shared with other people or called from external applications (e.g., try it in Incognito Mode in Chrome).

</details>

### SSH Access

SSH provides a powerful set of features that enables you to be more productive with your development tasks. You can easily set up a secure and passwordless SSH connection to a workspace by selecting `Open Tool -> SSH`. This will generate a secure setup command that can be run on any Linux or Mac machine to configure a passwordless & secure SSH connection to the workspace. Alternatively, you can also download the setup script and run it (instead of using the command).

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/ssh-access.png"/>

> _The setup script only runs on Mac and Linux. Windows is currently not supported._

Just run the setup command or script on the machine from which you want to connect to the workspace and input a name for the connection (e.g., `my-workspace`). You may also be asked for some additional input during the process, e.g., to install a remote kernel if `remote_ikernel` is installed.
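The setup script works by adding a host entry to your local SSH configuration. As a rough illustration only (the host alias, address, user, and key path below are hypothetical; the exact fields depend on your deployment and on what the script generates), the resulting entry in `~/.ssh/config` could look similar to:

```
# Hypothetical entry created by the workspace SSH setup script
Host my-workspace
    HostName workspace.example.com    # address of the machine hosting the workspace
    User root                         # example user; the actual user depends on the setup
    IdentityFile ~/.ssh/my-workspace  # private key generated during setup
```

With such an entry in place, the alias resolves without any further flags, which is what makes the plain `ssh my-workspace` invocation work.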
Once the passwordless SSH connection is successfully set up and tested, you can securely connect to the workspace by simply executing `ssh my-workspace`.

Besides the ability to execute commands on a remote machine, SSH also provides a variety of other features that can improve your development workflow, as described in the following sections.

<details>
<summary><b>Tunnel Ports</b> (click to expand...)</summary>

An SSH connection can be used for tunneling application ports from the remote machine to the local machine, or vice versa. For example, you can expose the workspace-internal port `5901` (VNC server) to the local machine on port `5000` by executing:

```bash
ssh -nNT -L 5000:localhost:5901 my-workspace
```

> _To expose an application port from your local machine to a workspace, use the `-R` option (instead of `-L`)._

After the tunnel is established, you can use your favorite VNC viewer on your local machine and connect to `vnc://localhost:5000` (default password: `vncpassword`). To make the tunnel connection more resilient, we recommend using [autossh](https://www.harding.motd.ca/autossh/) to automatically restart SSH tunnels in case the connection dies:

```bash
autossh -M 0 -f -nNT -L 5000:localhost:5901 my-workspace
```

Port tunneling is quite useful when you have started a server-based tool within the workspace that you would like to make accessible to another machine.
In its default setting, the workspace has a variety of tools already running on different ports, such as:\n\n- `8080`: Main workspace port with access to all integrated tools.\n- `8090`: Jupyter server.\n- `8054`: VS Code server.\n- `5901`: VNC server.\n- `22`: SSH server.\n\nYou can find port information on all the tools in the [supervisor configuration](https://github.com/ml-tooling/ml-workspace/blob/main/resources/supervisor/supervisord.conf).\n\n\u003e 📖 _For more information about port tunneling/forwarding, we recommend [this guide](https://www.everythingcli.org/ssh-tunnelling-for-fun-and-profit-local-vs-remote/)._\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eCopy Data via SCP\u003c/b\u003e (click to expand...)\u003c/summary\u003e\n\n[SCP](https://linux.die.net/man/1/scp) allows files and directories to be securely copied to, from, or between different machines via SSH connections. For example, to copy a local file (`./local-file.txt`) into the `/workspace` folder inside the workspace, execute:\n\n```bash\nscp ./local-file.txt my-workspace:/workspace\n```\n\nTo copy the `/workspace` directory from `my-workspace` to the working directory of the local machine, execute:\n\n```bash\nscp -r my-workspace:/workspace .\n```\n\n\u003e 📖 _For more information about scp, we recommend [this guide](https://www.garron.me/en/articles/scp.html)._\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eSync Data via Rsync\u003c/b\u003e (click to expand...)\u003c/summary\u003e\n\n[Rsync](https://linux.die.net/man/1/rsync) is a utility for efficiently transferring and synchronizing files between different machines (e.g., via SSH connections) by comparing the modification times and sizes of files. The rsync command will determine which files need to be updated each time it is run, which is far more efficient and convenient than using something like scp or sftp. 
For example, to sync all content of a local folder (`./local-project-folder/`) into the `/workspace/remote-project-folder/` folder inside the workspace, execute:\n\n```bash\nrsync -rlptzvP --delete --exclude=\".git\" \"./local-project-folder/\" \"my-workspace:/workspace/remote-project-folder/\"\n```\n\nIf you have some changes inside the folder on the workspace, you can sync those changes back to the local folder by changing the source and destination arguments:\n\n```bash\nrsync -rlptzvP --delete --exclude=\".git\" \"my-workspace:/workspace/remote-project-folder/\" \"./local-project-folder/\"\n```\n\nYou can rerun these commands each time you want to synchronize the latest copy of your files. Rsync will make sure that only updates will be transferred.\n\n\u003e 📖 _You can find more information about rsync on [this man page](https://linux.die.net/man/1/rsync)._\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eMount Folders via SSHFS\u003c/b\u003e (click to expand...)\u003c/summary\u003e\n\nBesides copying and syncing data, an SSH connection can also be used to mount directories from a remote machine into the local filesystem via [SSHFS](https://github.com/libfuse/sshfs). \nFor example, to mount the `/workspace` directory of `my-workspace` into a local path (e.g. 
`/local/folder/path`), execute:

```bash
sshfs -o reconnect my-workspace:/workspace /local/folder/path
```

Once the remote directory is mounted, you can interact with the remote filesystem the same way as with any local directory and file.

> 📖 _For more information about sshfs, we recommend [this guide](https://www.digitalocean.com/community/tutorials/how-to-use-sshfs-to-mount-remote-file-systems-over-ssh)._
</details>

### Remote Development

The workspace can be integrated and used as a remote runtime (also known as remote kernel/machine/interpreter) for a variety of popular development tools and IDEs, such as Jupyter, VS Code, PyCharm, Colab, or Atom Hydrogen. This way, you can connect your favorite development tool running on your local machine to a remote machine for code execution. This enables a local-quality development experience with remote-hosted compute resources.

These integrations usually require a passwordless SSH connection from the local machine to the workspace. To set up an SSH connection, please follow the steps explained in the [SSH Access](#ssh-access) section.

<details>
<summary><b>Jupyter - Remote Kernel</b> (click to expand...)</summary>

The workspace can be added to a Jupyter instance as a remote kernel by using the [remote_ikernel](https://bitbucket.org/tdaff/remote_ikernel/) tool. If you have installed remote_ikernel (`pip install remote_ikernel`) on your local machine, the SSH setup script of the workspace will automatically offer you the option to set up a remote kernel connection.

> _When running kernels on remote machines, the notebooks themselves will be saved onto the local filesystem, but the kernel will only have access to the filesystem of the remote machine running the kernel.
If you need to sync data, you can make use of rsync, scp, or sshfs as explained in the [SSH Access](#ssh-access) section._

In case you want to manually set up and manage remote kernels, use the [remote_ikernel](https://bitbucket.org/tdaff/remote_ikernel/src/default/README.rst) command-line tool, as shown below:

```bash
# Replace my-workspace with the name of a workspace SSH connection
remote_ikernel manage --add \
    --interface=ssh \
    --kernel_cmd="ipython kernel -f {connection_file}" \
    --name="ml-server (Python)" \
    --host="my-workspace"
```

You can use the remote_ikernel command-line functionality to list (`remote_ikernel manage --show`) or delete (`remote_ikernel manage --delete <REMOTE_KERNEL_NAME>`) remote kernel connections.

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/remote-dev-jupyter-kernel.png"/>

</details>

<details>
<summary><b>VS Code - Remote Machine</b> (click to expand...)</summary>

The Visual Studio Code [Remote - SSH](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-ssh) extension allows you to open a remote folder on any remote machine with SSH access and work with it just as you would if the folder were on your own machine. Once connected to a remote machine, you can interact with files and folders anywhere on the remote filesystem and take full advantage of VS Code's feature set (IntelliSense, debugging, and extension support). The extension discovers and works out of the box with passwordless SSH connections as configured by the workspace SSH setup script. To enable your local VS Code application to connect to a workspace:

1. Install the [Remote - SSH](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-ssh) extension inside your local VS Code.
2. 
Run the SSH setup script of a selected workspace as explained in the [SSH Access](#ssh-access) section.
3. Open the Remote-SSH panel in your local VS Code. All configured SSH connections should be automatically discovered. Just select any configured workspace connection you would like to connect to, as shown below:

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/remote-dev-vscode.gif"/>

> 📖 _You can find additional features and information about the Remote SSH extension in [this guide](https://code.visualstudio.com/docs/remote/ssh)._

</details>

### Tensorboard

[Tensorboard](https://www.tensorflow.org/tensorboard) provides a suite of visualization tools to make it easier to understand, debug, and optimize your experiment runs. It includes logging features for scalar, histogram, model structure, embeddings, and text & image visualization. The workspace comes pre-installed with the [jupyter_tensorboard extension](https://github.com/lspvic/jupyter_tensorboard) that integrates Tensorboard into the Jupyter interface with functionality to start, manage, and stop instances. You can open a new instance for a valid logs directory, as shown below:

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/tensorboard-open.png" />

If you have opened a Tensorboard instance for a valid log directory, you will see the visualizations of your logged data:

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/tensorboard-dashboard.png" />

> _Tensorboard can be used in combination with many other ML frameworks besides Tensorflow. By using the [tensorboardX](https://github.com/lanpa/tensorboardX) library, you can log from virtually any Python-based library.
Also, PyTorch has a direct Tensorboard integration as described [here](https://pytorch.org/docs/stable/tensorboard.html)._

If you prefer to see Tensorboard directly within your notebook, you can make use of the following **Jupyter magic**:

```
%load_ext tensorboard
%tensorboard --logdir /workspace/path/to/logs
```

### Hardware Monitoring

The workspace provides two pre-installed web-based tools that help developers get insights into everything happening on the system and identify performance bottlenecks during model training and other experimentation tasks.

[Netdata](https://github.com/netdata/netdata) (`Open Tool -> Netdata`) is a real-time hardware and performance monitoring dashboard that visualizes the processes and services on your Linux system. It monitors metrics about CPU, GPU, memory, disks, networks, processes, and more.

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/hardware-monitoring-netdata.png" />

[Glances](https://github.com/nicolargo/glances) (`Open Tool -> Glances`) is another web-based hardware monitoring dashboard and can be used as an alternative to Netdata.

<img style="width: 100%" src="https://github.com/ml-tooling/ml-workspace/raw/main/docs/images/features/hardware-monitoring-glances.png"/>

> _Netdata and Glances will show you the hardware statistics for the entire machine on which the workspace container is running._

### Run as a job

> _A job is defined as any computational task that runs for a certain time to completion, such as a model training or a data pipeline._

The workspace image can also be used to execute arbitrary Python code without starting any of the pre-installed tools.
This provides a seamless way to productize your ML projects, since the code that has been developed interactively within the workspace will have the same environment and configuration when run as a job via the same workspace image.

<details>
<summary><b>Run Python code as a job via the workspace image</b> (click to expand...)</summary>

To run Python code as a job, you need to provide a path or URL to a code directory (or script) via `EXECUTE_CODE`. The code can either be already mounted into the workspace container or downloaded from a version control system (e.g., git or svn) as described in the following sections. The selected code path needs to be executable by Python. In case the selected code is a directory (e.g., whenever you download the code from a VCS), you need to put a `__main__.py` file at the root of this directory. The `__main__.py` file needs to contain the code that starts your job.

#### Run code from a version control system

You can execute code directly from Git, Mercurial, Subversion, or Bazaar by using the pip VCS format as described in [this guide](https://pip.pypa.io/en/stable/reference/pip_install/#vcs-support).
For example, to execute code from a [subdirectory](https://github.com/ml-tooling/ml-workspace/tree/main/resources/tests/ml-job) of a git repository, just run:

```bash
docker run --env EXECUTE_CODE="git+https://github.com/ml-tooling/ml-workspace.git#subdirectory=resources/tests/ml-job" mltooling/ml-workspace:0.13.2
```

> 📖 _For additional information on how to specify branches, commits, or tags, please refer to [this guide](https://pip.pypa.io/en/stable/reference/pip_install/#vcs-support)._

#### Run code mounted into the workspace

In the following example, we mount the current working directory (expected to contain our code) into the `/workspace/ml-job/` directory of the workspace and execute it:

```bash
docker run -v "${PWD}:/workspace/ml-job/" --env EXECUTE_CODE="/workspace/ml-job/" mltooling/ml-workspace:0.13.2
```

#### Install Dependencies

In case the pre-installed workspace libraries are not compatible with your code, you can install or change dependencies by adding one or more of the following files to your code directory:

- `requirements.txt`: [pip requirements format](https://pip.pypa.io/en/stable/user_guide/#requirements-files) for pip-installable dependencies.
- `environment.yml`: [conda environment file](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html?highlight=environment.yml#creating-an-environment-file-manually) to create a separate Python environment.
- `setup.sh`: A shell script executed via `/bin/bash`.

The execution order is 1. `environment.yml` -> 2. `setup.sh` -> 3. 
`requirements.txt`

#### Test job in interactive mode

You can test your job code within the workspace (started normally with interactive tools) by executing the following Python script:

```bash
python /resources/scripts/execute_code.py /path/to/your/job
```

#### Build a custom job image

It is also possible to embed your code directly into a custom job image, as shown below:

```dockerfile
FROM mltooling/ml-workspace:0.13.2

# Add job code to image
COPY ml-job /workspace/ml-job
ENV EXECUTE_CODE=/workspace/ml-job

# Install requirements only
RUN python /resources/scripts/execute_code.py --requirements-only

# Execute only the code at container startup
CMD ["python", "/resources/docker-entrypoint.py", "--code-only"]
```

</details>

### Pre-installed Libraries and Interpreters

The workspace comes pre-installed with many popular interpreters, data science libraries, and Ubuntu packages:

- **Interpreters:** Python 3.8 (Miniconda 3), NodeJS 14, Scala, Perl 5
- **Python libraries:** Tensorflow, Keras, Pytorch, Sklearn, XGBoost, MXNet, Theano, and [many more](https://github.com/ml-tooling/ml-workspace/tree/main/resources/libraries)
- **Package managers:** `conda`, `pip`, `apt-get`, `npm`, `yarn`, `sdk`, `poetry`, `gdebi`...

The full list of installed tools can be found in the [Dockerfile](https://github.com/ml-tooling/ml-workspace/blob/main/Dockerfile).

> _For every minor version release, we run vulnerability, virus, and security checks within the workspace using [safety](https://pyup.io/safety/), [clamav](https://www.clamav.net/), [trivy](https://github.com/aquasecurity/trivy), and [snyk via docker scan](https://docs.docker.com/engine/scan/) to make sure that the workspace environment is as secure as possible. We are committed to fixing and preventing all high- or critical-severity vulnerabilities.
You can find some up-to-date reports [here](https://github.com/ml-tooling/ml-workspace/tree/main/resources/reports)._

### Extensibility

The workspace provides a high degree of extensibility. Within the workspace, you have **full root & sudo privileges** to install any library or tool you need via terminal (e.g., `pip`, `apt-get`, `conda`, or `npm`). You can open a terminal in one of the following ways:

- **Jupyter:** `New -> Terminal`
- **Desktop VNC:** `Applications -> Terminal Emulator`
- **JupyterLab:** `File -> New -> Terminal`
- **VS Code:** `Terminal -> New Terminal`

Additionally, pre-installed tools such as Jupyter, JupyterLab, and Visual Studio Code each provide their own rich ecosystem of extensions. The workspace also contains a [collection of installer scripts](https://github.com/ml-tooling/ml-workspace/tree/main/resources/tools) for many commonly used development tools and libraries (e.g., `PyCharm`, `Zeppelin`, `RStudio`, `Starspace`). You can find and execute all tool installers via `Open Tool -> Install Tool`. Those scripts can also be executed from the Desktop VNC (double-click on the script within the `Tools` folder on the Desktop VNC).

<details>
<summary>Example (click to expand...)</summary>

For example, to install the [Apache Zeppelin](https://zeppelin.apache.org/) notebook server, simply execute:

```bash
/resources/tools/zeppelin.sh --port=1234
```

After installation, refresh the Jupyter website and the Zeppelin tool will be available under `Open Tool -> Zeppelin`.
Other tools might only be available within the Desktop VNC (e.g., `atom` or `pycharm`) or do not provide any UI (e.g., `starspace`, `docker-client`).
</details>

As an alternative to extending the workspace at runtime, you can also customize the workspace Docker image to create your own flavor, as explained in the [FAQ](#faq) section.

---

<br>

## FAQ

<details>
<summary><b>How to customize the workspace image (create your own flavor)?</b> (click to expand...)</summary>

The workspace can be extended in many ways at runtime, as explained [here](#extensibility). However, if you would like to customize the workspace image with your own software or configuration, you can do that via a Dockerfile, as shown below:

```dockerfile
# Extend from any of the workspace versions/flavors
FROM mltooling/ml-workspace:0.13.2

# Run your customizations, e.g.:
RUN \
    # Install r-runtime, r-kernel, and r-studio web server from provided install scripts
    /bin/bash $RESOURCES_PATH/tools/r-runtime.sh --install && \
    /bin/bash $RESOURCES_PATH/tools/r-studio-server.sh --install && \
    # Cleanup layer - removes unnecessary cache files
    clean-layer.sh
```

Finally, use [docker build](https://docs.docker.com/engine/reference/commandline/build/) to build your customized Docker image.

> 📖 _For a more comprehensive Dockerfile example, take a look at the [Dockerfile of the R-flavor](https://github.com/ml-tooling/ml-workspace/blob/main/r-flavor/Dockerfile)._

</details>

<details>
<summary><b>How to update a running workspace container?</b> (click to expand...)</summary>

To update a running workspace instance to a more recent version, the running Docker container needs to be replaced with a new container based on the updated workspace image.

All data within the workspace that is not persisted
to a mounted volume will be lost during this update process. As mentioned in the [persist data](#Persist-Data) section, a volume is expected to be mounted into the `/workspace` folder. All tools within the workspace are configured to make use of the `/workspace` folder as the root directory for all source code and data artifacts. During an update, data within other directories will be removed, including installed/updated libraries or certain machine configurations. We have integrated a backup and restore feature (`CONFIG_BACKUP_ENABLED`) for various selected configuration files/folders, such as the user's Jupyter/VS-Code configuration, `~/.gitconfig`, and `~/.ssh`.\n\n\u003cdetails\u003e\n\n\u003csummary\u003eUpdate Example (click to expand...)\u003c/summary\u003e\n\nIf the workspace is deployed via Docker (Kubernetes will have a different update process), you need to remove the existing container (via `docker rm`) and start a new one (via `docker run`) with the newer workspace image. Make sure to use the same configuration, volume, name, and port. For example, if a workspace (image version `0.8.7`) was started with this command:\n```bash\ndocker run -d \\\n    -p 8080:8080 \\\n    --name \"ml-workspace\" \\\n    -v \"/path/on/host:/workspace\" \\\n    --env AUTHENTICATE_VIA_JUPYTER=\"mytoken\" \\\n    --restart always \\\n    mltooling/ml-workspace:0.8.7\n```\nand needs to be updated to version `0.9.1`, you need to:\n\n1. Stop and remove the running workspace container: `docker stop \"ml-workspace\" \u0026\u0026 docker rm \"ml-workspace\"`\n2. 
Start a new workspace container with the newer image and same configuration: `docker run -d -p 8080:8080 --name \"ml-workspace\" -v \"/path/on/host:/workspace\" --env AUTHENTICATE_VIA_JUPYTER=\"mytoken\" --restart always mltooling/ml-workspace:0.9.1`\n\n\u003c/details\u003e\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eHow to configure the VNC server?\u003c/b\u003e (click to expand...)\u003c/summary\u003e\n\nIf you want to directly connect to the workspace via a VNC client (not using the [noVNC webapp](#desktop-gui)), you might be interested in changing certain VNC server configurations. To configure the VNC server, you can provide/overwrite the following environment variables at container start (via docker run option: `--env`):\n\n\u003ctable\u003e\n    \u003ctr\u003e\n        \u003cth\u003eVariable\u003c/th\u003e\n        \u003cth\u003eDescription\u003c/th\u003e\n        \u003cth\u003eDefault\u003c/th\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003eVNC_PW\u003c/td\u003e\n        \u003ctd\u003ePassword of VNC connection. This password only needs to be secure if the VNC server is directly exposed. If it is used via noVNC, it is already protected based on the configured authentication mechanism.\u003c/td\u003e\n        \u003ctd\u003evncpassword\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003eVNC_RESOLUTION\u003c/td\u003e\n        \u003ctd\u003eDefault desktop resolution of VNC connection. 
When using noVNC, the resolution will be dynamically adapted to the window size.\u003c/td\u003e\n        \u003ctd\u003e1600x900\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n        \u003ctd\u003eVNC_COL_DEPTH\u003c/td\u003e\n        \u003ctd\u003eDefault color depth of VNC connection.\u003c/td\u003e\n        \u003ctd\u003e24\u003c/td\u003e\n    \u003c/tr\u003e\n\u003c/table\u003e\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eHow to use a non-root user within the workspace?\u003c/b\u003e (click to expand...)\u003c/summary\u003e\n\nUnfortunately, we currently do not support using a non-root user within the workspace. We plan to provide this capability and have already started with some refactoring to allow this configuration. However, this still requires a lot more work, refactoring, and testing on our side.\n\nUsing the root user (or users with sudo permissions) within containers is generally not recommended since, in case of system/kernel vulnerabilities, a user might be able to break out of the container and access the host system. Since it is not very common to have such problematic kernel vulnerabilities, the risk of a severe attack is quite minimal. As explained in the [official Docker documentation](https://docs.docker.com/engine/security/security/#linux-kernel-capabilities), containers (even with root users) are generally quite secure in preventing a breakout to the host. And compared to many other container use cases, we actually want to give the user the flexibility to have control and system-level installation permissions within the workspace container.\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eHow to create and use a virtual environment?\u003c/b\u003e (click to expand...)\u003c/summary\u003e\n\nThe workspace comes preinstalled with various common tools to create isolated Python environments (virtual environments). 
The following sections provide a quick intro on how to use these tools within the workspace. You can find information on when to use which tool [here](https://stackoverflow.com/a/41573588). Please refer to the documentation of the given tool for additional usage information.\n\n**venv** (recommended):\n\nTo create a virtual environment via [venv](https://docs.python.org/3/tutorial/venv.html), execute the following commands:\n\n```bash\n# Create environment in the working directory\npython -m venv my-venv\n# Activate environment in shell\nsource ./my-venv/bin/activate\n# Optional: Create Jupyter kernel for this environment\npip install ipykernel\npython -m ipykernel install --user --name=my-venv --display-name=\"my-venv ($(python --version))\"\n# Optional: Close environment session\ndeactivate\n```\n\n**pipenv** (recommended):\n\nTo create a virtual environment via [pipenv](https://pipenv.pypa.io/en/latest/), execute the following commands:\n\n```bash\n# Create environment in the working directory\npipenv install\n# Activate environment session in shell\npipenv shell\n# Optional: Create Jupyter kernel for this environment\npipenv install ipykernel\npython -m ipykernel install --user --name=my-pipenv --display-name=\"my-pipenv ($(python --version))\"\n# Optional: Close environment session\nexit\n```\n\n**virtualenv**:\n\nTo create a virtual environment via [virtualenv](https://virtualenv.pypa.io/en/latest/), execute the following commands:\n\n```bash\n# Create environment in the working directory\nvirtualenv my-virtualenv\n# Activate environment session in shell\nsource ./my-virtualenv/bin/activate\n# Optional: Create Jupyter kernel for this environment\npip install ipykernel\npython -m ipykernel install --user --name=my-virtualenv --display-name=\"my-virtualenv ($(python --version))\"\n# Optional: Close environment session\ndeactivate\n```\n\n**conda**:\n\nTo create a virtual environment via 
[conda](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html), execute the following commands:\n\n```bash\n# Create environment (globally)\nconda create -n my-conda-env\n# Activate environment session in shell\nconda activate my-conda-env\n# Optional: Create Jupyter kernel for this environment\npython -m ipykernel install --user --name=my-conda-env --display-name=\"my-conda-env ($(python --version))\"\n# Optional: Close environment session\nconda deactivate\n```\n\n**Tip: Shell Commands in Jupyter Notebooks:**\n\nIf you install and use a virtual environment via a dedicated Jupyter Kernel and use shell commands within Jupyter (e.g. `!pip install matplotlib`), the wrong python/pip version will be used. To use the python/pip version of the selected kernel, do the following instead:\n\n```python\nimport sys\n!{sys.executable} -m pip install matplotlib\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eHow to install a different Python version?\u003c/b\u003e (click to expand...)\u003c/summary\u003e\n\nThe workspace provides three easy options to install different Python versions alongside the main Python instance: [pyenv](https://github.com/pyenv/pyenv), [pipenv](https://pipenv.pypa.io/en/latest/cli/) (recommended), and [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html).\n\n**pipenv** (recommended):\n\nTo install a different python version (e.g. `3.7.8`) within the workspace via [pipenv](https://pipenv.pypa.io/en/latest/cli/), execute the following commands:\n\n```bash\n# Install python version\npipenv install --python=3.7.8\n# Activate environment session in shell\npipenv shell\n# Check python installation\npython --version\n# Optional: Create Jupyter kernel for this environment\npipenv install ipykernel\npython -m ipykernel install --user --name=my-pipenv --display-name=\"my-pipenv ($(python --version))\"\n# Optional: Close environment session\nexit\n```\n\n**pyenv**:\n\nTo install a different python version (e.g. 
`3.7.8`) within the workspace via [pyenv](https://github.com/pyenv/pyenv), execute the following commands:\n\n```bash\n# Install python version\npyenv install 3.7.8\n# Make globally accessible\npyenv global 3.7.8\n# Activate python version in shell\npyenv shell 3.7.8\n# Check python installation\npython3.7 --version\n# Optional: Create Jupyter kernel for this python version\npython3.7 -m pip install ipykernel\npython3.7 -m ipykernel install --user --name=my-pyenv-3.7.8 --display-name=\"my-pyenv (Python 3.7.8)\"\n```\n\n**conda**:\n\nTo install a different python version (e.g. `3.7.8`) within the workspace via [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html), execute the following commands:\n\n```bash\n# Create environment with python version\nconda create -n my-conda-3.7 python=3.7.8\n# Activate environment session in shell\nconda activate my-conda-3.7\n# Check python installation\npython --version\n# Optional: Create Jupyter kernel for this python version\npip install ipykernel\npython -m ipykernel install --user --name=my-conda-3.7 --display-name=\"my-conda ($(python --version))\"\n# Optional: Close environment session\nconda deactivate\n```\n\n**Tip: Shell Commands in Jupyter Notebooks:**\n\nIf you install and use another Python version via a dedicated Jupyter Kernel and use shell commands within Jupyter (e.g. `!pip install matplotlib`), the wrong python/pip version will be used. To use the python/pip version of the selected kernel, do the following instead:\n\n```python\nimport sys\n!{sys.executable} -m pip install matplotlib\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eCan I publish a port other than the default one to access a tool inside the container?\u003c/b\u003e (click to expand...)\u003c/summary\u003e\nYou can do this, but please be aware that this port is then \u003cb\u003enot\u003c/b\u003e protected by the workspace's authentication mechanism! 
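For illustration, publishing an extra port only requires an additional `-p` mapping at container start. This is a minimal sketch; the extra port `8090` and whatever tool listens on it are hypothetical:

```shell
# Publish the default workspace port plus an extra port.
# WARNING: the extra port (8090 here) bypasses the workspace's authentication.
docker run -d \
    -p 8080:8080 \
    -p 8090:8090 \
    --name "ml-workspace" \
    mltooling/ml-workspace:0.13.2
```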
For security reasons, we therefore highly recommend using the \u003ca href=\"#access-ports\"\u003eAccess Ports\u003c/a\u003e functionality of the workspace.\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eSystem and Tool Translations\u003c/b\u003e (click to expand...)\u003c/summary\u003e\nIf you want to configure a language other than English in your workspace and some tools are not translated properly, have a look \u003ca href=\"https://github.com/ml-tooling/ml-workspace/issues/70#issuecomment-841863145\"\u003eat this issue\u003c/a\u003e. Try to comment out the 'exclude translations' line in `/etc/dpkg/dpkg.cfg.d/excludes` and re-install/configure the package.\n\u003c/details\u003e\n\n---\n\n\u003cbr\u003e\n\n## Known Issues\n\n\u003cdetails\u003e\n\n\u003csummary\u003e\u003cb\u003eToo small shared memory might crash tools or scripts\u003c/b\u003e (click to expand...)\u003c/summary\u003e\n\nCertain desktop tools (e.g., recent versions of [Firefox](https://github.com/jlesage/docker-firefox#increasing-shared-memory-size)) or libraries (e.g., PyTorch - see issues: [1](https://github.com/pytorch/pytorch/issues/2244), [2](https://github.com/pytorch/pytorch/issues/1355)) might crash if the shared memory size (`/dev/shm`) is too small. The default shared memory size of Docker is 64MB, which might not be enough for some tools. You can provide a higher shared memory size via the `shm-size` docker run option:\n\n```bash\ndocker run --shm-size=2G mltooling/ml-workspace:0.13.2\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\n\u003csummary\u003e\u003cb\u003eMultiprocessing code is unexpectedly slow\u003c/b\u003e (click to expand...)\u003c/summary\u003e\n\nIn general, the performance of running code within Docker is [nearly identical](https://stackoverflow.com/questions/21889053/what-is-the-runtime-performance-cost-of-a-docker-container) compared to running it directly on the machine. 
However, in case you have limited the container's CPU quota (as explained in [this section](#limit-memory--cpu)), the container can still see the full number of CPU cores available on the machine, and there is no technical way to prevent this. Many libraries and tools will use the full CPU count (e.g., via `os.cpu_count()`) to set the number of threads used for multiprocessing/multithreading. This might cause the program to start more threads/processes than it can efficiently handle with the available CPU quota, which can drastically slow down the overall performance. Therefore, it is important to explicitly set the available CPU count or the maximum number of threads to the configured CPU quota. The workspace provides capabilities to detect the number of available CPUs automatically, which are used to configure a variety of common libraries via environment variables such as `OMP_NUM_THREADS` or `MKL_NUM_THREADS`. It is also possible to explicitly set the number of available CPUs at container startup via the `MAX_NUM_THREADS` environment variable (see [configuration section](https://github.com/ml-tooling/ml-workspace#configuration-options)). The same environment variable can also be used to read the number of available CPUs at runtime.\n\nEven though the automatic configuration capabilities of the workspace will fix a variety of inefficiencies, we still recommend configuring the number of available CPUs with all libraries explicitly. 
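A small sketch of reading that thread budget defensively first (the fallback to `os.cpu_count()` is an assumption for runs where `MAX_NUM_THREADS` happens to be unset, e.g. outside the workspace):

```python
import os

# Prefer the CPU budget configured by the workspace; fall back to the
# visible core count (assumption: the code may also run outside the workspace).
max_num_threads = int(os.getenv("MAX_NUM_THREADS") or os.cpu_count() or 1)
print(max_num_threads)
```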
For example:\n\n```python\nimport os\nMAX_NUM_THREADS = int(os.getenv(\"MAX_NUM_THREADS\"))\n\n# Set in pytorch\nimport torch\ntorch.set_num_threads(MAX_NUM_THREADS)\n\n# Set in tensorflow (1.x API)\nimport tensorflow as tf\nconfig = tf.ConfigProto(\n    device_count={\"CPU\": MAX_NUM_THREADS},\n    inter_op_parallelism_threads=MAX_NUM_THREADS,\n    intra_op_parallelism_threads=MAX_NUM_THREADS,\n)\ntf_session = tf.Session(config=config)\n\n# Set session for keras\nimport keras.backend as K\nK.set_session(tf_session)\n\n# Set in sklearn estimator\nfrom sklearn.linear_model import LogisticRegression\nLogisticRegression(n_jobs=MAX_NUM_THREADS).fit(X, y)\n\n# Set for multiprocessing pool\nfrom multiprocessing import Pool\n\nwith Pool(MAX_NUM_THREADS) as pool:\n    results = pool.map(my_function, lst)\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\n\u003csummary\u003e\u003cb\u003eNginx terminates with SIGILL core dumped error\u003c/b\u003e (click to expand...)\u003c/summary\u003e\n\nIf you encounter the following error within the container logs when starting the workspace, it will most likely not be possible to run the workspace on your hardware:\n\n```\nexited: nginx (terminated by SIGILL (core dumped); not expected)\n```\n\nThe OpenResty/Nginx binary package used within the workspace requires a CPU with `SSE4.2` support (see [this issue](https://github.com/openresty/openresty/issues/267#issuecomment-309296900)). Unfortunately, some older CPUs do not have support for `SSE4.2` and, therefore, will not be able to run the workspace container. On Linux, you can check whether your CPU supports `SSE4.2` by looking at the flags section in the output of `cat /proc/cpuinfo`. If you encounter this problem, feel free to notify us by commenting on the following issue: [#30](https://github.com/ml-tooling/ml-workspace/issues/30).\n\n\u003c/details\u003e\n\n---\n\n\u003cbr\u003e\n\n## Contribution\n\n- Pull requests are encouraged and always welcome. 
Read our [contribution guidelines](https://github.com/ml-tooling/ml-workspace/tree/main/CONTRIBUTING.md) and check out [help-wanted](https://github.com/ml-tooling/ml-workspace/issues?utf8=%E2%9C%93\u0026q=is%3Aopen+is%3Aissue+label%3A\"help+wanted\"+sort%3Areactions-%2B1-desc+) issues.\n- Submit GitHub issues for any [feature requests and enhancements](https://github.com/ml-tooling/ml-workspace/issues/new?assignees=\u0026labels=feature\u0026template=02_feature-request.md\u0026title=), [bugs](https://github.com/ml-tooling/ml-workspace/issues/new?assignees=\u0026labels=bug\u0026template=01_bug-report.md\u0026title=), or [documentation](https://github.com/ml-tooling/ml-workspace/issues/new?assignees=\u0026labels=documentation\u0026template=03_documentation.md\u0026title=) problems.\n- By participating in this project, you agree to abide by its [Code of Conduct](https://github.com/ml-tooling/ml-workspace/blob/main/.github/CODE_OF_CONDUCT.md).\n- The [development section](#development) below contains information on how to build and test the project after you have implemented some changes.\n\n## Development\n\n\u003e _**Requirements**: [Docker](https://docs.docker.com/get-docker/) and [Act](https://github.com/nektos/act#installation) need to be installed on your machine to execute the build process._\n\nTo simplify the process of building this project from scratch, we provide build scripts (based on [universal-build](https://github.com/ml-tooling/universal-build)) that run all necessary steps (build, test, and release) within a containerized environment. To build and test your changes, execute the following command in the project root folder:\n\n```bash\nact -b -j build\n```\n\nUnder the hood, it uses the `build.py` files in this repo, based on the [universal-build library](https://github.com/ml-tooling/universal-build). 
So, if you want to build it locally, you can also execute this command in the project root folder to build the Docker container:\n\n```bash\npython build.py --make\n```\n\nFor additional script options:\n\n```bash\npython build.py --help\n```\n\nRefer to our [contribution guides](https://github.com/ml-tooling/ml-workspace/blob/main/CONTRIBUTING.md#development-instructions) for more detailed information on our build scripts and development process.\n\n---\n\nLicensed **Apache 2.0**. Created and maintained with ❤️\u0026nbsp; by developers from Berlin.