{"id":21721352,"url":"https://github.com/informaticsmatters/squonk2-data-manager-jupyter-operator","last_synced_at":"2026-04-13T17:02:59.624Z","repository":{"id":40500833,"uuid":"369136493","full_name":"InformaticsMatters/squonk2-data-manager-jupyter-operator","owner":"InformaticsMatters","description":"A Kubernetes operator for Jupyer","archived":false,"fork":false,"pushed_at":"2024-12-20T07:48:32.000Z","size":128,"stargazers_count":0,"open_issues_count":4,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-01-25T18:43:18.995Z","etag":null,"topics":["k8s-operator","squonk2"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/InformaticsMatters.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-05-20T08:28:56.000Z","updated_at":"2024-12-20T07:48:35.000Z","dependencies_parsed_at":"2024-12-04T11:20:54.490Z","dependency_job_id":"822dd7fe-2f13-4966-844e-bc4662717c56","html_url":"https://github.com/InformaticsMatters/squonk2-data-manager-jupyter-operator","commit_stats":null,"previous_names":[],"tags_count":40,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/InformaticsMatters%2Fsquonk2-data-manager-jupyter-operator","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/InformaticsMatters%2Fsquonk2-data-manager-jupyter-operator/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/InformaticsMatters%2Fsquonk2-data-manager-jup
yter-operator/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/InformaticsMatters%2Fsquonk2-data-manager-jupyter-operator/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/InformaticsMatters","download_url":"https://codeload.github.com/InformaticsMatters/squonk2-data-manager-jupyter-operator/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244693751,"owners_count":20494503,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["k8s-operator","squonk2"],"created_at":"2024-11-26T02:16:00.054Z","updated_at":"2026-04-13T17:02:59.618Z","avatar_url":"https://github.com/InformaticsMatters.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# A Jupyter Application Operator (for the Data Manager API)\n\n[![Data Manager: Application](https://img.shields.io/badge/squonk2%20data%20manager-application-000000?labelColor=dc332e)]()\n[![Dev Stage: 1](https://img.shields.io/badge/dev%20stage-★☆☆%20%281%29-000000?labelColor=dc332e)](https://github.com/InformaticsMatters/code-repository-development-stages)\n\n![Architecture](https://img.shields.io/badge/architecture-amd64%20%7C%20arm64-lightgrey)\n\n[![build](https://github.com/informaticsmatters/squonk2-data-manager-jupyter-operator/actions/workflows/build.yaml/badge.svg)](https://github.com/informaticsmatters/squonk2-data-manager-jupyter-operator/actions/workflows/build.yaml)\n[![build 
tag](https://github.com/informaticsmatters/squonk2-data-manager-jupyter-operator/actions/workflows/build-tag.yaml/badge.svg)](https://github.com/informaticsmatters/squonk2-data-manager-jupyter-operator/actions/workflows/build-tag.yaml)\n\n![GitHub](https://img.shields.io/github/license/informaticsmatters/squonk2-data-manager-jupyter-operator)\n\n![GitHub tag (latest SemVer pre-release)](https://img.shields.io/github/v/tag/informaticsmatters/squonk2-data-manager-jupyter-operator?include_prereleases)\n\n[![Conventional Commits](https://img.shields.io/badge/Conventional%20Commits-1.0.0-yellow.svg)](https://conventionalcommits.org)\n[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit\u0026logoColor=white)](https://github.com/pre-commit/pre-commit)\n[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n\nThis repo contains a Kubernetes _Operator_ based on the [kopf] and [kubernetes]\nPython packages that is used by the **Informatics Matters Squonk2 Data Manager API**\nto create Jupyter Notebooks for the Data Manager service.\n\nThe operator's Custom Resource Definition (CRD) can be found in\n`roles/operator/files`.\n\nBy default, the operator creates instances using the Jupyter image: -\n\n-   `jupyter/minimal-notebook:notebook-6.3.0` (see `handlers.py`)\n\nPrerequisites: -\n\n-   Python\n-   Docker\n-   A kubernetes config file\n-   A compatible Kubernetes (e.g. 
1.32 through 1.34 if the operator is built for 1.33)\n\n## Contributing\nThe project uses: -\n\n- [pre-commit] to enforce linting of files prior to committing them to the\n  upstream repository\n- [Commitizen] to enforce a [Conventional Commit] commit message format\n- [Black] as a code formatter\n\nYou **MUST** comply with these choices in order to contribute to the project.\n\nTo get started, review the pre-commit utility and the conventional commit style\nand then set up your local clone by following the **Installation** and\n**Quick Start** sections: -\n\n    pip install -r build-requirements.txt\n    pre-commit install -t commit-msg -t pre-commit\n\nNow the project's rules will run on every commit, and you can check the\ncurrent health of your clone with: -\n\n    pre-commit run --all-files\n\n## Building the operator (local development)\nPre-requisites: -\n\n- Docker Compose (v2)\n\nThe operator container, residing in the `operator` directory,\nis automatically built and pushed to Docker Hub using GitHub Actions.\n\nYou can build and push the image yourself using Docker Compose.\nThe following will build an operator image with a specific tag: -\n\n    export IMAGE_TAG=34.0.0-alpha.1\n    docker compose build\n    docker compose push\n\n## Deploying into the Data Manager API\nWe use [Ansible] 3 and community modules in [Ansible Galaxy] as the deployment\nmechanism, using the `operator` Ansible role in this repository and a\nKubernetes config (KUBECONFIG). All of this is done via a suitable Python\nenvironment using the requirements in the root of the project...\n\n    python -m venv venv\n    source venv/bin/activate\n    pip install --upgrade pip\n    pip install -r requirements.txt\n\nSet your KUBECONFIG for the cluster and verify it's right: -\n\n    export KUBECONFIG=~/k8s-config/local-config\n    kubectl get no\n    [...]\n\nNow, create a parameter file (i.e. 
`parameters.yaml`) based on the project's\n`example-parameters.yaml`, setting values for the operator that match your\nneeds. Then deploy, using Ansible, from the root of the project: -\n\n    export PARAMS=parameters\n    ansible-playbook -e @${PARAMS}.yaml site.yaml\n\nThat deploys the operator and its CRD to your chosen operator namespace.\nTo deploy the Data Manager RBAC and Jupyter notebook configuration objects\nyou need to run the `site_dm.yaml` playbook: -\n\n    ansible-playbook -e @${PARAMS}.yaml site_dm.yaml\n\n\u003e   If deploying to multiple Data Managers you should just need one operator\n    and then deploy RBACs to each DM namespace. Remember to also adjust the\n    annotations of the CRD so each DM namespace recognises it as a valid\n    application.\n\nTo remove the operator (assuming there are no operator-derived instances)...\n\n    ansible-playbook -e @${PARAMS}.yaml -e jo_state=absent site.yaml\n\n\u003e   The current Data Manager API assumes that once an Application (operator)\n    has been installed it is not removed. 
So, removing the operator here\n    is described simply to illustrate a 'clean-up' - you would not\n    normally remove an Application operator in a production environment.\n\n### Deploying to the official cluster\nThe parameters used to deploy the operator to our 'official' cluster\nare held in this repository.\n\nTo deploy the operator itself run the main 'site' playbook with\na suitable set of parameters: -\n\n    export KUBECONFIG=~/k8s-config/config-aws-im-main-eks\n    export PARAMS=staging\n    ansible-playbook -e @${PARAMS}-parameters.yaml site.yaml\n\nThen, you must run the `site_dm` playbook for each Data Manager\nyou wish to configure: -\n\n    ansible-playbook -e @xchem-dev-integration-parameters.yaml site_dm.yaml\n\n    ansible-playbook -e @xchem-dev-test-parameters.yaml site_dm.yaml\n\nThis will install the RBAC and configuration objects for Jupyter\nto the corresponding DM namespaces.\n\n# Data Manager Application Compliance\nIn order to expose the CRD as an _Application_ in the Data Manager API service\nyou will need to a) annotate the CRD and b) provide a **Role** and\n**RoleBinding**.\n\n## Custom Resource Definition (CRD) annotations\nFor the **CRD** to be recognised by the Data Manager API it will need a number of\nannotations, located in its `metadata -\u003e annotations` block.\nYou will need: -\n\n-   An annotation `data-manager.informaticsmatters.com/application`\n    set to `'yes'`\n-   An annotation `data-manager.informaticsmatters.com/application-namespaces`\n    set to a colon-separated list of namespaces the Application is to be used\n    in. e.g. `'data-manager-api:data-manager-api-staging'`\n-   An annotation `data-manager.informaticsmatters.com/application-url-location`.\n    The url location is the 'status'-relative path in the custom resource\n    'status' block where the application URL can be extracted. 
A value of\n    `jupyter.notebook.url` would imply that the Application URL\n    can be found in the custom resource object using the Python dictionary\n    reference: `custom_resource['status']['jupyter']['notebook']['url']`.\n\n\u003e   Our CRD already contains suitable annotations\n    (see `roles/operator/files/crd.yaml`), so there's nothing more to\n    do here once you've deployed it (using Ansible in our case,\n    as described earlier).\n\n## Pod labels\nSo that **Pod** instances can be recognised by the Data Manager API the\napplication's **Pod** (only one if there are many) must contain the following\nlabel: -\n\n    data-manager.informaticsmatters.com/instance\n\nWhich must have a value that matches the `name` given to the operator\nby the Data Manager. The name is a unique reference for the application\ninstance.\n\n\u003e   See the `spec.template.metadata.labels` block in the `deployment_body`\n    section of the `create()` function in our `operator/handlers.py`.\n\n## Role and RoleBinding definitions\nAs well as providing RBAC for the Operator you will need a **Role** and\n**RoleBinding** to allow the Data Manager to execute the Operator. These must\nallow the Data Manager to launch instances of the Custom Resource in the\nData Manager's **Namespace**.\n\nTypical **Role** and **RoleBinding** definitions are provided in this\nrepository. Once you define yours you'll just need to create them: -\n\n    kubectl create -f data-manager-rbac.yaml\n\nWith this done the application should be visible through the Data Manager API's\n**/application** REST endpoint.\n\n## Security context\nThe Custom Resource must expose properties that allow a custom\n**SecurityContext** to be applied. If not, the application instance will not be\nable to access the Data Manager Project files. 
The Data-Manager API will\nexpect to provide the following properties through the **CRD** schema: -\n\n-   `spec.securityContext.runAsUser`\n-   `spec.securityContext.runAsGroup`\n\nTo run successfully the container must be able to run without privileges\nand run using a user and group that is assigned by the Data Manager API.\n\n\u003e   See our handling of these values in the `create()` function\n    of our `operator/handlers.py` and their definitions\n    in `roles/operator/files/crd.yaml`\n\n## Storage volume\nIn order to place Data-Manager Project files the **CRD** must\nexpose the following properties through its schema: -\n\n-   `spec.project.claimName`\n-   `spec.project.id`\n\nThese will be expected to provide a suitable volume mount within the\napplication **Pod** for the Project files.\n\n\u003e   See our use of these values in `roles/operator/files/crd.yaml`.\n\n## Instance certificate variables\nApplications can use the DM-API ingress, if they use path-based routing,\nand are happy to share the DM-API domain. Doing this means you won't need\na separate TLS certificate, instead using the Data Manager's.\n\nThe Jupyter operator supports this via a Pod environment variable that is\nset if you provide a value for the Ansible playbook variable\n`jo_ingress_tls_secret`. If left blank the operator will expect to use the\nKubernetes [Certificate Manager], where you are expected to provide the\ncertificate issuer name using the playbook variable `jo_ingress_cert_issuer`.\n\nBoth are exposed in the example parameter file `example-parameters.yaml`.\n\n## Populating the home directory\nA number of key files are prepared by the built-in `/usr/local/bin/start.sh` script\nthat the operator creates (via a **ConfigMap**). 
This script, used as the\ncontainer's **command**, will also recursively copy the content of the container image's\n`/home/code/copy-to-startup` directory (if it exists) to the parent of the\n`$HOME` (`~`) directory (the parent of the instance directory) prior to\nrunning Jupyter.\n\nThe script copies files using the command: -\n\n    cp -r -u /home/code/copy-to-startup/* ~/..\n\n---\n\n[ansible]: https://www.ansible.com\n[ansible galaxy]: https://galaxy.ansible.com\n[black]: https://black.readthedocs.io/en/stable\n[certificate manager]: https://cert-manager.io/docs/installation/kubernetes/\n[commitizen]: https://commitizen-tools.github.io/commitizen/\n[conventional commit]: https://www.conventionalcommits.org/en/v1.0.0/\n[kopf]: https://pypi.org/project/kopf/\n[kubernetes]: https://pypi.org/project/kubernetes/\n[pre-commit]: https://pre-commit.com\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finformaticsmatters%2Fsquonk2-data-manager-jupyter-operator","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Finformaticsmatters%2Fsquonk2-data-manager-jupyter-operator","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Finformaticsmatters%2Fsquonk2-data-manager-jupyter-operator/lists"}