{"id":19746403,"url":"https://github.com/astrolabsoftware/fink-k8s","last_synced_at":"2026-03-05T18:31:32.457Z","repository":{"id":60832101,"uuid":"289915637","full_name":"astrolabsoftware/fink-k8s","owner":"astrolabsoftware","description":"Host files and procedure for running Fink on Kubernetes","archived":false,"fork":false,"pushed_at":"2023-01-26T14:38:08.000Z","size":37505,"stargazers_count":3,"open_issues_count":0,"forks_count":2,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-05-07T18:22:33.367Z","etag":null,"topics":["apache-spark","fink","kubernetes"],"latest_commit_sha":null,"homepage":"","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/astrolabsoftware.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-08-24T12:01:14.000Z","updated_at":"2022-10-20T07:46:00.000Z","dependencies_parsed_at":"2023-02-14T18:16:06.393Z","dependency_job_id":null,"html_url":"https://github.com/astrolabsoftware/fink-k8s","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/astrolabsoftware%2Ffink-k8s","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/astrolabsoftware%2Ffink-k8s/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/astrolabsoftware%2Ffink-k8s/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/astrolabsoftware%2Ffink-k8s/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/astrolabsoftware","download_url":"https://codeload.github.com/astrolabsoftware/
fink-k8s/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224202818,"owners_count":17272807,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apache-spark","fink","kubernetes"],"created_at":"2024-11-12T02:14:28.864Z","updated_at":"2025-10-12T01:45:06.044Z","avatar_url":"https://github.com/astrolabsoftware.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DEPRECATED\n\nThis is no longer supported, please consider using:\n- https://github.com/astrolabsoftware/k8s-spark-py (build Spark image for k8s)\n- https://github.com/astrolabsoftware/fink-broker (run fink-broker on k8s)\ninstead.\n\n# Using Fink on Kubernetes engine\n\nThis repository hosts files and procedure to run [Fink](https://github.com/astrolabsoftware/fink-broker) on Kubernetes.\n\n## Continuous integration for master branch\n\nBuild Fink and run Fink integration tests on Kubernetes\n\n| CI       | Status                                                                                                                                                           | Image build  | e2e tests | Documentation generation        | Static code analysis  | Image security scan |\n|----------|------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------|-----------|---------------------------------|-----------------------|---------------------|\n| GHA      | [![Fink 
CI](https://github.com/astrolabsoftware/fink-k8s/workflows/CI/badge.svg?branch=master)](https://github.com/astrolabsoftware/fink-k8s/actions?query=workflow%3A\"CI\") | Yes          | Yes        | No | No                   | Yes                 |\n\n\n## Compatibility matrix and images\n\nYou can already test Fink on Kubernetes using our [official images](https://hub.docker.com/r/julienpeloton/fink/tags). The versions that have been tested are summarised below:\n\n| Fink version | Spark version | Kubernetes version| Image       | Status      |\n|--------------|---------------|-------------------|-------------|-------------|\n| 2.4          | 3.1.3         | 1.18              | julienpeloton/finkk8sdev:2.4_3.1.3 | production  |\n| 0.7.0        | 2.4.4         | 1.15              | julienpeloton/fink:0.7.0_2.4.4 | deprecated  |\n\nYou can try other combinations, but there is no guarantee that they will work. You would simply use:\n\n```bash\nspark-submit --master $MASTERURL \\\n     --deploy-mode cluster \\\n     --conf spark.kubernetes.container.image=julienpeloton/finkk8sdev:2.4_3.1.3 \\\n     $OTHERCONF \\\n     /home/fink-broker/bin/stream2raw.py \\\n     $ARGS\n```\n\nSee below for a full working example.\n\n## Pre-requisites\n\nClone the source code:\n\n```bash\ngit clone https://github.com/astrolabsoftware/fink-k8s.git\nFINKKUB=$PWD/fink-k8s\n```\n\nAll configuration parameters are available in `$FINKKUB/conf.sh`.\n\n## Kubernetes cluster installation\n\nInstructions for installing Kubernetes can be found in the official documentation. 
Alternatively, for test purposes, you can install minikube and run a local Kubernetes cluster.\n\n### Start a Kubernetes cluster with minikube\n\nMinikube installation (see https://minikube.sigs.k8s.io/docs/start) and startup are performed by the following command:\n\n```bash\n# If needed, edit some parameters in the Minikube section of $FINKKUB/conf.sh\n$FINKKUB/bin/run-minikube.sh\n```\n\nSee the compatibility matrix above to set your Kubernetes version correctly. We recommend running 1.18+ for the moment. If you intend to run Fink with Spark 2.4.x (not recommended), then you need to stick with Kubernetes version 1.15 maximum (see [here](https://issues.apache.org/jira/browse/SPARK-31786) and [there](https://github.com/apache/spark/pull/28625)).\n\nNote that it is recommended to allocate at least 4 CPUs and a fairly large amount of RAM (around 7GB).\n\n#### Troubleshooting\n\nOn recent Ubuntu (22.04), with the latest Docker (20.10+) and a fresh minikube installation, you will need at least Kubernetes version 1.20.\n\n### Manage Pods\n\nWe need to grant additional rights within the Kubernetes cluster so that Spark can manage pods. This is due to Spark’s architecture: we deploy a Spark Driver, which can then create the Spark Executors in pods and clean them up once the job is done.\n\n```bash\n# Set RBAC\n# see https://spark.apache.org/docs/latest/running-on-kubernetes.html#rbac\n# This step is performed by ./bin/itest.sh\nkubectl create serviceaccount spark --dry-run=client -o yaml | kubectl apply -f -\nkubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark \\\n  --namespace=default --dry-run=client -o yaml | kubectl apply -f -\n```\n\n## Build Fink docker image\n\n### Download Apache Spark\n\nFirst, you need to choose the Spark version with which you want to run Fink, by setting `SPARK_VERSION` in `$FINKKUB/conf.sh`. See the compatibility matrix above. We recommend using Spark 3.1.3 for the moment.\n\nThe following command will:\n1. 
Download and untar Spark in your preferred location (set `SPARK_INSTALL_DIR` in `$FINKKUB/conf.sh`)\n2. Install Fink additional files inside Spark\n\n```bash\n# Assuming Scala 2.11\n# If needed, edit some parameters in the Spark section of $FINKKUB/conf.sh\n$FINKKUB/bin/prereq-install.sh\n```\n\nWe have built an image based on `openjdk:11-jre` instead of the official `openjdk:11-jre-slim`, which was not suited to our Python environment (for example, we rely heavily on glibc, which is not in the alpine version). If you know how to easily build the Fink image using `openjdk:11-jre-slim`, contact us! Note that we do not release an image for R (not used), but feel free to contact us if you need it.\n\n### Build Fink image\n\nIf you are not using minikube, set your Docker account in the `REPO` variable in `$FINKKUB/conf.sh`. The `TAG` variable contains the Fink and Spark versions used.\n\nTo build the image, run:\n\n```bash\n\n# Use the `-m` option only if you are using minikube\n$FINKKUB/bin/docker-image-tool-fink.sh -m build\n```\n\nYou should end up with an image of around 6GB:\n\n```bash\neval $(minikube docker-env)\ndocker images\n\nREPOSITORY                                TAG         IMAGE ID       CREATED             SIZE\ntest/finkk8sdev                           2.4_3.1.3   373da5b4af53   9 minutes ago       5.99GB\n```\n\nWe are actively working on reducing the size of the image (most of the size is taken by dependencies). If you want to use this image in production (not with minikube), you also need to push the image:\n\n```bash\n./bin/docker-image-tool-fink.sh push\n```\n\n## Examples\n\n### Ingesting stream data with Fink \u0026 Kubernetes\n\nThe first step in Fink is to listen to a stream, decode the alerts, and store them on disk (`stream2raw`). 
This step can be run as follows:\n\n```bash\n# login\neval $(minikube docker-env)\n\n# get the apiserver ip\n# Beware, it is different for each new minikube k8s cluster\nkubectl cluster-info\n--\u003e Kubernetes master is running at https://127.0.0.1:32776\n--\u003e KubeDNS is running at ...\n\n# submit the job in cluster mode - 1 driver + 1 executor\nKAFKA_IPPORT=#fill me\nKAFKA_TOPIC=#fill me\nPRODUCER=sims\nFINK_ALERT_SCHEMA=/home/fink/fink-broker/schemas/1628364324215010017.avro\nKAFKA_STARTING_OFFSET=earliest\nONLINE_DATA_PREFIX=/home/fink/fink-broker/online\nFINK_TRIGGER_UPDATE=2\nLOG_LEVEL=INFO\n\nspark-submit --master k8s://https://127.0.0.1:32776 \\\n     --deploy-mode cluster \\\n     --conf spark.executor.instances=1 \\\n     --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \\\n     --conf spark.kubernetes.container.image=test/finkk8sdev:2.4_3.1.3 \\\n     --conf spark.driver.extraJavaOptions=\"-Divy.cache.dir=/home/fink -Divy.home=/home/fink\" \\\n     local:///home/fink/fink-broker/bin/stream2raw.py \\\n     -producer ${PRODUCER} \\\n     -servers ${KAFKA_IPPORT} -topic ${KAFKA_TOPIC} \\\n    -schema ${FINK_ALERT_SCHEMA} -startingoffsets_stream ${KAFKA_STARTING_OFFSET} \\\n    -online_data_prefix ${ONLINE_DATA_PREFIX} \\\n    -tinterval ${FINK_TRIGGER_UPDATE} -log_level ${LOG_LEVEL}\n```\n\nNote:\n\n- Servers are either ZTF/LSST ones (you need extra auth files), or Fink Kafka servers (replayed streams).\n- `online_data_prefix` should point to an HDFS (or S3) path in production (otherwise alerts will be stored inside the k8s cluster, and you won't be able to access them!).\n\n\nYou can launch the process above by using:\n\n```bash\n./bin/itest.sh\n```\n\n\n### Monitoring your job\n\nThe UI associated with any application can be accessed locally using `kubectl port-forward`:\n\n```bash\nkubectl port-forward \u003cdriver-pod-name\u003e 4040:4040\n```\n\nand then navigate to `http://localhost:4040`.\n\n### Terminating the job\n\nWe are running a 
streaming job, so just hitting Ctrl+C will not stop the job (which will continue forever in the pods). To actually terminate it, you need to delete the driver pod:\n\n```bash\nkubectl delete pod \u003cdriver-pod-name\u003e\n```\n\nBeware, if you kill an executor, it will be recreated.\n\n### Troubleshooting\n\nCheck pod status\n\n```bash\nkubectl get pods\nNAME                                 READY   STATUS    RESTARTS   AGE\nstream2raw-py-1598515989094-driver   1/1     Running   0          28m\nstream2raw-py-1598515989094-exec-1   1/1     Running   0          27m\n```\n\nAccess logs\n\n```bash\nkubectl logs \u003cpod-driver-or-executor-name\u003e\n```\n\nGet basic information about the scheduling decisions made around the driver pod\n\n```bash\nkubectl describe pod \u003cpod-driver-name\u003e\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fastrolabsoftware%2Ffink-k8s","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fastrolabsoftware%2Ffink-k8s","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fastrolabsoftware%2Ffink-k8s/lists"}