{"id":27939443,"url":"https://github.com/nextbreakpoint/flink-controller","last_synced_at":"2025-09-26T12:02:23.644Z","repository":{"id":54988040,"uuid":"169698273","full_name":"nextbreakpoint/flink-controller","owner":"nextbreakpoint","description":"Flink Controller implements a Kubernetes Custom Controller (aka Kubernetes Operator) for Apache Flink","archived":false,"fork":false,"pushed_at":"2024-12-20T20:01:09.000Z","size":8583,"stargazers_count":53,"open_issues_count":10,"forks_count":9,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-03-31T08:51:15.727Z","etag":null,"topics":["flink","infrastructure-as-code","kotlin","kubernetes","kubernetes-controller","kubernetes-operator","stream-processing"],"latest_commit_sha":null,"homepage":"","language":"Kotlin","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nextbreakpoint.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-02-08T07:19:05.000Z","updated_at":"2025-01-26T16:03:21.000Z","dependencies_parsed_at":"2024-11-17T20:44:58.143Z","dependency_job_id":"481d61fd-fe15-41cb-928a-a99c55db81c0","html_url":"https://github.com/nextbreakpoint/flink-controller","commit_stats":null,"previous_names":["nextbreakpoint/flink-controller"],"tags_count":17,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nextbreakpoint%2Fflink-controller","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nextbreakpoint%2Fflink-controller/tags","releases_url":"https://repos.ecosyste.ms/api/
v1/hosts/GitHub/repositories/nextbreakpoint%2Fflink-controller/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nextbreakpoint%2Fflink-controller/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nextbreakpoint","download_url":"https://codeload.github.com/nextbreakpoint/flink-controller/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252854530,"owners_count":21814699,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["flink","infrastructure-as-code","kotlin","kubernetes","kubernetes-controller","kubernetes-operator","stream-processing"],"created_at":"2025-05-07T09:47:40.576Z","updated_at":"2025-09-26T12:02:18.604Z","avatar_url":"https://github.com/nextbreakpoint.png","language":"Kotlin","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Flink Controller\n\nFlink Controller implements a Kubernetes Custom Controller (aka Kubernetes Operator) for Apache Flink.\nThat means you can operate Flink and manage Flink applications using Kubernetes native tooling like kubectl.  
\n\n## License\n\nFlink Controller is distributed under the terms of BSD 3-Clause License.\n\n    Copyright (c) 2019-2024, Andrea Medeghini\n    All rights reserved.\n\n    Redistribution and use in source and binary forms, with or without\n    modification, are permitted provided that the following conditions are met:\n\n    * Redistributions of source code must retain the above copyright notice, this\n      list of conditions and the following disclaimer.\n\n    * Redistributions in binary form must reproduce the above copyright notice,\n      this list of conditions and the following disclaimer in the documentation\n      and/or other materials provided with the distribution.\n\n    * Neither the name of the copyright holder nor the names of its\n      contributors may be used to endorse or promote products derived from\n      this software without specific prior written permission.\n\n    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\"\n    AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE\n    IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE\n    DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE\n    FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL\n    DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR\n    SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER\n    CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,\n    OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\n## Features\n\nThis is the list of the main features implemented in Flink Controller:\n- Manage deployment of clusters and applications (aka jobs)\n- Dynamically associate jobs with clusters\n- Provide resilient and self-healing infrastructure\n- Deploy a separate supervisor for each cluster\n- Automatically restart clusters or jobs when resources are modified\n- Automatically create a savepoint before stopping a job\n- Automatically recover from the latest savepoint when restarting a job\n- Support automatic and periodic savepoints\n- Support batch and stream jobs\n- Ability to rescale clusters and jobs via the Kubernetes scale interface\n- Support autoscaling based on custom metrics (compatible with HPA)\n- Allow configuration of init containers and sidecar containers in JobManager and TaskManager pods\n- Allow configuration of user annotations\n- Allow configuration of container resources, environment variables, ports, and volumes\n- Allow configuration of security context, service account, affinity, and tolerations\n- Support pull secrets and private registries\n- Support public Flink images or custom images\n- Provide CLI and REST interface to support operations\n- Provide metrics compatible with Prometheus\n\n## Overview\n\nFlink Controller is implemented as a single command line tool, called flinkctl, which fulfills four different functions:\n- it can launch the Operator process\n- it can launch the Supervisor 
process\n- it can launch the Bootstrap process\n- it provides a command line interface for interacting with the operator\n\nEach function is activated by passing the relevant arguments to the flinkctl command (see [MANUAL](https://github.com/nextbreakpoint/flink-controller/blob/master/MANUAL.md)).\n\nFlink Controller supports the following custom resources:\n- FlinkDeployment: it represents a cluster deployment, and it provides the configuration of the cluster with an optional list of jobs.\n- FlinkCluster: it represents a cluster, and it stores the configuration and the status of the cluster.\n- FlinkJob: it represents a job, and it stores the configuration and the status of the job.\n\nThe FlinkDeployment, FlinkCluster and FlinkJob resources are called primary resources.\n\nThe primary resources can be created and modified using Kubernetes native tooling like kubectl or helm.\n\nOnce installed, the operator observes a given namespace and reacts to any change in the primary resources in that namespace.\nInstalling the operator in a separate namespace, different from the one observed, and\nrestricting access to the operator's namespace is recommended to improve stability. Moreover, multiple operators are required\nto observe multiple namespaces (this might change in the near future).\n\nFor instance, when a deployment resource is created, the operator detects the resource and creates a supervisor,\nthen the supervisor creates the JobManager and TaskManagers, and bootstraps the jobs. Once the jobs are running,\nthe supervisor checks that they keep running, and it makes a best effort to recover from temporary issues.  
\n\nBoth the operator and the supervisor detect changes in the primary resources, and eventually create,\nupdate or delete one or more secondary resources, such as Pods, Services and BatchJobs. However,\nthey have different responsibilities: the operator is responsible for reconciling the status of all supervisors\n(one for each cluster), and the supervisor is responsible for reconciling the status of a cluster and its jobs.\n\nThe status of clusters and jobs is persisted in the primary resources, and it can be inspected with flinkctl or kubectl.\n\nThe supervisor can perform several tasks automatically, such as creating savepoints when a job is restarted,\nor restarting the cluster when the specification has changed. The automation provided by the operator and\nthe supervisor reduces the operational effort required to manage Flink on Kubernetes.\n\nThe deployment resource is convenient for defining a cluster with multiple jobs as a single resource.\nHowever, clusters and jobs can also be created independently. Jobs can be added to or removed from an existing\ncluster either by updating a deployment resource or by directly creating or deleting job resources.\n\nA job can be created independently of a cluster, but it can only be executed when there is a cluster for executing the job.\nA job is associated with a cluster based on the name of the resources. 
The rule is that the name of the FlinkJob resource\nmust start with the name of the corresponding FlinkCluster resource, like clustername-jobname.\n\nThe dependencies between resources are represented in the following graph:\n\n![Resource dependencies](/graphs/flink-operator.png \"Resource dependencies\")\n\n## Get started\n\nDownload the Docker image with the flinkctl command from Docker Hub:\n\n    docker pull nextbreakpoint/flinkctl:1.5.0\n\nExecute flinkctl as a Docker container:\n\n    docker run --rm -it nextbreakpoint/flinkctl:1.5.0 --help\n\nCheck out the quickstart example:\n\nhttps://github.com/nextbreakpoint/flink-controller/blob/master/example/README.md\n\n## Installation\n\nFlink Controller requires Kubernetes 1.31, and it supports Apache Flink 1.20.\n\n### Generate SSL certificates\n\nFlink Controller provides client and server components. The client component communicates with the server component over HTTP.\nTo ensure that the communication is secure, flinkctl can use HTTPS and SSL certificates for authentication.\n\nGenerate the required keystores and truststores with self-signed certificates:\n\n    ./secrets.sh flink-operator key-password keystore-password truststore-password\n\nThe keystores and truststores will be created in the directory secrets.\n\n### How to install the operator\n\nCreate a namespace for the operator:\n\n    kubectl create namespace flink-operator\n\nThe name of the namespace can be anything you like.\n\nCreate a namespace for executing Flink:\n\n    kubectl create namespace flink-jobs\n\nThe name of the namespace can be anything you like.\n\nCreate a secret which contains the keystore and truststore files (required for enabling SSL):\n\n    kubectl -n flink-operator create secret generic flink-operator-ssl \\\n        --from-file=keystore.jks=secrets/keystore-operator-api.jks \\\n        --from-file=truststore.jks=secrets/truststore-operator-api.jks \\\n        --from-literal=keystore-secret=keystore-password \\\n        
--from-literal=truststore-secret=truststore-password\n\nThe name of the secret can be anything you like.\n\nInstall the CRDs (Custom Resource Definitions):\n\n    helm install flink-controller-crd helm/flink-controller-crd\n\nInstall the required roles:\n\n    helm install flink-controller-roles helm/flink-controller-roles --namespace flink-operator --set targetNamespace=flink-jobs\n\nInstall the operator with SSL enabled:\n\n    helm install flink-controller-operator helm/flink-controller-operator --namespace flink-operator --set targetNamespace=flink-jobs --set secretName=flink-operator-ssl\n\nRemove \"--set secretName=flink-operator-ssl\" if you don't want to enable SSL.\n\nScale the operator up:\n\n    kubectl -n flink-operator scale deployment flink-operator --replicas=1\n\nIncrease the number of replicas to enable HA (High Availability).\n\nAlternatively, you can add the argument \"--set replicas=2\" when installing the operator with Helm.\n\n### How to uninstall the operator\n\nDelete all FlinkDeployment resources:\n\n    kubectl -n flink-jobs delete fd --all\n\nand wait until the resources are deleted.\n\nDelete all FlinkCluster resources:\n\n    kubectl -n flink-jobs delete fc --all\n\nand wait until the resources are deleted.\n\nDelete all FlinkJob resources:\n\n    kubectl -n flink-jobs delete fj --all\n\nand wait until the resources are deleted.\n\nStop the operator:\n\n    kubectl -n flink-operator scale deployment flink-operator --replicas=0\n\nRemove the operator:\n\n    helm uninstall flink-controller-operator --namespace flink-operator\n\nRemove the default roles:\n\n    helm uninstall flink-controller-roles --namespace flink-operator\n\nRemove the CRDs:\n\n    helm uninstall flink-controller-crd\n\nRemove the secret:\n\n    kubectl -n flink-operator delete secret flink-operator-ssl\n\nRemove the jobs namespace:\n\n    kubectl delete namespace flink-jobs\n\nRemove the operator namespace:\n\n    kubectl delete namespace flink-operator\n\nPlease note that 
Kubernetes is not able to remove all resources while finalizers are pending.\nThe operator and the supervisor are responsible for removing the finalizers but, in case of misconfiguration,\nthey might not be able to properly remove the finalizers. If you are in such a situation, you can always manually\nremove the finalizers to allow Kubernetes to delete all resources.\n\n### How to upgrade the operator to a new version\n\nPLEASE NOTE THAT THE OPERATOR IS STILL EXPERIMENTAL, THEREFORE EACH RELEASE MIGHT INTRODUCE BREAKING CHANGES.\n\nBefore upgrading to a new release, you must cancel all jobs, creating a savepoint in a durable storage location (for instance AWS S3).\n\nCreate a copy of your FlinkDeployment resources:\n\n    kubectl -n flink-jobs get fd -o yaml \u003e deployments-backup.yaml\n\nCreate a copy of your FlinkCluster resources:\n\n    kubectl -n flink-jobs get fc -o yaml \u003e clusters-backup.yaml\n\nCreate a copy of your FlinkJob resources:\n\n    kubectl -n flink-jobs get fj -o yaml \u003e jobs-backup.yaml\n\nUpgrade the roles:\n\n    helm upgrade flink-controller-roles --install helm/flink-controller-roles --namespace flink-operator --set targetNamespace=flink-jobs\n\nUpgrade the CRDs:\n\n    helm upgrade flink-controller-crd --install helm/flink-controller-crd\n\nAfter installing the new CRDs, you can recreate all the custom resources. However, the old resources might not be compatible\nwith the new CRDs. If that is the case, you have to fix the resource by editing the YAML file and then recreate the resource. 
\nIf you want to restore the latest savepoint of a job, copy the savepoint path from the backup into the new resource.\n\nFinally, upgrade and restart the operator:\n\n    helm upgrade flink-controller-operator --install helm/flink-controller-operator --namespace flink-operator --set targetNamespace=flink-jobs --set secretName=flink-operator-ssl --set replicas=1\n\n## User Manual\n\nSee the manual for detailed instructions on how to use Flink Controller:\n\nhttps://github.com/nextbreakpoint/flink-controller/blob/master/MANUAL.md\n\n\n## Contribute\n\nVisit the project on GitHub:\n\nhttps://github.com/nextbreakpoint/flink-controller\n\nReport an issue or request a feature:\n\nhttps://github.com/nextbreakpoint/flink-controller/issues\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnextbreakpoint%2Fflink-controller","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnextbreakpoint%2Fflink-controller","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnextbreakpoint%2Fflink-controller/lists"}