Detailed tutorial for installing Apache Airflow on IBM Cloud
https://github.com/KissConsult/Apache-Airflow


# Get Apache Airflow on IBM Cloud

We will deploy Apache Airflow on an IBM Cloud Kubernetes cluster.

* Prerequisites:
  * You need an IBM Cloud account. If you do not have one, you can [register here].

1. Provision a new Kubernetes cluster. If you already have one, skip to step **2**.
2. Deploy the IBM Cloud Block Storage plug-in. If you already have it, skip to step **3**.
3. Deploy Apache Airflow.

## Step 1: Provision a new Kubernetes cluster

* Click the **Catalog** button on the top
* Select **Service** from the left in the catalog
* Search for **Kubernetes Service** and click on it
![Kubernetes](/kubernetes-select.png)
* At the Kubernetes deployment page, we will specify our deployment details
* Choose a plan, **standard** or **free**. The free plan has only one worker node and no subnet. To provision a standard cluster, you will need to upgrade your account to Pay-As-You-Go
* To upgrade to a Pay-As-You-Go account, complete the following steps:
  * In the console, go to Manage > Account.
  * Select Account settings, and click Add credit card.
  * Enter your payment information, click Next, and submit your information.
* Choose **classic** or **VPC**. Read the [docs] and choose the type that suits you best
![VPC](/infra-select.png)
* Decide on your deployment's location parameters. For more information, visit [Locations]
* Choose **Geography** (continent)
![continent](/location-geo.png)
* Choose **Single** or **Multizone**. In a single zone your data is kept in only one datacenter; with Multizone your data is kept on multiple sites for more security
![avail](/location-avail.png)
* Choose a **Worker Zone** if using a single zone, or a **Metro** if using Multizone
![worker](/location-worker.png)
* If you wish to use Multizone, set up your account with [VRF] or [enable Vlan spanning]
* If no Virtual LAN (VLAN) is available at your selected location, a new VLAN will be created for you

* Choose a **Worker node setup** or use the preselected one, then set the **Worker node amount per zone**
![worker-pool](/worker-pool.png)
* Choose a **Master Service Endpoint**. In VRF-enabled accounts, you can choose private-only to make your master accessible on the private network or via VPN tunnel. Choose public-only to make your master publicly accessible. When you have a VRF-enabled account, your cluster is set up by default to use both private and public endpoints. For more information, visit [endpoints].
![endpoints](/endpoints.png)
* Give your cluster a **name**

![name-new](/name-new.png)
* Add any desired **tags** to your cluster. For more information, visit [tags]

![tags-new](/tasg-new.png)
* Click **create**
![create-new](/create-new.png)

* Wait for your cluster to be provisioned
![cluster-prepare](/cluster-prepare.png)
* Your cluster is ready for usage

![cluster-ready](/cluster-done.png)
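
Readiness can also be checked from a terminal. The following is a minimal sketch, assuming `kubectl` is already configured for the new cluster (for example, in the web terminal used in the verification section of this tutorial). The function name `count_not_ready_nodes` is hypothetical and not part of any IBM Cloud tooling.

```sh
# Count worker nodes whose STATUS column is not exactly "Ready".
# Assumption: kubectl already points at your cluster.
count_not_ready_nodes() {
  kubectl get nodes --no-headers |
    awk '$2 != "Ready" { n++ } END { print n + 0 }'
}

# Usage (uncomment to run against your cluster):
# count_not_ready_nodes
```

A result of `0` suggests that every worker node has reported in as ready. Nodes that are cordoned (for example `Ready,SchedulingDisabled`) are counted as not ready by this sketch.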

## Step 2: Deploy the IBM Cloud Block Storage plug-in
The Block Storage plug-in provides persistent, high-performance iSCSI storage that you can add to your apps by using Kubernetes Persistent Volumes (PVs).

* Click the **Catalog** button on the top
* Select **Software** from the catalog
* Search for **IBM Cloud Block Storage plug-in** and click on it
![Block](/block-search.png)

* On the application page, click the _dot_ next to the cluster you wish to use
* Click on **Enter or Select Namespace** and choose the default Namespace or use a custom one (if you get an error, wait 30 minutes for the cluster to finalize)
![block-c](/block-cluster.png)
* Give a **name** to this workspace
* Click **install** and wait for the deployment
![block-create](/block-storage-create.png)
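
Once the installation finishes, one way to confirm that the plug-in is in place is to look for its storage classes. This is a minimal sketch, assuming `kubectl` is configured for your cluster and assuming the plug-in registers storage classes whose names start with `ibmc-block`. Both the prefix and the function name `list_block_storage_classes` are assumptions made for illustration.

```sh
# Print the names of storage classes that start with "ibmc-block".
# Assumption: the Block Storage plug-in registers classes with this prefix.
list_block_storage_classes() {
  kubectl get storageclass --no-headers |
    awk '$1 ~ /^ibmc-block/ { print $1 }'
}

# Usage (uncomment to run against your cluster):
# list_block_storage_classes
```

If nothing is printed, the plug-in may still be deploying, or the classes may be named differently in your environment.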

## Step 3: Deploy Apache Airflow

In this step we will deploy Apache Airflow on our cluster.

* Click the **Catalog** button on the top
* Select **Software** from the left in the catalog
* Search for **Apache Airflow** and click on it
![Search](/search.png)

* On the application page, click the _dot_ next to the cluster we just created, or use an existing one
![Cluster](/cluster-select.png)
* Click on **Enter or Select Namespace** and choose one of the default Namespaces or use a custom one
![Namespace](/details-namespace.png)
* Give a unique **name** to your workspace

![Name](/details-name.png)
* Select the resource group you want to use. It is used for access control and billing purposes. For more information, visit [resource groups]

![apache-resource](/details-resource.png)

* Here you can give **tags** to your Apache Airflow workspace, which will affect your deployment. For more information, visit [tags]

![apache-tags](/details-tags.png)

* Click on **Parameters with default values**. You can set deployment values or use the default ones

![def-val](/parameters.png)

* Please **tick** the box next to the agreements and click **install**

![Install](/aggreement-create.png)

* Your Apache Airflow workspace will start installing. Wait a couple of minutes for the deployment to finish

![airflow-progress](/in-progress.png)

* Your Apache Airflow workspace has been successfully deployed

![airflow-finished](/airflow-done.png)

## Verify Apache Airflow installation

* Go to [Resources] in your browser
* Click on **Clusters**
* Click on your Cluster
![Resourcelect](/resource-select.png)

* You are now at your cluster's overview. Click on **Actions** on the top right, then click on **Web terminal** from the dropdown menu

![Actions](/cluster-main.png)

* Click **install**, then wait a couple of minutes

![terminal-install](/terminal-install.jpg)

* Click on **Actions**
* Click **Web terminal**; a terminal will open

* Type the following commands in the terminal, replacing NAMESPACE with the namespace you chose during the deployment setup:

```sh
$ kubectl get ns
```
![get-ns](/get-ns.png)

```sh
$ kubectl get pod -n NAMESPACE -o wide
```
![get-pod](/get-pod.png)

```sh
$ kubectl get service -n NAMESPACE
```
![get-service](/get-service.png)

* Your running Apache Airflow services will be visible
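
To spot pods that did not start cleanly without reading the whole table, the pod listing can be filtered. This is a minimal sketch, assuming `kubectl` is available in the web terminal as in the commands above. The function name `pods_not_running` is hypothetical.

```sh
# Print the names of pods in the given namespace whose STATUS is
# neither "Running" nor "Completed". Prints nothing when all are healthy.
pods_not_running() {
  kubectl get pod -n "$1" --no-headers |
    awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}

# Usage (replace NAMESPACE with the namespace chosen at deployment):
# pods_not_running NAMESPACE
```

An empty result suggests that every Airflow pod in the namespace is either running or has finished its job.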

You have successfully deployed Apache Airflow on IBM Cloud!

[IBM Cloud]:
[Register Here]:
[guide]:
[here]:
[resource groups]:
[tags]:
[Resources]:
[Locations]:
[VRF]:
[enable Vlan spanning]:
[endpoints]:
[docs]: