An open API service indexing awesome lists of open source software.

https://github.com/oracle-quickstart/oci-arch-load-file-into-adw-python


https://github.com/oracle-quickstart/oci-arch-load-file-into-adw-python

Last synced: 8 months ago
JSON representation

Awesome Lists containing this project

README

          

# oci-arch-load-file-into-adw-python

Oracle Functions is a serverless, highly scalable, fully managed Functions-as-a-Service platform built on Oracle Cloud Infrastructure and powered by the open-source Fn Project engine. Developers can use Oracle Functions to write and deploy code that delivers business value without worrying about provisioning or managing the underlying infrastructure. Oracle Functions is container-native, with functions packaged as Docker container images.

This reference architecture shows how you can use a serverless function to automate the process of extracting data from files generated by various databases or applications and loading the data into a data warehouse for analysis.

The architecture diagram shows the data flow. When zipped CSV files are uploaded to a specific bucket in Oracle Cloud Infrastructure Object Storage, the Events service is triggered. The emitted event invokes a function, which extracts the data from the uploaded files and loads the data into an Oracle Autonomous Data Warehouse instance. After the data is loaded to the data warehouse, the processed CSV files are moved to a different bucket in Oracle Cloud Infrastructure Object Storage.

For details of the architecture, see [_Automate loading data to a data warehouse using a serverless application_](https://docs.oracle.com/en/solutions/serverless-dataload-adw/index.html)

## Prerequisites

- Permission to `manage` the following types of resources in your Oracle Cloud Infrastructure tenancy: `vcns`, `internet-gateways`, `route-tables`, `security-lists`, `subnets`, `autonomous-database-family`, and `functionss`.

- Quota to create the following resources: 1 VCN, 1 subnets, 1 Internet Gateway, 1 route rules, 1 function, and ADW instance.

If you don't have the required permissions and quota, contact your tenancy administrator. See [Policy Reference](https://docs.cloud.oracle.com/en-us/iaas/Content/Identity/Reference/policyreference.htm), [Service Limits](https://docs.cloud.oracle.com/en-us/iaas/Content/General/Concepts/servicelimits.htm), [Compartment Quotas](https://docs.cloud.oracle.com/iaas/Content/General/Concepts/resourcequotas.htm).

## Deploy Using Oracle Resource Manager

1. Click [![Deploy to Oracle Cloud](https://oci-resourcemanager-plugin.plugins.oci.oraclecloud.com/latest/deploy-to-oracle-cloud.svg)](https://cloud.oracle.com/resourcemanager/stacks/create?region=home&zipUrl=https://github.com/oracle-quickstart/oci-arch-load-file-into-adw-python/releases/latest/download/oci-arch-load-file-into-adw-python-stack-latest.zip)

If you aren't already signed in, when prompted, enter the tenancy and user credentials.

2. Review and accept the terms and conditions.

3. Select the region where you want to deploy the stack.

4. Follow the on-screen prompts and instructions to create the stack.

5. After creating the stack, click **Terraform Actions**, and select **Plan**.

6. Wait for the job to be completed, and review the plan.

To make any changes, return to the Stack Details page, click **Edit Stack**, and make the required changes. Then, run the **Plan** action again.

7. If no further changes are necessary, return to the Stack Details page, click **Terraform Actions**, and select **Apply**.

## Deploy Using the Terraform CLI

### Clone the Module
Now, you'll want a local copy of this repo. You can make that with the commands:

git clone https://github.com/oracle-quickstart/oci-arch-load-file-into-adw-python.git
cd oci-arch-load-file-into-adw-python
ls

### Prerequisites
First off, you'll need to do some pre-deploy setup for Docker and Fn Project inside your machine:

```
sudo su -
yum update
yum install yum-utils
yum-config-manager --enable *addons
yum install docker-engine
groupadd docker
service docker restart
usermod -a -G docker opc
chmod 666 /var/run/docker.sock
exit
curl -LSs https://raw.githubusercontent.com/fnproject/cli/master/install | sh
exit
```

### Set Up and Configure Terraform

1. Complete the prerequisites described [here](https://github.com/cloud-partners/oci-prerequisites).

2. Create a `terraform.tfvars` file, and specify the following variables:

```
# Authentication
tenancy_ocid = ""
user_ocid = ""
fingerprint = ""
private_key_path = ""

# Region
region = ""

# Compartment
compartment_ocid = ""

# dbpwd-cipher
dbpwd-cipher = ""

# OCIR
ocir_user_name = ""
ocir_user_password = ""
````

### Create the Resources
Run the following commands:

terraform init
terraform plan
terraform apply

### Test the stack

You can test the stack by uploading CSV files into input-bucket bucket (files can be found in csv_files subdirectory of this repo). If event emission will be successful, the function should be executed and all CVS files should be analyzed and moved to the processed-bucket bucket. Ultimate outcome should be also visible in the ADW instance. Below you can find example outputs from the automation in a form of the command to be executed in the bash shell. This command will query ADW instance and database table date will be printed out:

```
Outputs:

ADW_query_URL_for_JSON_formatted_with_python = curl -X POST -u 'ADMIN:BEstrO0ng_#11' -H "Content-Type: application/json" --data '{}' https://BFGORDHAP4HZNLE-ADWDB.adb.us-ashburn-1.oraclecloudapps.com/ords/admin/soda/latest/regionsnumbers?action=query | python -m json.tool

$ curl -X POST -u 'ADMIN:BEstrO0ng_#11' -H "Content-Type: application/json" --data '{}' https://BFGORDHAP4HZNLE-ADWDB.adb.us-ashburn-1.oraclecloudapps.com/ords/admin/soda/latest/regionsnumbers?action=query | python -m json.tool
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 1682 0 1680 100 2 1044 1 0:00:02 0:00:01 0:00:01 1045
{
"count": 4,
"hasMore": false,
"items": [
{
"created": "2021-05-13T15:32:12.377319000Z",
"etag": "1DB1E870628E4890BCB02992C01E5F8C",
"id": "A55A0BC8565745829880D9C1B5DBE6A4",
"lastModified": "2021-05-13T15:32:12.377319000Z",
"links": [
{
"href": "https://bfgordhap4hznle-adwdb.adb.us-ashburn-1.oraclecloudapps.com:443/ords/admin/soda/latest/regionsnumbers/A55A0BC8565745829880D9C1B5DBE6A4",
"rel": "self"
}
],
"value": {
"col1": "1",
"col2": "2",
"col3": "3",
"region": "AMER"
}
},
{
"created": "2021-05-13T15:32:21.636972000Z",
"etag": "49915FAFE75041A890BC1B79A0674A16",
"id": "D18048C5D68F44E6A9008113D71
(...)
````

### Destroy the Deployment
When you no longer need the deployment, you can run this command to destroy the resources:

terraform destroy

## Architecture Diagram

![](./images/oci-arch-load-file-into-adw-python.png)