https://github.com/vincentclaes/serverless_data_pipeline_example
  
  
    Build and Deploy A Serverless Data Pipeline on AWS  
- Host: GitHub
 - URL: https://github.com/vincentclaes/serverless_data_pipeline_example
 - Owner: vincentclaes
 - Created: 2019-07-03T12:26:14.000Z (over 6 years ago)
 - Default Branch: master
 - Last Pushed: 2022-12-08T06:11:38.000Z (almost 3 years ago)
 - Last Synced: 2024-08-14T07:09:20.544Z (about 1 year ago)
 - Topics: aws, aws-glue, aws-lambda, aws-s3, python, serverless-framework
 - Language: Python
 - Homepage: https://medium.com/@vincentclaes_43752/build-a-serverless-data-pipeline-on-aws-7c7d498d9707
 - Size: 466 KB
 - Stars: 27
 - Watchers: 3
 - Forks: 13
 - Open Issues: 8
 - Metadata files: README.md
Awesome Lists containing this project
- jimsghstars - vincentclaes/serverless_data_pipeline_example - Build and Deploy A Serverless Data Pipeline on AWS (Python)
 
README
# SERVERLESS-DATA-PIPELINE
Using the AWS services Lambda, S3, Glue, and Athena, we are going to build a data pipeline written in Python and deploy it with the Serverless Framework.
You can read the article here: 
https://medium.com/@vincentclaes_43752/build-a-serverless-data-pipeline-on-aws-7c7d498d9707
# Deploy and run the data pipeline
Make sure you have the correct user and roles defined:
### Create a user to access AWS
Create an admin user in the AWS console and store its credentials under a `[serverless]` section in the credentials file located at
    
    ~/.aws/credentials
A step-by-step guide can be found here: https://medium.com/@vincentclaes_43752/create-a-user-for-the-serverless-framework-8e5c336d47c7
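
Before deploying, it can help to confirm that the `[serverless]` section really exists and carries both key fields. The following is a minimal sketch, not part of this repository: the helper name `profile_has_keys` is hypothetical, and it assumes the standard `aws_access_key_id` / `aws_secret_access_key` field names used by the AWS credentials file.

```python
import configparser
from pathlib import Path


def profile_has_keys(credentials_text: str, profile: str = "serverless") -> bool:
    """Return True if `profile` exists and defines both access-key fields.

    `profile_has_keys` is a hypothetical helper for illustration only.
    """
    # Disable interpolation so a '%' inside a value is never interpreted.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(credentials_text)
    if not parser.has_section(profile):
        return False
    required = ("aws_access_key_id", "aws_secret_access_key")
    return all(parser.get(profile, key, fallback="").strip() for key in required)


if __name__ == "__main__":
    # Read the real credentials file if it exists; otherwise check empty text.
    path = Path.home() / ".aws" / "credentials"
    text = path.read_text() if path.exists() else ""
    print("serverless profile ready:", profile_has_keys(text))
```

The check only inspects the file's structure. It does not contact AWS, so it cannot tell whether the keys are valid or whether the user has admin rights.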
### Create the necessary roles
Our project needs two roles:
* one for Lambda
* one for Glue

Create both roles with administrator access and keep their ARNs somewhere accessible.
A step-by-step guide to creating a role can be found here: https://medium.com/@vincentclaes_43752/create-a-role-on-aws-for-the-serverless-framework-for-any-resource-c49712a5eee0
In the serverless.yml file, replace the ARN of the Lambda role and the ARN of the Glue role with the ARNs of the roles you just created.
A comment above each role in that file indicates where the ARN should be replaced.
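
A mistyped ARN in serverless.yml usually only surfaces at deploy time, so a quick format check before pasting can save a round trip. This is a rough sketch under stated assumptions: the helper `is_role_arn` is hypothetical, the account ID and role names in the demo are made up, and the pattern only checks the general shape of an IAM role ARN (`arn:<partition>:iam::<12-digit account>:role/<name>`), not whether the role exists.

```python
import re

# General shape of an IAM role ARN; the role part may include a path.
ROLE_ARN = re.compile(
    r"arn:aws[a-z-]*:iam::\d{12}:role/[A-Za-z0-9_+=,.@/-]+"
)


def is_role_arn(value: str) -> bool:
    """Return True if `value` looks like an IAM role ARN (hypothetical helper)."""
    return ROLE_ARN.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    for candidate in (
        "arn:aws:iam::123456789012:role/my-lambda-role",
        "arn:aws:iam::123456789012:user/admin",
    ):
        print(candidate, "->", is_role_arn(candidate))
```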
### Configure your local environment
    git clone https://github.com/vincentclaes/serverless_data_pipeline_example.git
    cd serverless_data_pipeline_example
From the root of this project, execute:
    export AWS_PROFILE="serverless"
    pip install awscli
    sudo npm install -g serverless
    npm install serverless-s3-remover
    
To deploy the data pipeline, execute:
    # replace the `unique-identifier` with something unique
    sls deploy --stage unique-identifier
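
If you have no particular stage name in mind, one way to get a unique one is to append a short random suffix to a readable prefix. The sketch below is only an illustration: `make_stage` is a hypothetical helper, and restricting the name to lowercase letters, digits, and hyphens is a conservative assumption in case the stage ends up in resource names such as S3 buckets.

```python
import re
import uuid


def make_stage(prefix: str = "dev") -> str:
    """Build a stage name such as 'dev-1a2b3c4d' (hypothetical helper)."""
    suffix = uuid.uuid4().hex[:8]  # 8 random hex characters
    stage = f"{prefix}-{suffix}".lower()
    # Conservative assumption: lowercase letters, digits, and hyphens only.
    if not re.fullmatch(r"[a-z0-9][a-z0-9-]*", stage):
        raise ValueError(f"stage name contains unsupported characters: {stage!r}")
    return stage


if __name__ == "__main__":
    print(make_stage())
```

Note the generated value down: the same stage name is needed again when removing the pipeline.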
    
To remove the data pipeline, execute:
    # use the same `unique-identifier` that you deployed with
    sls remove --stage unique-identifier
    
For more context, please refer to the article: https://medium.com/@vincentclaes_43752/build-a-serverless-data-pipeline-on-aws-7c7d498d9707