Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/aws-samples/query-data-in-s3-with-amazon-athena-and-aws-sdk-for-dotnet
This project provides a sample implementation that shows how to leverage Amazon Athena from a .NET Core application, using the AWS SDK for .NET to run standard SQL and analyze large amounts of data in Amazon S3.
- Host: GitHub
- URL: https://github.com/aws-samples/query-data-in-s3-with-amazon-athena-and-aws-sdk-for-dotnet
- Owner: aws-samples
- License: mit-0
- Created: 2020-11-06T20:30:04.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2023-10-05T05:55:26.000Z (over 1 year ago)
- Last Synced: 2024-08-03T22:17:18.321Z (6 months ago)
- Topics: amazon-athena, aws-sdk, dotnet, dotnet-core, s3-bucket
- Language: C#
- Homepage:
- Size: 5.65 MB
- Stars: 23
- Watchers: 10
- Forks: 8
- Open Issues: 14
Metadata Files:
- Readme: readme.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Codeowners: .github/CODEOWNERS
Awesome Lists containing this project
- jimsghstars - aws-samples/query-data-in-s3-with-amazon-athena-and-aws-sdk-for-dotnet - This Project provides a sample implementation that will show how to leverage Amazon Athena from .NET Core Application using AWS SDK for .NET to run standard SQL to analyze a large amount of data in Am (C#)
README
# How to use SQL to query data in S3 Bucket with Amazon Athena and AWS SDK for .NET
This project provides a sample implementation that shows how to leverage [Amazon Athena](https://aws.amazon.com/athena/) from a .NET Core application, using the [AWS SDK for .NET](https://docs.aws.amazon.com/sdk-for-net/v3/developer-guide/welcome.html) to run standard SQL and analyze large amounts of data in [Amazon S3](https://aws.amazon.com/s3/).
To showcase a more realistic use case, the project includes a WebApp UI developed with [ReactJS](https://reactjs.org/). The WebApp contains components that fetch COVID-19 data from an API server, which uses the AWS SDK for .NET to connect to Amazon Athena and run standard SQL queries against datasets stored as Amazon S3 files in a Data Lake account. That Data Lake account is [aws-covid19-lake](https://registry.opendata.aws/aws-covid19-lake/), made available through the [Registry of Open Data on AWS](https://registry.opendata.aws/). The ReactJS components call a .NET Core API that runs an Amazon Athena query, checks the execution status, and lists the results (a minimal code sketch of this flow follows the menu list below). Each menu option presents a different view.
**Menu option _Testing By Date_**: Shows a filter by Date that presents a table with the following data: Date, State, Positive, Negative, Pending, Hospitalized, Death, Positive Increase
**Menu option _Testing By State_**: Shows a filter by State that presents a table with the following data: Date, State, Positive, Negative, Pending, Hospitalized, Death, Positive Increase
**Menu option _Hospitals (Run&Go)_**: Runs a request to the API server, gets a 200 response with the query ID, and checks the execution status; when the execution is completed, it presents a table with the following data: Name, State, Type, Zip Code, Licensed Beds, Staffed Beds, Potential Increase in Beds
**Menu option _Hospitals (Run&Go)_**: Runs a request to the API server, waits for the result, and presents a table with the following data: Name, State, Type, Zip Code, Licensed Beds, Staffed Beds, Potential Increase in Beds
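
To make that query flow concrete, below is a minimal sketch of how a .NET service can drive Athena with the AWS SDK for .NET (the `AWSSDK.Athena` package): start a query, poll its execution status, and read the results. The class name, query, and variable names here are illustrative assumptions for this walkthrough, not the sample project's actual code.

```csharp
using System;
using System.Threading.Tasks;
using Amazon.Athena;
using Amazon.Athena.Model;

public static class AthenaQueryFlowSketch
{
    public static async Task RunAsync()
    {
        // Region and credentials are resolved from the environment (see the .env file in step 4).
        using var athena = new AmazonAthenaClient();

        // 1) Start the query; Athena writes its results to the S3 bucket created in step 2.
        var start = await athena.StartQueryExecutionAsync(new StartQueryExecutionRequest
        {
            QueryString = "SELECT * FROM \"covid-19\".\"covid_testing_states_daily\" LIMIT 10",
            QueryExecutionContext = new QueryExecutionContext { Database = "covid-19" },
            ResultConfiguration = new ResultConfiguration
            {
                OutputLocation = Environment.GetEnvironmentVariable("S3_RESULT_BUCKET_NAME")
            }
        });

        // 2) Poll the execution status until the query leaves the QUEUED/RUNNING states.
        QueryExecutionState state;
        do
        {
            await Task.Delay(TimeSpan.FromSeconds(1));
            var status = await athena.GetQueryExecutionAsync(new GetQueryExecutionRequest
            {
                QueryExecutionId = start.QueryExecutionId
            });
            state = status.QueryExecution.Status.State;
        } while (state == QueryExecutionState.QUEUED || state == QueryExecutionState.RUNNING);

        // 3) Read the result rows (the first row contains the column headers).
        if (state == QueryExecutionState.SUCCEEDED)
        {
            var results = await athena.GetQueryResultsAsync(new GetQueryResultsRequest
            {
                QueryExecutionId = start.QueryExecutionId
            });
            foreach (var row in results.ResultSet.Rows)
            {
                Console.WriteLine(string.Join(", ", row.Data.ConvertAll(d => d.VarCharValue)));
            }
        }
    }
}
```

This mirrors the _Run&Go_ behaviour described above: start the query, receive its ID, poll the status, then list the results.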
# Steps
To run this project, follow the instructions below:
## 1) Deploy Glue Catalog & Athena Database/Tables
```bash
#1) Deploy
aws cloudformation create-stack --stack-name covid-lake-stack --template-url https://covid19-lake.s3.us-east-2.amazonaws.com/cfn/CovidLakeStack.template.json --region us-west-2

#2) Check deployment status
aws cloudformation describe-stacks --stack-name covid-lake-stack --region us-west-2
```

Check the deployment status and wait for **"StackStatus": "CREATE_COMPLETE"**.
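
Tip: instead of polling `describe-stacks` by hand, you can let the AWS CLI block until creation finishes with its built-in waiter, e.g. `aws cloudformation wait stack-create-complete --stack-name covid-lake-stack --region us-west-2` (a standard CLI feature, not something added by this sample); the same approach works for the `athena-results-netcore` stack in step 2. When the stack is ready, the `describe-stacks` output looks like the following: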
```json
{
"Stacks": [
{
"StackId": "arn:aws:cloudformation:us-west-2:XXXXXXXX9152:stack/covid-lake-stack/xxxxxxxx-100d-11eb-87ef-xxxxxxxxxxx",
"StackName": "covid-lake-stack",
"CreationTime": "2020-10-17T00:12:09.151Z",
"RollbackConfiguration": {},
"StackStatus": "CREATE_COMPLETE",
"DisableRollback": false,
"NotificationARNs": [],
"Tags": [],
"EnableTerminationProtection": false,
"DriftInformation": {
"StackDriftStatus": "NOT_CHECKED"
}
}
]
}
```

## 2) Create S3 bucket for Athena results
```bash
#1) Deploy S3 Bucket
aws cloudformation deploy --stack-name athena-results-netcore --template-file ./src/cloud-formation-templates/s3-athena-result.template.yaml --region us-west-2

#2) Check deployment status
aws cloudformation describe-stacks --stack-name athena-results-netcore --region us-west-2
```

Check the deployment status and wait for **"StackStatus": "CREATE_COMPLETE"**, then copy the bucket name from the stack output, **"OutputValue": "s3://athena-results-netcore-s3bucket-xxxxxxxxxxxx/athena/results/"**; you will need it to run the code:
```json
{
"Stacks": [
{
"StackId": "arn:aws:cloudformation:us-west-2:XXXXXXXX9152:stack/athena-results-netcore/xxxxxxxx-100c-11eb-889f-xxxxxxxxxxx",
"StackName": "athena-results-netcore",
"Description": "Amazon S3 bucket to store Athena query results",
"CreationTime": "2020-10-17T00:02:44.968Z",
"LastUpdatedTime": "2020-10-17T00:21:13.692Z",
"RollbackConfiguration": {
"RollbackTriggers": []
},
"StackStatus": "CREATE_COMPLETE",
"DisableRollback": false,
"NotificationARNs": [],
"Outputs": [
{
"OutputKey": "BucketName",
"OutputValue": "s3://athena-results-netcore-s3bucket-xxxxxxxxxxxx/athena/results/",
"Description": "Name of the Amazon S3 bucket to store Athena query results"
}
],
"Tags": [],
"EnableTerminationProtection": false,
"DriftInformation": {
"StackDriftStatus": "NOT_CHECKED"
}
}
]
}
```

## 3) COVID-19 Analysis (optional)
Here are some SQL queries that you can try on your own in the [Amazon Athena Console UI](https://us-west-2.console.aws.amazon.com/athena/home?region=us-west-2#query/). This step is optional for this demo, but it helps you explore and learn more about Amazon Athena through the console UI.
```sql
-- The following query returns the growth of confirmed cases for the past 7 days joined side-by-side with hospital bed availability, broken down by US county:
SELECT
cases.fips,
admin2 as county,
province_state,
confirmed,
growth_count,
sum(num_licensed_beds) as num_licensed_beds,
sum(num_staffed_beds) as num_staffed_beds,
sum(num_icu_beds) as num_icu_beds
FROM
"covid-19"."hospital_beds" beds,
( SELECT
fips,
admin2,
province_state,
confirmed,
last_value(confirmed) over (partition by fips order by last_update) - first_value(confirmed) over (partition by fips order by last_update) as growth_count,
first_value(last_update) over (partition by fips order by last_update desc) as most_recent,
last_update
FROM
"covid-19"."enigma_jhu"
WHERE
from_iso8601_timestamp(last_update) > now() - interval '200' day AND country_region = 'US') cases
WHERE
beds.fips = cases.fips AND last_update = most_recent
GROUP BY cases.fips, confirmed, growth_count, admin2, province_state
ORDER BY growth_count desc;

-- Last 10 records regarding Testing and deaths
SELECT * FROM "covid-19"."world_cases_deaths_testing" order by "date" desc limit 10;-- Last 10 records regarding Testing and deaths with JOIN on us_state_abbreviations to list State name
SELECT
date,
positive,
negative,
pending,
hospitalized,
death,
total,
deathincrease,
hospitalizedincrease,
negativeincrease,
positiveincrease,
sta.state AS state_abbreviation,
abb.state
FROM "covid-19"."covid_testing_states_daily" sta
JOIN "covid-19"."us_state_abbreviations" abb ON sta.state = abb.abbreviation
limit 500;
```

## 4) Build & Run .NET Web Application
1) Go to the app root dir
```bash
cd ./src/app/AthenaNetCore/
```

2) Create the AWS credentials file. **_As a security precaution, the `*.env` file extension is added to `.gitignore` to avoid accidentally committing credentials._**
```bash
code aws-credentials-do-not-commit.env # You can use any text editor, e.g. vi: vi aws-credentials-do-not-commit.env
```

Below is an example of the env file content. Replace the `XXXX...` values with your real AWS credentials, and set `S3_RESULT_BUCKET_NAME` to the output value you copied in step 2:
```ini
AWS_DEFAULT_REGION=us-west-2
AWS_ACCESS_KEY_ID=XXXXXXXXXXXXXXXXXXXX
AWS_SECRET_ACCESS_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
AWS_SESSION_TOKEN=XXXXX #(Optional; only needed for temporary credentials. Remove this comment in the .env file.)
S3_RESULT_BUCKET_NAME=s3://athena-results-netcore-s3bucket-xxxxxxxxxxxx/athena/results/ #(Paste the bucket name you copied in step 2. Remove this comment in the .env file.)
```
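
For context on how these variables reach the application: the AWS SDK for .NET resolves credentials automatically from `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and (if present) `AWS_SESSION_TOKEN` through its default credential chain, so typically only the custom `S3_RESULT_BUCKET_NAME` value has to be read explicitly and handed to Athena as the query result location. A minimal sketch, assuming the environment variables above are already loaded (any other names are illustrative):

```csharp
// Sketch: the AWS_* variables are picked up by the SDK's default provider chain;
// only the custom result-bucket variable needs explicit handling in application code.
using System;
using Amazon.Athena;
using Amazon.Athena.Model;

var resultLocation = Environment.GetEnvironmentVariable("S3_RESULT_BUCKET_NAME")
                     ?? throw new InvalidOperationException("S3_RESULT_BUCKET_NAME is not set");

using var athena = new AmazonAthenaClient(); // credentials come from the AWS_* environment variables

// The bucket path is passed to every query as its output location.
var startRequest = new StartQueryExecutionRequest
{
    QueryString = "SELECT 1", // placeholder; the API server runs the COVID-19 queries shown in step 3
    ResultConfiguration = new ResultConfiguration { OutputLocation = resultLocation }
};
Console.WriteLine($"Athena query results will be written to {resultLocation}");
```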
3) Build .NET APP using docker-compose
```bash
docker-compose -f ./docker-compose.yml build
```

4) Run .NET APP using docker-compose
```bash
docker-compose -f ./docker-compose.yml up
```

5) Test .NET APP via URL
6) Clean up
```bash
# 1) Clean local resources
docker-compose down -v

# 2) Clean up the S3 objects created by Athena to store result metadata
aws s3 rm --recursive s3://athena-results-netcore-s3bucket-xxxxxxxxxxxx/athena/results/

# 3) Delete the S3 bucket
aws cloudformation delete-stack --stack-name athena-results-netcore --region us-west-2

# 4) Delete the Athena tables
aws cloudformation delete-stack --stack-name covid-lake-stack --region us-west-2
```
# References