https://github.com/paladique/codespaces-etl-basic-demo
ETL with Jupyter Notebooks, Pandas, and Azure Cosmos DB
azure azure-cosmos-db codespaces data-engineering etl pandas
- Host: GitHub
- URL: https://github.com/paladique/codespaces-etl-basic-demo
- Owner: paladique
- License: mit
- Created: 2023-06-23T04:17:43.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-10-05T15:04:49.000Z (about 2 years ago)
- Last Synced: 2024-10-18T15:19:10.143Z (12 months ago)
- Topics: azure, azure-cosmos-db, codespaces, data-engineering, etl, pandas
- Language: Jupyter Notebook
- Homepage:
- Size: 376 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# ETL in GitHub Codespaces
[Open in GitHub Codespaces](https://codespaces.new/paladique/codespaces-etl-basic-demo)
[Sign up for Azure](https://azure.microsoft.com/free/?WT.mc_id=academic-99884-jasmineg)
🎓 Students get $100 of credits with Azure for Students! No credit card required: [Sign up](https://aka.ms/azure4students)
## Extract, Transform, and Load Operations with Python, Pandas, Jupyter Notebooks, and Azure Cosmos DB
This sample loads a CSV file into a Pandas DataFrame, filters the records to airports located in the United States, then converts the filtered data to JSON. A sample of the filtered data is then loaded into Azure Cosmos DB.
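The extract and transform steps described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the inline CSV rows, the column names (`ident`, `name`, `iso_country`), and the helper names `filter_us_airports` and `to_json_records` are all assumptions made for the example.

```python
import io
import json

import pandas as pd

# Hypothetical stand-in for the sample's CSV file.
# The rows and column names below are illustrative assumptions only.
SAMPLE_CSV = """ident,name,iso_country
KSEA,Seattle-Tacoma International Airport,US
EGLL,London Heathrow Airport,GB
KJFK,John F Kennedy International Airport,US
"""


def filter_us_airports(csv_source, country_column="iso_country"):
    """Extract: read the CSV into a DataFrame. Transform: keep only US rows."""
    df = pd.read_csv(csv_source)
    return df[df[country_column] == "US"].reset_index(drop=True)


def to_json_records(df):
    """Convert the DataFrame to a list of JSON-compatible dicts, one per row."""
    return json.loads(df.to_json(orient="records"))


us_airports = filter_us_airports(io.StringIO(SAMPLE_CSV))
records = to_json_records(us_airports)
print(json.dumps(records, indent=2))
```

With a real file, `filter_us_airports` could be given a path instead of the `StringIO` object, since `pd.read_csv` accepts either.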
### Instructions
Convert a filtered CSV file into JSON, then insert it into Azure Cosmos DB in minutes with GitHub Codespaces.

1. [Create a Cosmos DB NoSQL account](https://learn.microsoft.com/azure/cosmos-db/nosql/quickstart-portal?WT.mc_id=academic-99884-jasmineg). You can stop after creating the resource.
2. After the account is created in the Azure Portal, navigate to the resource (you can find it in your notifications).
3. To the right of the resource overview, select `Keys`, then locate the `URI` and `PRIMARY KEY` secrets.

4. Save the copied values as secrets in your [Codespaces settings](https://github.com/settings/codespaces).
**`URI` should be the `COSMOS_ENDPOINT` secret and `PRIMARY KEY` should be the `COSMOS_KEY` secret**
5. Run the Notebook
6. **[Clean up your Cosmos DB account resources after you're done!](https://learn.microsoft.com/en-us/cosmos-db/nosql/quickstart-portal?WT.mc_id=academic-99884-jasmineg#clean-up-resources)**
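
Codespaces secrets are exposed to the codespace as environment variables, so the `COSMOS_ENDPOINT` and `COSMOS_KEY` secrets from step 4 can be read with `os.environ`. The sketch below shows one way a notebook might check for them and build a client. The helper names `read_cosmos_settings` and `make_client` are hypothetical, and `make_client` assumes the `azure-cosmos` package is installed.

```python
import os


def read_cosmos_settings(env=os.environ):
    """Return (endpoint, key) from the Codespaces secrets, or raise if one is missing."""
    missing = [name for name in ("COSMOS_ENDPOINT", "COSMOS_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing Codespaces secrets: " + ", ".join(missing))
    return env["COSMOS_ENDPOINT"], env["COSMOS_KEY"]


def make_client(env=os.environ):
    """Build a Cosmos DB client from the secrets (hypothetical helper)."""
    # Imported here so the settings check works even without azure-cosmos installed.
    from azure.cosmos import CosmosClient

    endpoint, key = read_cosmos_settings(env)
    return CosmosClient(endpoint, credential=key)
```

Failing early with a clear message makes it easier to tell a missing or misnamed secret apart from a connection problem. Secrets added after a codespace has started may require restarting the codespace before they appear.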
## Learn More
- [GitHub Codespaces]()
- [Azure Cosmos DB](https://learn.microsoft.com/training/modules/explore-non-relational-data-stores-azure/?WT.mc_id=academic-99884-jasmineg)