Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/plishka/etl
Scroll Points Analysis
api blockchain dune-analytics python scripts scroll sql tableau
Last synced: 4 days ago
Scroll Points Analysis
- Host: GitHub
- URL: https://github.com/plishka/etl
- Owner: Plishka
- Created: 2024-07-29T12:16:31.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-08-08T10:19:49.000Z (5 months ago)
- Last Synced: 2024-11-08T14:17:21.745Z (about 2 months ago)
- Topics: api, blockchain, dune-analytics, python, scripts, scroll, sql, tableau
- Language: Jupyter Notebook
- Homepage:
- Size: 354 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# ETL Project: Scroll Points Analysis
## Overview
This project extracts wallet transaction data from the Scroll blockchain via Dune Analytics, transforms it by fetching additional points information from an AWS API, and loads the cleaned, enriched data into Tableau for visualization and analysis.

## Process
### Visualization of the ETL Process
![ETL Process](https://github.com/Plishka/ETL/blob/main/ETL_scema.png)
### 1. Extract
- **Data Source**: Retrieved wallet addresses from Scroll blockchain transaction data using Dune Analytics.
```sql
SELECT
"from" as wallet
FROM scroll.transactions
GROUP BY
"from"
ORDER BY
sum(gas_used) DESC -- To ensure wallets with more points go first
```
- **API Endpoint**: Set up a custom API endpoint in Dune Analytics.
- **Data Retrieval**: Developed a [Python script](https://github.com/Plishka/ETL/blob/main/Scroll%20Wallets%20fetch%20from%20Dune%20API.ipynb) to fetch wallet data from the Dune API and save it into a CSV file.
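The notebook itself is linked above; a minimal stdlib-only sketch of that fetch-and-save step might look like the following. The endpoint shape and `X-Dune-API-Key` header follow Dune's public API, and the `wallet` column matches the SQL above; the query ID and file names are placeholders:

```python
import csv
import json
import urllib.request

DUNE_API_URL = "https://api.dune.com/api/v1/query/{query_id}/results"

def extract_wallets(payload: dict) -> list:
    """Pull the 'wallet' column out of a Dune results payload."""
    rows = payload.get("result", {}).get("rows", [])
    return [row["wallet"] for row in rows if row.get("wallet")]

def fetch_wallets(api_key: str, query_id: int) -> list:
    """Fetch query results from the Dune API and return wallet addresses."""
    req = urllib.request.Request(
        DUNE_API_URL.format(query_id=query_id),
        headers={"X-Dune-API-Key": api_key},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        payload = json.load(resp)
    return extract_wallets(payload)

def save_wallets(wallets: list, path: str) -> None:
    """Write one wallet address per row to a CSV file."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["wallet"])
        writer.writerows([w] for w in wallets)

if __name__ == "__main__":
    wallets = fetch_wallets(api_key="YOUR_DUNE_API_KEY", query_id=1234567)
    save_wallets(wallets, "scroll_wallets.csv")
```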
- **Parallel Data Fetching**: Developed an [advanced Python script](https://github.com/Plishka/ETL/blob/main/Scroll%20Marks%20fetch%20from%20AWS%20API.ipynb) that uses the wallet addresses from the CSV file to fetch additional data from an AWS API in parallel requests, significantly speeding up data retrieval.

### 2. Transform
- **Data Cleaning**: Cleaned the fetched data to handle inconsistencies, null values, and errors.
- **Data Transformation**: Transformed the data by rounding numerical values, managing API rate limits, and ensuring all data was correctly formatted and complete.

### 3. Load
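As a rough sketch of the cleaning and rounding described in the Transform step (the field name `points` and the row shape are assumptions for illustration, not taken from the actual API):

```python
def clean_rows(rows, points_key="points", decimals=2):
    """Drop rows with missing or malformed points and round numeric values.

    `rows` is a list of dicts as fetched from the API; the key name
    'points' is illustrative -- adjust it to the real API field.
    """
    cleaned = []
    for row in rows:
        value = row.get(points_key)
        if value is None:
            continue  # skip rows with missing points
        try:
            value = round(float(value), decimals)
        except (TypeError, ValueError):
            continue  # skip rows with malformed values
        cleaned.append({**row, points_key: value})
    return cleaned
```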
- **Data Storage**: Saved the final transformed data into a new CSV file.
- **Data Visualization**: Imported the cleaned and transformed data into Tableau to create an insightful [**Dashboard**](https://public.tableau.com/app/profile/oleksandr.plishka/viz/ScrollMarksAnalysis/Dashboard1#1) for desktop and mobile.

## Skills and Tools
- **PostgreSQL (Dune Analytics)**: Used for querying and extracting data from the Scroll blockchain.
- **Python**: Used to script the data extraction and transformation processes, plus data cleansing and preliminary [Data Analysis](https://github.com/Plishka/ETL/blob/main/Scroll%20Analysis.ipynb).
- Libraries: `numpy`, `pandas`, `requests`, `concurrent.futures` for parallel processing, `matplotlib`, `seaborn`, `scipy`
- **API Integration and Management**: Expertise in handling API requests, managing rate limits, and error handling.
- **Data Cleaning and Transformation**: Ensured data accuracy and consistency through various cleaning and transformation techniques.
- **CSV File Handling**: Managed large datasets efficiently using CSV files.
- **Tableau**: Used to create visualizations that analyze and present the data.

## Scripts
- #### [Extract Wallet Addresses from Dune API](https://github.com/Plishka/ETL/blob/main/Scroll%20Wallets%20fetch%20from%20Dune%20API.ipynb)
- #### [Extracting Points from AWS API](https://github.com/Plishka/ETL/blob/main/Scroll%20Marks%20fetch%20from%20AWS%20API.ipynb)
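For reference, the parallel-fetch pattern used in the AWS script linked above can be sketched with `concurrent.futures`. The `fetch_one` callable is injected so the concurrency logic stands alone; in the actual notebook it would wrap an HTTP request to the AWS API:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_points_parallel(wallets, fetch_one, max_workers=8):
    """Fetch points for many wallets concurrently.

    `fetch_one` takes a wallet address and returns its points; failures
    are recorded as None so one bad request doesn't abort the whole run.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit all requests up front, then collect as they complete.
        futures = {pool.submit(fetch_one, w): w for w in wallets}
        for future in as_completed(futures):
            wallet = futures[future]
            try:
                results[wallet] = future.result()
            except Exception:
                results[wallet] = None  # record the failure, keep going
    return results
```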