Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/alimghmi/crypto-news-etl
A simple ETL data pipeline using python and sqlite3
beautifulsoup crawling etl-pipeline python scraper sqlite3
- Host: GitHub
- URL: https://github.com/alimghmi/crypto-news-etl
- Owner: alimghmi
- Created: 2022-06-16T14:25:36.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2022-07-06T13:21:18.000Z (over 2 years ago)
- Last Synced: 2024-11-12T16:26:04.139Z (2 months ago)
- Topics: beautifulsoup, crawling, etl-pipeline, python, scraper, sqlite3
- Language: Python
- Homepage:
- Size: 17.6 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# crypto-news-etl
## Description
A deliberately simple attempt to simulate the workflow of an ETL data pipeline: extracting news from [cryptonews.com](https://cryptonews.com/), transforming the fetched content, and loading it into a sqlite3 database and an S3 bucket.
## Installing
#### Use docker-compose:
```
git clone https://github.com/alimghmi/crypto-news-etl.git
cd crypto-news-etl
docker-compose up --build
```
The current working directory is mounted into the container, so the logs and the sqlite3 database are accessible locally.
#### Or run locally:
```
git clone https://github.com/alimghmi/crypto-news-etl.git
cd crypto-news-etl
pip3 install -r requirements.txt
python3 app.py
```
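
For a feel of the extract, transform, load flow from the Description before running the project, here is a minimal sketch. It is not the code in `app.py`. Everything in it is an assumption made for illustration: the sample HTML, the `title` CSS class, the `news` table layout, and the function names are all hypothetical. It uses the standard-library `html.parser` in place of BeautifulSoup so that it runs without extra dependencies, it reads from an inline string instead of fetching cryptonews.com, and it leaves out the S3 upload.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical sample markup; the real page structure is not assumed here.
SAMPLE_HTML = """
<article><a class="title" href="/news/btc-rises.htm">  Bitcoin Rises </a></article>
<article><a class="title" href="/news/eth-update.htm">Ethereum   Update</a></article>
<article><a class="title" href="/news/btc-rises.htm">  Bitcoin Rises </a></article>
"""


class TitleLinkParser(HTMLParser):
    """Extract step: collect (href, raw text) pairs from <a class="title"> tags."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "title":
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a matching <a> tag.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.items.append((self._href, "".join(self._text)))
            self._href = None


def extract(html):
    parser = TitleLinkParser()
    parser.feed(html)
    return parser.items


def transform(items, base_url="https://cryptonews.com"):
    """Transform step: collapse whitespace in titles and make URLs absolute."""
    return [
        {"url": base_url + href, "title": " ".join(text.split())}
        for href, text in items
    ]


def load(rows, conn):
    """Load step: insert rows, skipping URLs that are already stored."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS news (url TEXT PRIMARY KEY, title TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO news (url, title) VALUES (:url, :title)", rows
    )
    conn.commit()


def run_pipeline(html, conn):
    load(transform(extract(html)), conn)
    return conn.execute("SELECT url, title FROM news ORDER BY url").fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    for row in run_pipeline(SAMPLE_HTML, conn):
        print(row)
    conn.close()
```

Using the URL as the primary key together with `INSERT OR IGNORE` means the pipeline can be re-run over the same page without creating duplicate rows, which is one common way to keep a scraper's load step idempotent.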