https://github.com/he7d3r/desafio-engenharia-dados
- Host: GitHub
- URL: https://github.com/he7d3r/desafio-engenharia-dados
- Owner: he7d3r
- License: mit
- Created: 2020-11-04T22:58:13.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2022-12-08T07:39:08.000Z (about 2 years ago)
- Last Synced: 2024-10-28T16:57:23.755Z (2 months ago)
- Language: Jupyter Notebook
- Size: 342 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 11
Metadata Files:
- Readme: README.md
- License: LICENSE
# Setup
## Quick start
### See it in action
The dashboard is currently available at .
### Deploy a new instance
To deploy the dashboard on Heroku, run the following commands, replacing
`my-new-dashboard` with a name of your choice.

```bash
git clone git@github.com:he7d3r/desafio-engenharia-dados.git
cd desafio-engenharia-dados
heroku container:login
heroku create my-new-dashboard
docker-compose build app
docker tag desafio-engenharia-dados_app:latest registry.heroku.com/my-new-dashboard/web
docker push registry.heroku.com/my-new-dashboard/web
heroku container:release web
heroku addons:create heroku-postgresql:hobby-dev
heroku run make all
```

Wait a few minutes while the raw data is downloaded, transformed, and loaded into
the database, and some sanity checks are performed. Once this is done, you should
be able to access the dashboard at `https://my-new-dashboard.herokuapp.com`.

### Run locally
After cloning the project and changing into its folder, use Docker Compose to build and run the images:
```bash
docker-compose up -d
```

This should take care of building the images and doing the following:
- Start a cron job to download the raw data (if not already done) into the `data` folder
- Create and populate a SQLite database in the same directory
- Do a sanity test of the data
- Make the dashboard app available at http://localhost:80 (via nginx), and at URLs such as http://localhost/dashboard/SC/2019, where "SC" and "2019" can be replaced by other state codes and years, respectively.

### Other useful commands
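When scripting against a local instance, the per-state, per-year URL pattern described under "Run locally" (for example `/dashboard/SC/2019`) can be built with a small helper. This is only a sketch: the function name `dashboard_url` and the default base URL are assumptions for illustration, not part of the project.

```python
from urllib.parse import quote


def dashboard_url(state: str, year: int, base: str = "http://localhost") -> str:
    """Build a dashboard URL of the form <base>/dashboard/<STATE>/<YEAR>.

    `dashboard_url` is a hypothetical helper; only the URL pattern itself
    comes from the README.
    """
    # Upper-case the state code and escape it so it is safe in a URL path.
    state_segment = quote(state.strip().upper(), safe="")
    # int() rejects values that are not a valid year number.
    return f"{base.rstrip('/')}/dashboard/{state_segment}/{int(year)}"


if __name__ == "__main__":
    print(dashboard_url("SC", 2019))
```

Passing a different `base` (such as the Heroku URL of a deployed instance) reuses the same path pattern.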
#### Debugging
You can toggle Flask's debug mode by setting the `FLASK_ENV` environment variable
to `production` or `development`, like this:

```bash
FLASK_ENV=development docker-compose up -d app
```

or by adding the same setting to a `.env` file:
```bash
echo "FLASK_ENV=development" >> .env
```

#### Get the data into an SQLite database
To get a container to download the data and populate the database, run the following (replacing `<image-name>` and `<container-name>` with names of your choice):
```bash
docker build -f dashboard/Dockerfile -t <image-name> .
docker run -it -e DATABASE_URL='sqlite:////data/trades.db' \
  -v `pwd`/data:/data \
  -v `pwd`/dashboard/src:/dashboard/src \
  --name <container-name> \
  <image-name> /bin/bash
make /data/trades.db # Inside the container
```

#### Check the cron service status
You can check the status of the cron service like this (replacing `<container-name>` with the name of the running container):
```bash
docker exec -it <container-name> service cron status
```

#### Test the database
After making changes to the data pipeline, it can be useful to check if the database still contains the expected data. For this, just run `make tests` inside the `flask` container, that is:
```bash
docker-compose up -d app
docker exec -it flask make tests
```

This checks that a reasonable number of rows is present in each table.
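To give an idea of what such a row-count check can look like, here is a minimal sketch using Python's standard `sqlite3` module. It is not the project's actual test code: the function name `check_row_counts`, the table name, and the thresholds are all hypothetical.

```python
import sqlite3


def check_row_counts(conn: sqlite3.Connection, min_rows: dict) -> dict:
    """Return {table: actual_count} for every table below its minimum.

    `min_rows` maps table names to the smallest acceptable row count.
    Both the names and the thresholds are placeholders chosen by the caller.
    """
    failures = {}
    for table, minimum in min_rows.items():
        # Table names cannot be bound as SQL parameters, so quote the identifier.
        quoted = '"' + table.replace('"', '""') + '"'
        (count,) = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()
        if count < minimum:
            failures[table] = count
    return failures


if __name__ == "__main__":
    # Demo on an in-memory database with a made-up table.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE example (id INTEGER)")
    demo.executemany("INSERT INTO example VALUES (?)", [(1,), (2,)])
    print(check_row_counts(demo, {"example": 5}))
```

An empty result means every listed table met its threshold, so a test can simply assert that the returned dictionary is empty.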
#### Get the Flask app running
To build and run the app image (replacing `<image-name>` and `<container-name>` with names of your choice):
```shell
docker build -f dashboard/Dockerfile -t <image-name> .
docker run -d -e FLASK_APP='wsgi' \
  -e FLASK_ENV='development' \
  -e DATABASE_URL='sqlite:////data/trades.db' \
  -p 5000:5000 \
  -v `pwd`/data:/data \
  -v `pwd`/dashboard/src:/dashboard/src \
  --name <container-name> \
  <image-name> \
  gunicorn --reload --bind 0.0.0.0:5000 --workers 4 "wsgi:app"
```

In this example, the app should be available at http://localhost:5000/. The `--reload` option lets changes made to the app source code (inside the folder `dashboard/src` on the host) go live without rebuilding the image; reloading the page in the browser is enough.
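The `DATABASE_URL` value used in the `docker run` commands, `sqlite:////data/trades.db`, has four slashes. Under the SQLAlchemy-style URL convention, everything after `sqlite:///` is the file path, so a fourth slash makes the path absolute. The sketch below illustrates that mapping with plain string handling. Whether the app parses the URL this way is an assumption, and `sqlite_path_from_url` is a hypothetical helper.

```python
SQLITE_PREFIX = "sqlite:///"


def sqlite_path_from_url(url: str) -> str:
    """Extract the database file path from a SQLAlchemy-style SQLite URL.

    Hypothetical helper for illustration only:
    "sqlite:////data/trades.db" -> "/data/trades.db" (absolute path)
    "sqlite:///data/trades.db"  -> "data/trades.db"  (relative path)
    """
    if not url.startswith(SQLITE_PREFIX):
        raise ValueError(f"not a SQLite URL: {url!r}")
    # Whatever follows the three-slash prefix is the path itself.
    return url[len(SQLITE_PREFIX):]


if __name__ == "__main__":
    print(sqlite_path_from_url("sqlite:////data/trades.db"))
```

With the volume mount `-v` `` `pwd`/data:/data `` from the commands above, the absolute path `/data/trades.db` inside the container corresponds to `data/trades.db` in the project folder on the host. A Heroku deployment with the PostgreSQL add-on would use a different URL scheme, which this helper rejects.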
### Running tests
To run some basic tests, use
```bash
pytest
```

## Notes
For a first look into the data, check out the Jupyter notebooks inside the directory `notebooks`.
## Database Layout
Currently, the dashboard uses a database with the following tables:
![Database Diagram](diagram.png)
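If the diagram is not at hand, the table layout can also be read from a local SQLite database directly. The following is a rough sketch using only the standard library. The function name `describe_tables` is hypothetical, and the path `data/trades.db` is assumed from the `DATABASE_URL` and volume mount used in the commands above.

```python
import sqlite3


def describe_tables(conn: sqlite3.Connection) -> dict:
    """Map each user-defined table name to the list of its column names."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    layout = {}
    for table in tables:
        quoted = '"' + table.replace('"', '""') + '"'
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f"PRAGMA table_info({quoted})").fetchall()
        layout[table] = [column[1] for column in columns]
    return layout


if __name__ == "__main__":
    # Assumed location of the database file on the host.
    with sqlite3.connect("data/trades.db") as connection:
        for name, cols in describe_tables(connection).items():
            print(name, cols)
```

This only applies to the SQLite setup used locally; a PostgreSQL-backed deployment would need a different catalog query.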