{"id":20519822,"url":"https://github.com/dianaow/flask-react-d3-celery","last_synced_at":"2025-04-14T02:12:16.653Z","repository":{"id":45123766,"uuid":"143532904","full_name":"dianaow/flask-react-d3-celery","owner":"dianaow","description":"A full-stack dockerized web application to visualize Formula 1 race statistics from 2016 to present, with a Python Flask server and a React front-end with d3.js as data visualization tool.","archived":false,"fork":false,"pushed_at":"2022-12-09T05:33:10.000Z","size":5486,"stargazers_count":43,"open_issues_count":22,"forks_count":7,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-03-27T16:11:13.806Z","etag":null,"topics":["d3-visualization","d3js","d3v4","data-visualization","docker","ergast-api","flask","formula1","react","react-flask","sports","sports-data","sports-stats"],"latest_commit_sha":null,"homepage":"","language":"JavaScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dianaow.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-08-04T13:52:43.000Z","updated_at":"2024-09-25T10:24:18.000Z","dependencies_parsed_at":"2023-01-25T16:45:58.489Z","dependency_job_id":null,"html_url":"https://github.com/dianaow/flask-react-d3-celery","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dianaow%2Fflask-react-d3-celery","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dianaow%2Fflask-react-d3-celery/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dianaow%2Fflask-react-d3-celery/releases","manifes
ts_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dianaow%2Fflask-react-d3-celery/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dianaow","download_url":"https://codeload.github.com/dianaow/flask-react-d3-celery/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248809051,"owners_count":21164896,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["d3-visualization","d3js","d3v4","data-visualization","docker","ergast-api","flask","formula1","react","react-flask","sports","sports-data","sports-stats"],"created_at":"2024-11-15T22:16:30.960Z","updated_at":"2025-04-14T02:12:16.629Z","avatar_url":"https://github.com/dianaow.png","language":"JavaScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Data Visualization for Formula 1 Races\n\nA full-stack dockerized web application to visualize Formula 1 race statistics from 2016 to present, with a Python Flask server and a React front-end with d3.js as data visualization tool. 
\n\n## Hosted at: www.notforcasualfans.com\n\n- I have moved the front-end section (React) to another repository: https://github.com/dianaow/d3-react\n- Likewise, for the back-end section (Flask+PostgreSQL): https://github.com/dianaow/flask-backend\n\nI will no longer be updating this repository, and the latest code changes will be committed to the above repositories instead.\n\n\u003cimg src=\"https://github.com/dianaow/celery-scheduler/blob/master/misc/mystack.png\" alt=\"mystack\" width=\"600\"/\u003e\n\n## Data Source\n- Thanks to the Ergast Developer API (https://ergast.com/mrd/), which provides data for the Formula 1 series and is updated after the conclusion of each race.\n\n## How to automate the refresh/update of data visualization dashboard?\n- This requires automating the data collection process. To do this, I created a task scheduler within the app, powered by Celery, to fetch data from Ergast's APIs periodically. Next, I created Python scripts to perform data transformation. The processed data is then saved to a PostgreSQL database hosted on AWS. \n- Celery schedules data to be extracted from the Ergast API every Monday morning. If the preceding weekend was not a race weekend (race weekends are spread out from March to November, with races occurring on Sundays), nothing gets saved to the database and the scheduler retries the following week.\n- The processed data is then retrieved from the database to serve the APIs.\n\n## How to connect Flask and React?\n- I used Flask to create REST APIs and have React consume the APIs by making HTTP requests to them.\n- I did not use the create-react-app library, so I had to create a custom Webpack configuration. Webpack and Babel (a transpiler that converts ES6 code into a backwards-compatible version of JavaScript) bundle up the React files in a folder separate from the Flask app. \n\n## Data Visualization in React using D3(V4)\n- I used React and D3 together to create dynamic data visualization components. 
Data is retrieved from the APIs created by Flask.\n- React and D3 are both able to control the Document Object Model (DOM). To separate responsibilities as much as possible, I took the following approach:\n\n  + React owns the DOM\n  + D3 calculates properties\n\nThis way, we can leverage React for SVG structure and rendering optimizations and D3 for all its mathematical and visualization functions.\n\n## Deployment\n- The front-end and back-end were each deployed to separate AWS Elastic Beanstalk environments.\n- I attempted to deploy them as a single AWS EB app, but encountered some issues configuring Nginx to sit in front of my backend services. Furthermore, the larger app size meant I would have had to upgrade the EC2 instance type to 't2.small', which had to be paid for. Hence, it was a more viable option to deploy to two separate EC2 instances.\n- The Webpack build that generates static assets happens locally before deployment, and the generated files are bundled with the deployment package. \n\n## Architecture\n\n![architecture_diagram](https://github.com/dianaow/celery-scheduler/blob/master/misc/flask_react_celery_architecture.png) \n\n# Getting Started\n\n## Setup\nThis setup is built for deployment with Docker. \n\n### **1. Clone the repository**\n\n```bash\ncd ~\ngit clone https://github.com/dianaow/.git\ncd celery-scheduler\n```\n\n### **2. Install Docker**\n\n- [Mac or Windows](https://docs.docker.com/engine/installation/)\n- [Ubuntu server](https://www.digitalocean.com/community/tutorials/how-to-install-and-use-docker-on-ubuntu-16-04)\n\n\n### **3. 
Build Docker images with docker-compose and run them.**\n\n  Configuration folder architecture:\n  ```\n  config  \n  │\n  └───docker\n  │   │\n  │   └───development\n  │   │   │   dev-variables.env\n  │   │   │   docker-compose.yml\n  │   │ \n  │   └───production\n  │       │   .env\n  │       │   docker-compose-prod.yml\n  │      \n  │   \n  │ Dockerfile\n  │ Dockerfile-node\n  ```\n  To test the app locally, first enter the correct folder. \n  ```\n  cd config/docker/development\n  ```\n  Then execute the following command:\n  ```\n  docker-compose -f docker-compose.yml up -d --build\n  ```\n  \n  I have configured Docker such that when the postgres image is built and an instance (container) of it runs, a new database is created along with a postgres user and password. However, the database is initially empty and requires a psql script to load it with some data. The database shuts down when the container stops and is removed.\n  \n  Note:\n  - -f: Specifies the docker-compose file name (not necessary unless the file is named differently from the standard 'docker-compose.yml').\n  - up: Builds, (re)creates, starts, and attaches to containers for a service.\n  - -d: Detached mode: Starts containers in the background and leaves them running.\n  - --build: Builds images before creating containers.\n  \n  \n### **4) Check logs for successful build and run of docker containers**\n```\n  docker-compose logs\n```\n\n  Please refer to this repo's wiki for screenshots of what you should see from the console.\n \n \n### **5) Loading database with data**\n\n  I was unable to successfully use an entrypoint script to initialize the database with data, so the workaround is to load the data manually from the command line instead.\n \n#### **a.** Check the list of running containers \n```\n  docker ps -a\n```\n\n![docker_compose_ps_a](https://github.com/dianaow/celery-scheduler/blob/master/misc/docker_compose_ps_a.png) \n\n#### **b.** To run bash command in docker container, enter\n```\n  docker 
exec -i -t \u003cCONTAINER_ID\u003e /bin/bash\n```\n \n  In this case, we want to enter the 'development_postgresql' container, so run ```docker exec -i -t 899f05bf2ec2 /bin/bash``` (Note: your CONTAINER ID will be different from mine)\n \n#### **c.** Run the command below to load 'init.psql' into the database\n \n```\n  psql --host=localhost --port=5432 --username=test_user --password --dbname=f1_flask_db \u003c ../init.psql\n```\n \n  You will then be prompted for the password for test_user, which is **'test_pw'**\n \n#### **d.** Log into the database. Try querying it!\n \n![docker_command_psql](https://github.com/dianaow/celery-scheduler/blob/master/misc/docker_command_psql.png) \n \n \n**You may now point your browser to http://localhost:3000 to view the frontend, and to http://localhost:5000/api/results to view the APIs**\n\n\n### **6) Initialize task scheduler with celery**\n \n#### **a.** Following the last step, we should still be in the 'development_postgresql' container. Exit from postgresql by entering '\\q'. Exit from the container by entering 'exit'. Next, identify the \"development_app\" container id and enter it (similar to steps 5a and 5b).\n\n#### **b.** Run the command below: \n```\n  celery -A app.tasks worker -B -l info\n```\n  \n  That's it! 
For testing purposes, I have set celerybeat to trigger the data collection task every 15 minutes.\n\n\n### **7) Stop the running docker containers and remove them**\n```\n  docker-compose down\n```\n\n\u003cimg src=\"https://github.com/dianaow/celery-scheduler/blob/master/misc/docker_compose_down.png\" alt=\"docker_compose_down\" width=\"300\"/\u003e\n\n   \n**For enquiries, you may contact me at diana.ow@gmail.com**\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdianaow%2Fflask-react-d3-celery","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdianaow%2Fflask-react-d3-celery","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdianaow%2Fflask-react-d3-celery/lists"}