Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/epicollect/epi-collect
Liberate Google Takeout location data for epidemiological research and local contact tracing https://epi-collect.org
- Host: GitHub
- URL: https://github.com/epicollect/epi-collect
- Owner: epicollect
- License: mit
- Created: 2020-03-22T02:12:24.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2023-01-26T18:39:56.000Z (almost 2 years ago)
- Last Synced: 2024-08-03T21:03:06.773Z (5 months ago)
- Topics: coronavirus, coronavirus-tracking, covid-19, covid19, epidemiology, epidemiology-analysis, google-takeout
- Language: TypeScript
- Homepage:
- Size: 7.28 MB
- Stars: 7
- Watchers: 3
- Forks: 3
- Open Issues: 30
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
- awesome-contact-tracing - Epi-Collect uses location data from Google Takeout to build an open source contact tracing dataset
README
Epi-Collect
Epi-Collect uses location data from Google Takeout to build an open source contact tracing dataset.
Website • Slack • Roadmap • FAQ • Privacy
## [Goals](./ROADMAP.md)
| Current Engineering Milestone | Current Researchers |
| --- | --- |
| __[Pre-launch](./ROADMAP.md)__ (2 active contributors) | [Become our first researcher](./RESEARCHERS.md) |

- Establish privacy-respecting [best practices](./PRIVACY.md) for data donation
- Create a community-driven dataset standard for contact tracing
- Enable researchers and city health departments to investigate the spread of COVID-19 and other diseases using donated data

## [Frequently Asked Questions](./FAQ.md)
Is my data kept safe and private?
Yes, and we empathize with your concern. The biggest problem with recent contact tracing solutions is that they may be a gateway to surveillance capitalism in the name of public safety. There is a shrinking window of opportunity available today to set a precedent for privacy-respecting contact tracing. As an open source project with all documentation in the open, Epi-Collect is in a unique position to do that. No one has scaled open source data donation before, and we're excited to test its potential.
Check out our Privacy living document to see how we think about this and how we hope others will too.
Is my data anonymized?
Yes.
- We've designed our database such that there is no possible way to associate location data with your identity. If you're an engineer, you can see our very simple database schema here.
- During data ingestion, we ask users to review every data point and delete those that they believe are personally identifiable. We also give hints about what data points may be personally identifiable.
- We do not make the dataset available to a researcher unless they pass certain verification requirements.
Please see our Privacy living document for more details.
How do I get access to the data?
Please see our guidance for researchers.
[Full FAQ](./FAQ.md)
## Setup
Make sure you have `yarn` and `virtualenv` installed.
```bash
git clone [email protected]:epicollect/epi-collect.git
cd epi-collect
yarn install
virtualenv --python=python3.6 venv
source ./venv/bin/activate
pip install -r requirements.txt
```
## Run for development
To start:
```bash
make run-dev
export PYTHONPATH="$PWD"
make run-db-local
```
To stop:
```bash
make stop-dev
make stop-db-local
```
If you want to test using the docker containers (which is closer to deployment):
```bash
make build-docker
make run-docker-local
```
## Local and deployment structure
The frontend is built in React with TypeScript.
We use React Bootstrap for the UI.
The backend is built using Flask and uses GeoAlchemy (GIS extension on top of SQLalchemy) to communicate with a PostGIS
database for persistent storage.
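The schema is intentionally simple (see the privacy FAQ above). As a rough sketch, a GeoAlchemy model for donated location points might look like the following; the table and column names here are hypothetical, not the project's actual schema:

```python
# Hypothetical GeoAlchemy/PostGIS model sketch; names are illustrative only.
from sqlalchemy import Column, DateTime, Integer, create_engine
from sqlalchemy.ext.declarative import declarative_base
from geoalchemy2 import Geometry

Base = declarative_base()


class LocationPoint(Base):
    __tablename__ = "location_points"

    id = Column(Integer, primary_key=True)
    # A PostGIS point in WGS84 (SRID 4326); note there is no column tying
    # a point back to a user identity.
    location = Column(Geometry(geometry_type="POINT", srid=4326), nullable=False)
    timestamp = Column(DateTime, nullable=False)


# The connection string is illustrative; point it at the local PostGIS
# instance started by `make run-db-local`.
engine = create_engine("postgresql://postgres:postgres@localhost:5432/epicollect")
Base.metadata.create_all(engine)
```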
### Local
Locally you can run in two ways:
1. Using `yarn` and `flask` (`make start-dev`), in which case all traffic on `/api` is routed to `flask`.
In this setup, `make run-db-local` will spin up a local PostGIS instance with the correct schema.
2. Using `docker-compose`, in which case the same docker containers as in the actual deployment are created,
but they are spun up locally using `docker-compose`. The database doesn't work in this setup.
### Testing
#### Manual
A Google Takeout zip file with location data is located under `tests/data/sample_location_history.zip`.
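For orientation, 2020-era Takeout exports keep location data in a `Location History.json` file with `latitudeE7`, `longitudeE7`, and `timestampMs` fields; newer exports use a different layout. The snippet below is a hypothetical sketch of reading points straight from such a zip, not the project's actual ingestion code:

```python
# Hypothetical sketch of reading points from a Google Takeout location
# history zip; field names follow the 2020-era export format.
import json
import zipfile


def read_locations(path):
    with zipfile.ZipFile(path) as archive:
        name = next(n for n in archive.namelist()
                    if n.endswith("Location History.json"))
        data = json.loads(archive.read(name))
    for entry in data.get("locations", []):
        yield {
            "lat": entry["latitudeE7"] / 1e7,
            "lon": entry["longitudeE7"] / 1e7,
            "timestamp_ms": int(entry["timestampMs"]),
        }


for point in read_locations("tests/data/sample_location_history.zip"):
    print(point)
```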
#### Automatic
See `tests/test_api.py`.
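For reference, a minimal API test using Flask's built-in test client might look like the sketch below; the import path and route are hypothetical placeholders, and the real tests live in `tests/test_api.py`:

```python
# Hypothetical Flask API test sketch; module path and endpoint are
# placeholders, not the project's actual names.
import pytest

from epi_collect.api.app import app  # assumed import path


@pytest.fixture
def client():
    app.config["TESTING"] = True
    with app.test_client() as test_client:
        yield test_client


def test_upload_rejects_empty_payload(client):
    # The route name is illustrative only.
    response = client.post("/api/upload", json={})
    assert response.status_code in (400, 422)
```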
### Deployment
We deploy using `make deploy` (you need AWS access for this), which builds the following docker containers:
1. `nginx` container to serve the frontend React app.
2. `gunicorn` container to serve the Python backend.
These are pushed to Docker Hub. We then deploy to AWS Elastic Beanstalk, where an `nginx` reverse proxy
behind AWS's load balancer routes all traffic on `/api` to the `gunicorn` container and all other traffic to the
frontend `nginx` container.
There is also a PostGIS database running in AWS RDS (Postgres with PostGIS extensions enabled).