- Host: GitHub
- URL: https://github.com/biglocalnews/local-election-results-etl
- Owner: biglocalnews
- License: mit
- Created: 2022-10-13T12:21:13.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2022-12-07T19:06:00.000Z (almost 3 years ago)
- Last Synced: 2025-04-30T18:02:43.158Z (6 months ago)
- Topics: data-journalism, elections, elections-data, etl, journalism, news, python, s3
- Language: Jupyter Notebook
- Homepage:
- Size: 9.89 MB
- Stars: 2
- Watchers: 7
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
Extract, transform and load election results posted online by local U.S. election officials.
## Supports
- California Secretary of State
- Los Angeles County Registrar-Recorder/County Clerk
- New York State Board of Elections
- Iowa Secretary of State
## Latest files
File | S3 URL | Pages URL
:--- | :----- | :--------
California Secretary of State | | [ca_secretary_of_state/latest.json](https://biglocalnews.github.io/local-election-results-etl/transformed/ca_secretary_of_state/latest.json)
Los Angeles County | | [los_angeles_county/latest.json](https://biglocalnews.github.io/local-election-results-etl/transformed/los_angeles_county/latest.json)
KPCC | [kpcc/latest.json](https://mt-legacy-projects.s3.amazonaws.com/vgp-general-election-results-2022/data/optimized/kpcc/latest.json) | [kpcc/latest.json](https://biglocalnews.github.io/local-election-results-etl/optimized/kpcc/latest.json)
## Getting started
Clone the repository and move into the project directory. Then install the Python dependencies.
```bash
pipenv install --dev
```
Install [pre-commit](https://pre-commit.com/) hooks.
```bash
pipenv run pre-commit install
```
Create a `.env` file and fill it with the following Amazon Web Services credentials, which will authorize you to upload your files to an S3 bucket.
```
AWS_ACCESS_KEY_ID=
AWS_SECRET_ACCESS_KEY=
AWS_DEFAULT_REGION=
AWS_BUCKET=
```
If you want a common prefix on all objects uploaded to your bucket, add this optional variable.
```
AWS_PATH_PREFIX=your-prefix/
```
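As a sketch of how these variables might be consumed, an object key could be composed from the optional prefix like this. The `object_key` helper below is hypothetical; the project's actual key-building logic may differ.

```python
import os


def object_key(filename: str) -> str:
    """Prepend the optional AWS_PATH_PREFIX to an object name.

    Illustrative helper, not the repo's actual code.
    """
    prefix = os.environ.get("AWS_PATH_PREFIX", "")
    return f"{prefix}{filename}"


# With the optional prefix set, every uploaded object shares it.
os.environ["AWS_PATH_PREFIX"] = "your-prefix/"
key = object_key("transformed/kpcc/latest.json")
print(key)  # → your-prefix/transformed/kpcc/latest.json
```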
## Command pipeline
Download the raw data from the source websites.
```bash
pipenv run python -m src.los_angeles_county.download
pipenv run python -m src.ca_secretary_of_state.download
```
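A download step of this kind typically fetches the source feed and archives both a `latest` copy and a timestamped copy for later reference. A minimal sketch, assuming that layout; the function name, file layout, and sample payload are illustrative, and in practice the payload would come from an HTTP request to the election office's results endpoint:

```python
import json
import pathlib
import tempfile
from datetime import datetime, timezone


def archive_raw(payload: bytes, outdir: str) -> pathlib.Path:
    """Write the raw payload to latest.json and a timestamped sibling."""
    out = pathlib.Path(outdir)
    out.mkdir(parents=True, exist_ok=True)
    latest = out / "latest.json"
    latest.write_bytes(payload)
    # Keep a dated copy so earlier snapshots are never overwritten.
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H%M%SZ")
    (out / f"{stamp}.json").write_bytes(payload)
    return latest


outdir = tempfile.mkdtemp()
path = archive_raw(json.dumps({"races": []}).encode(), outdir)
print(path.name)  # → latest.json
```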
Transform the data into something we want to publish.
```bash
pipenv run python -m src.los_angeles_county.transform
pipenv run python -m src.ca_secretary_of_state.transform
```
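Each transform module reshapes one source's raw feed into a common schema so downstream consumers can treat every county and state alike. A hypothetical sketch of that idea, assuming a simple candidate-vote structure (all field names are illustrative, not the repo's actual schema):

```python
def transform_contest(raw: dict) -> dict:
    """Map a raw contest record into a common result shape (illustrative)."""
    # Sort candidates by vote count, highest first.
    candidates = sorted(
        raw.get("candidates", []),
        key=lambda c: c.get("votes", 0),
        reverse=True,
    )
    return {
        "name": raw.get("contest_name"),
        "candidates": [
            {"name": c.get("name"), "votes": c.get("votes", 0)}
            for c in candidates
        ],
    }


raw = {
    "contest_name": "Mayor",
    "candidates": [
        {"name": "A", "votes": 10},
        {"name": "B", "votes": 25},
    ],
}
result = transform_contest(raw)
print(result["candidates"][0]["name"])  # → B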
Merge the common files
```bash
pipenv run python -m src.optimize kpcc
```
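Merging could, in principle, concatenate the transformed files a downstream consumer such as KPCC needs into one document. A sketch under that assumption, with an invented `contests` key:

```python
import json
import pathlib
import tempfile


def merge_latest(paths: list[str]) -> dict:
    """Combine several transformed latest.json files into one payload."""
    contests = []
    for p in paths:
        data = json.loads(pathlib.Path(p).read_text())
        contests.extend(data.get("contests", []))
    return {"contests": contests}


# Demo with two small transformed files.
tmp = pathlib.Path(tempfile.mkdtemp())
(tmp / "a.json").write_text(json.dumps({"contests": [{"name": "Mayor"}]}))
(tmp / "b.json").write_text(json.dumps({"contests": [{"name": "Sheriff"}]}))
merged = merge_latest([str(tmp / "a.json"), str(tmp / "b.json")])
print(len(merged["contests"]))  # → 2
```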
Export results to CSV.
```bash
pipenv run python -m src.export
```
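An export step of this shape flattens the JSON results into rows. A minimal standard-library sketch; the column names are illustrative, not the project's actual CSV layout:

```python
import csv
import io


def to_csv(results: list[dict]) -> str:
    """Render a list of result rows as CSV text (illustrative columns)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["contest", "candidate", "votes"])
    writer.writeheader()
    writer.writerows(results)
    return buf.getvalue()


rows = [{"contest": "Mayor", "candidate": "A", "votes": 10}]
out = to_csv(rows)
print(out)
```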
Upload data to Amazon S3.
```bash
pipenv run python -m src.upload kpcc
```
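The upload presumably reads the `.env` credentials and pushes each file to the bucket with boto3. A sketch of that flow; the `upload_params` helper is hypothetical (the boto3 call itself is the standard `put_object`), and the values assigned below are placeholders:

```python
import mimetypes
import os


def upload_params(local_path: str) -> dict:
    """Build put_object keyword arguments from the env config (illustrative)."""
    prefix = os.environ.get("AWS_PATH_PREFIX", "")
    content_type, _ = mimetypes.guess_type(local_path)
    return {
        "Bucket": os.environ["AWS_BUCKET"],
        "Key": f"{prefix}{local_path}",
        "ContentType": content_type or "application/octet-stream",
    }


def upload(local_path: str) -> None:
    import boto3  # deferred import so the helper above runs without AWS

    params = upload_params(local_path)
    with open(local_path, "rb") as f:
        boto3.client("s3").put_object(Body=f.read(), **params)


# Placeholder values standing in for the .env configuration.
os.environ["AWS_BUCKET"] = "my-bucket"
os.environ["AWS_PATH_PREFIX"] = "your-prefix/"
params = upload_params("optimized/kpcc/latest.json")
print(params["Key"])  # → your-prefix/optimized/kpcc/latest.json
```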