https://github.com/kirbs-/covid-19-dataset
US county level COVID-19 case data.
covid-19 covid19-data
- Host: GitHub
- URL: https://github.com/kirbs-/covid-19-dataset
- Owner: kirbs-
- License: mit
- Created: 2020-03-23T13:01:35.000Z (almost 5 years ago)
- Default Branch: master
- Last Pushed: 2020-03-26T13:00:27.000Z (almost 5 years ago)
- Last Synced: 2024-10-14T04:12:30.496Z (3 months ago)
- Topics: covid-19, covid19-data
- Language: HTML
- Size: 7.14 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# covid-19-dataset
US county level COVID-19 case data. Daily snapshots of US cases by county.
## County Data Status
| State | Scraper | Validator | Aggregator | Time Series |
|-------|---------|-----------|------------|-------------|
| AK | Y | N | N | N |
| AL | Y | N | N | N |
| CA | Y | N | N | N |
| CO | Y | N | N | N |
| DE | Y | N | N | N |
| FL | Y | N | N | N |
| GA | Y | N | N | N |
| IA | Y | N | N | N |
| KS | Y | N | N | N |
| KY | Y | N | N | N |
| LA | Y | N | N | N |
| MD | Y | N | N | N |
| ME | Y | N | N | N |
| MI | Y | N | N | N |
| MO | Y | N | N | N |
| MN | Y | N | N | N |
| MT | Y | N | N | N |
| NJ | Y | N | N | N |
| NY | Y | N | N | N |
| OH | Y | N | N | N |
| PA | Y | N | N | N |
| TN | Y | N | N | N |
| TX | Y | N | N | N |
| VA | Y | N | N | N |
| WA | Y | N | N | N |
| WY | Y | N | N | N |

## Project structure
```
/data # county level snapshots by scrape timestamp.
|
- {state}_by_county_{scraper_timestamp_in_EDT}.txt # snapshot of scraped results as of timestamp.
/source_page_backup # backup of source pages by scrape timestamp.
|
- {state}_county_{scrape_timestamp}.html # backup of source page. Extension depends on data source.
- main.ipynb # triggers crawler
- config.yaml # shared scraper configurations
- {state}_by_county.ipynb # State-specific scrapers
```

## Scraper Format
Scrapers are simple Python scripts or Jupyter notebooks that implement `fetch`, `save`, and `run` methods.
#### fetch()
_Returns_
- DataFrame containing positive cases by county.
- Source data - HTML page, etc.

`fetch` is responsible for getting a page and processing it into a pandas DataFrame. It must return a DataFrame that contains `county` and `positive_cases` columns (additional columns are fine) and a string containing the data source being scraped.

#### save(df, source)
_Params:_
- df (DataFrame): DataFrame containing `county` and `positive_cases` columns (additional columns are fine)
- source (str): string containing the data source page that was scraped.

`save` handles persisting the DataFrame and source data. `df` is saved as a pipe-delimited text file in the data directory with the scraping timestamp in EDT.
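
As a rough illustration of the `save` behavior described above, the sketch below writes the DataFrame as a pipe-delimited file under `data/` and the raw page under `source_page_backup/`, following the file name patterns from the project structure. It is not the repository's actual code: the extra `state`, `root`, and `now` parameters, the timestamp format, the `.html` extension, and the use of the `America/New_York` zone for "EDT" are all assumptions made for the example.

```python
from datetime import datetime
from pathlib import Path
from zoneinfo import ZoneInfo

import pandas as pd

# Assumption: "EDT" is read as US Eastern time; the timestamp format is a guess.
EASTERN = ZoneInfo("America/New_York")
TIMESTAMP_FORMAT = "%Y%m%d_%H%M%S"


def save(df, source, state, root=".", now=None):
    """Persist the DataFrame and the raw source page; return both paths.

    `state`, `root`, and `now` are hypothetical parameters added for this sketch.
    `now`, if given, is expected to be a timezone-aware Eastern datetime.
    """
    stamp = (now or datetime.now(EASTERN)).strftime(TIMESTAMP_FORMAT)

    data_dir = Path(root) / "data"
    backup_dir = Path(root) / "source_page_backup"
    data_dir.mkdir(parents=True, exist_ok=True)
    backup_dir.mkdir(parents=True, exist_ok=True)

    data_path = data_dir / f"{state}_by_county_{stamp}.txt"
    backup_path = backup_dir / f"{state}_county_{stamp}.html"

    # Pipe-delimited snapshot of the scraped results, without the index column.
    df.to_csv(data_path, sep="|", index=False)
    # Raw source page, kept so a snapshot can be re-parsed later.
    backup_path.write_text(source, encoding="utf-8")
    return data_path, backup_path
```

Passing a fixed `now` makes the output file names predictable, which can help when checking a scraper by hand.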
#### run()
Calls `fetch` and then `save` in one step. Used by the main crawling job.
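
The sketch below shows one way `run` could tie the two steps together. It is an assumption-laden example, not the repository's implementation: here `run` takes the `fetch` and `save` callables as arguments (the README's `run()` takes none), and the check of the `county` / `positive_cases` contract before saving is an addition for illustration. The stand-in `fetch` returns made-up values rather than real case data.

```python
import pandas as pd

# Columns the README requires in every fetch() result.
REQUIRED_COLUMNS = {"county", "positive_cases"}


def run(fetch, save):
    """Fetch once, check the result against the documented contract, then save."""
    df, source = fetch()
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"fetch() result is missing columns: {sorted(missing)}")
    if not isinstance(source, str):
        raise TypeError("fetch() must return the source page as a string")
    save(df, source)
    return df


# Stand-in scraper with placeholder values, for illustration only.
def fetch():
    source = "<table><tr><td>Example County</td><td>3</td></tr></table>"
    df = pd.DataFrame({"county": ["Example County"], "positive_cases": [3]})
    return df, source


# A save() stand-in that records its arguments instead of writing files.
saved = []
run(fetch, lambda df, source: saved.append((df, source)))
```

Failing before `save` is called means a scraper that breaks when a state changes its page layout does not write a malformed snapshot.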