https://github.com/brooksian/epaairnow
Exploring EPA Air Now Time Series Data with Apache Spark and Apache Zeppelin
spark sparksql time-series zeppelin-notebook
- Host: GitHub
- URL: https://github.com/brooksian/epaairnow
- Owner: BrooksIan
- Created: 2018-12-07T20:48:11.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2018-12-13T16:24:50.000Z (about 6 years ago)
- Last Synced: 2024-11-18T16:37:36.283Z (3 months ago)
- Topics: spark, sparksql, time-series, zeppelin-notebook
- Homepage:
- Size: 885 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
## Data Science In Apache Spark
### EPA Air Now
#### Exploring EPA Air Now Data

**Language**: Scala
**Requirements**:
- [HDP 2.6.X]
- Spark 2.x

**Author**: Ian Brooks
**Follow**: [LinkedIn Ian Brooks PhD](https://www.linkedin.com/in/ianrbrooksphd/)
**Source Data**: [EPA Air Now Search Site](https://aqs.epa.gov/api)
**Date Format**: [Oracle Date Format](https://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html)
**File Upload**: Upload the source JSON data files to the `/tmp` directory in HDFS.
![AirNow](http://www.sonomatech.com/sites/default/files/Plain_large_T.png "EPA Air Now")
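
The date format reference above points to Java's `SimpleDateFormat` pattern syntax. As a minimal sketch of how such a pattern is applied in Scala, the example below parses a timestamp string in UTC. The pattern `yyyy-MM-dd'T'HH:mm` and the object name `DatePatternSketch` are assumptions chosen for illustration; the README does not state the layout used in the source files, so check the actual JSON before reusing it.

```scala
import java.text.SimpleDateFormat
import java.util.{Date, TimeZone}

object DatePatternSketch {
  // Hypothetical pattern: replace it with the layout found in the source JSON files.
  val pattern: String = "yyyy-MM-dd'T'HH:mm"

  // Parse a timestamp string as UTC using the pattern above.
  // A new formatter is created per call because SimpleDateFormat is not thread-safe.
  def parseUtc(text: String): Date = {
    val fmt = new SimpleDateFormat(pattern)
    fmt.setTimeZone(TimeZone.getTimeZone("UTC"))
    fmt.setLenient(false) // reject out-of-range values such as month 13
    fmt.parse(text)
  }

  def main(args: Array[String]): Unit = {
    val parsed = parseUtc("2018-12-07T20:48")
    // Print the parsed instant as milliseconds since the Unix epoch.
    println(parsed.getTime)
  }
}
```

Spark 2.x SQL date functions such as `unix_timestamp` take pattern strings in this same syntax, so a pattern validated this way can usually be carried over to a notebook paragraph.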