https://github.com/brooksian/epaairnow

Exploring EPA Air Now Time Series Data with Apache Spark and Apache Zeppelin

spark sparksql time-series zeppelin-notebook


## Data Science In Apache Spark
### EPA Air Now
#### Exploring EPA Air Now Data

**Language**: Scala

**Requirements**:
- HDP 2.6.X
- Spark 2.x

**Author**: Ian Brooks

**Follow**: [LinkedIn Ian Brooks PhD](https://www.linkedin.com/in/ianrbrooksphd/)

**Source Data**: [EPA Air Now Search Site](https://aqs.epa.gov/api)

**Date Format**: [Java SimpleDateFormat patterns (Oracle tutorial)](https://docs.oracle.com/javase/tutorial/i18n/format/simpleDateFormat.html)
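
As a rough illustration of how the linked `SimpleDateFormat` patterns apply to timestamp strings, the sketch below parses a string into epoch milliseconds. The pattern `yyyy-MM-dd HH:mm`, the UTC time zone, and the names `DateFormatSketch` and `toEpochMillis` are assumptions made for this example, not taken from the source data; adjust the pattern to whatever your JSON files actually contain.

```scala
import java.text.SimpleDateFormat
import java.util.{Locale, TimeZone}

object DateFormatSketch {
  // Hypothetical pattern: replace it with the one that matches your data.
  val pattern = "yyyy-MM-dd HH:mm"

  // Parse a timestamp string into milliseconds since the Unix epoch.
  // SimpleDateFormat is not thread-safe, so a new instance is built per call.
  def toEpochMillis(raw: String): Long = {
    val fmt = new SimpleDateFormat(pattern, Locale.US)
    fmt.setTimeZone(TimeZone.getTimeZone("UTC")) // assumed time zone
    fmt.setLenient(false)                        // reject out-of-range fields
    fmt.parse(raw).getTime
  }
}
```

With lenient parsing switched off, a string with out-of-range fields (for example month 13) throws a `java.text.ParseException` instead of being silently rolled over into the next year.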

**File Upload**: Upload the source data JSON files to the `/tmp` directory in HDFS.
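
Once the files are in HDFS, reading them from Spark might look like the following minimal sketch. It assumes Spark 2.x on the classpath and one JSON object per line, which is Spark's default expectation. The object name `AirNowLoadSketch`, its helpers, and the file name `airnow_sample.json` are hypothetical and do not come from this repository.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object AirNowLoadSketch {
  // HDFS directory named in the upload step above.
  val hdfsDir = "/tmp"

  // Build the full path for one uploaded file.
  def pathFor(fileName: String): String = s"$hdfsDir/$fileName"

  // Read one JSON file into a DataFrame; Spark infers the schema.
  def load(spark: SparkSession, fileName: String): DataFrame =
    spark.read.json(pathFor(fileName))
}

// Usage (a Zeppelin Spark 2.x paragraph typically predefines `spark`):
// val df = AirNowLoadSketch.load(spark, "airnow_sample.json") // hypothetical file name
// df.printSchema()
// df.createOrReplaceTempView("airnow")
// spark.sql("SELECT COUNT(*) AS n FROM airnow").show()
```

Registering the DataFrame as a temporary view makes it queryable from Spark SQL, which fits the notebook-style exploration this project describes.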

![AirNow](http://www.sonomatech.com/sites/default/files/Plain_large_T.png "EPA Air Now")