https://github.com/paladini/pyspark-connector-kairosdb
Easily query data on KairosDB and make it available as DataFrame on Apache Spark.
- Host: GitHub
- URL: https://github.com/paladini/pyspark-connector-kairosdb
- Owner: paladini
- License: mit
- Created: 2015-10-13T16:08:13.000Z (almost 10 years ago)
- Default Branch: master
- Last Pushed: 2015-10-13T16:08:33.000Z (almost 10 years ago)
- Last Synced: 2024-10-25T16:29:15.584Z (12 months ago)
- Language: Python
- Size: 105 KB
- Stars: 4
- Watchers: 3
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
README
pySpark-connector-kairosdb
==========================

pySpark-connector-kairosdb provides an easy way to get data from KairosDB and make it available on Spark as a DataFrame. It's as simple as that:
```python
#!/usr/bin/env python
# sconnk means "py(S)park-(CONN)ector-(K)airosdb"
from sconnk import Connection, Dataframe

query_data = {
    "start_relative": { "value": "5", "unit": "years" },
    "metrics": [{ "name": "test_", "limit": 5 },
                { "name": "DP_058424", "limit": 10 },
                { "name": "teste_gzip", "limit": 5 },
                { "name": "DP_063321", "limit": 10 }]
}

# Creating a connection with the KairosDB database (at the given address).
conn = Connection("http://localhost:8080/")

# Performing our query on KairosDB.
json_data = conn.query(query_data)

# Creating a new Dataframe object, passing the JSON returned by the KairosDB API.
df = Dataframe(json_data).df

# Print the dataframe.
df.show()
```

Remember this is an ALPHA module without good documentation, examples, or even well-implemented features. We have a long road ahead.
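For context, KairosDB's `/api/v1/datapoints/query` endpoint returns JSON shaped roughly like the sketch below, and it is this structure that gets passed to `Dataframe`. The metric names, tags, and values here are made up for illustration; see the KairosDB REST API docs for the authoritative format:

```python
# Rough shape of the JSON returned by KairosDB's query endpoint
# (illustrative values only):
json_data = {
    "queries": [{
        "sample_size": 2,
        "results": [{
            "name": "test_",
            "tags": {"host": ["server1"]},
            # Each datapoint is a [timestamp_in_millis, value] pair.
            "values": [[1444752493000, 21.5],
                       [1444752553000, 22.0]]
        }]
    }]
}
```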
Future
=========

This module is under development, and we have the following plans for its future:
* Write good documentation
* Write tests - a lot of them.
* Add support for RDDs
* Stop writing the query result to a JSON file just to parse it back in Spark (see the sketch after this list).
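On that last point, here is a minimal sketch of how the intermediate file could be dropped: flatten the response JSON into rows and hand them straight to Spark. This is not sconnk's current API; `kairosdb_json_to_df` is a hypothetical helper, and it assumes the modern `SparkSession` entry point (the 2015-era code would have used `SQLContext`):

```python
from pyspark.sql import SparkSession

def kairosdb_json_to_df(spark, response):
    """Hypothetical helper: build a Spark DataFrame straight from a
    KairosDB query response, with no intermediate JSON file on disk."""
    rows = [
        (result["name"], int(ts), float(value))
        for query in response["queries"]
        for result in query["results"]
        for ts, value in result["values"]
    ]
    return spark.createDataFrame(rows, ["metric", "timestamp", "value"])

spark = SparkSession.builder.appName("sconnk-sketch").getOrCreate()
df = kairosdb_json_to_df(spark, json_data)  # json_data as returned by conn.query()
df.show()
```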