pySpark-connector-kairosdb
==========================

pySpark-connector-kairosdb provides an easy way to get data from KairosDB and make it available on Spark as a DataFrame. It's as simple as this:

```python
#!/usr/bin/env python

# sconnk stands for "py(S)park-(CONN)ector-(K)airosdb".
from sconnk import Connection, Dataframe

query_data = {
    "start_relative": {"value": "5", "unit": "years"},
    "metrics": [
        {"name": "test_", "limit": 5},
        {"name": "DP_058424", "limit": 10},
        {"name": "teste_gzip", "limit": 5},
        {"name": "DP_063321", "limit": 10},
    ],
}

# Create a connection to the KairosDB database at the given address.
conn = Connection("http://localhost:8080/")

# Run the query against KairosDB.
json_data = conn.query(query_data)

# Build a Dataframe object from the JSON returned by the KairosDB API.
df = Dataframe(json_data).df

# Print the resulting Spark DataFrame.
df.show()
```
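
Under the hood, querying KairosDB happens over its REST API. The sketch below shows roughly what such a query looks like as a raw HTTP call; it assumes the standard `/api/v1/datapoints/query` endpoint and the `requests` library, neither of which is part of this module's public API:

```python
import requests

# Illustrative only: roughly the HTTP request that a KairosDB query
# boils down to, independent of this module.
response = requests.post(
    "http://localhost:8080/api/v1/datapoints/query",
    json=query_data,
)
response.raise_for_status()

# KairosDB answers with a JSON document containing one entry per
# queried metric, each holding a list of [timestamp, value] pairs.
json_data = response.json()
```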

Keep in mind that this is an ALPHA module without solid documentation, examples, or even fully implemented features. We have a long road ahead.

Future
=========

This module is under development, and we have the following plans for its future:

* Write good documentation.
* Write tests - a lot of them.
* Add support for RDDs.
* Stop writing the query result to a temporary JSON file just to parse it in Spark (a possible direct approach is sketched below).
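
For the last item, one way to avoid the intermediate JSON file is to flatten KairosDB's query response in memory and hand the rows straight to Spark. The following sketch assumes the documented KairosDB response layout (a `queries` list whose `results` carry `[timestamp, value]` pairs) and a `SparkSession`; the helper name `kairosdb_json_to_df` is hypothetical and not part of this module:

```python
from pyspark.sql import Row, SparkSession

spark = SparkSession.builder.appName("sconnk-sketch").getOrCreate()


def kairosdb_json_to_df(json_data):
    """Flatten a KairosDB query response into a Spark DataFrame
    without touching the filesystem (illustrative sketch)."""
    rows = []
    for query in json_data.get("queries", []):
        for result in query.get("results", []):
            metric = result.get("name")
            for timestamp, value in result.get("values", []):
                rows.append(Row(metric=metric, timestamp=timestamp, value=value))
    return spark.createDataFrame(rows)


df = kairosdb_json_to_df(json_data)
df.show()
```

Building the DataFrame from in-memory rows keeps the whole round trip (KairosDB response to Spark DataFrame) inside the driver, with no filesystem I/O.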