https://github.com/adgaudio/pyspark_pandas

Pyspark + pandas. This may get merged into the SparklingPandas project.

An existing project,
[SparklingPandas](https://github.com/holdenk/sparklingpandas), already
integrates pandas and PySpark. Look at that project first, since this
one may get merged into it. This project aims to provide
useful tools and algorithms for distributing pandas objects on Spark.

- Apache Spark is a fast and general engine for large-scale data
processing. It is written in Scala and also supports Python via
PySpark.

- Pandas is a library providing high-performance and easy-to-use data structures and data analysis tools for the Python programming language.
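
To make the idea of distributing pandas objects on Spark concrete, here is a minimal sketch. It is not this project's API or SparklingPandas's. The names `split_frame`, `map_chunks`, and `add_total` are hypothetical, and the approach (split a DataFrame row-wise, apply a function to each piece, concatenate the results) is only one simple way to do it. When a `SparkContext` is passed in, the pieces are shipped through `parallelize`, `map`, and `collect`; without one, the same logic runs locally.

```python
import numpy as np
import pandas as pd


def split_frame(df, n_chunks):
    """Split a DataFrame into at most n_chunks row-wise pieces (hypothetical helper)."""
    bounds = np.linspace(0, len(df), n_chunks + 1).astype(int)
    return [df.iloc[a:b] for a, b in zip(bounds[:-1], bounds[1:]) if b > a]


def map_chunks(chunks, func, sc=None):
    """Apply func to every chunk and concatenate the results (hypothetical helper).

    If sc (assumed to be a pyspark.SparkContext) is given, each chunk becomes
    one Spark partition; otherwise the chunks are processed in this process.
    Assumes chunks is non-empty.
    """
    if sc is not None:
        results = sc.parallelize(chunks, len(chunks)).map(func).collect()
    else:
        results = [func(chunk) for chunk in chunks]
    return pd.concat(results)


def add_total(chunk):
    # Illustrative per-chunk transformation: add a column derived from two others.
    return chunk.assign(total=chunk["a"] + chunk["b"])


df = pd.DataFrame({"a": range(6), "b": range(10, 16)})
result = map_chunks(split_frame(df, 3), add_total)
```

With PySpark installed, the same call could be written as `map_chunks(split_frame(df, 3), add_total, sc=sc)` for an existing `SparkContext` named `sc`. This only suits functions that work independently on each chunk; anything needing data from several chunks (a global sort, a group-by across chunks) requires a shuffle or a second aggregation step.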

Did you check out
[SparklingPandas](https://github.com/holdenk/sparklingpandas)?