Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/adgaudio/pyspark_pandas
Pyspark + pandas. This may get merged into the SparklingPandas project.
Last synced: 15 days ago
- Host: GitHub
- URL: https://github.com/adgaudio/pyspark_pandas
- Owner: adgaudio
- Created: 2014-08-24T17:28:41.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2014-10-15T21:32:36.000Z (about 10 years ago)
- Last Synced: 2024-10-03T11:12:04.487Z (about 1 month ago)
- Language: Python
- Size: 258 KB
- Stars: 6
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README
README
There is already an existing project,
[SparklingPandas](https://github.com/holdenk/sparklingpandas), that
integrates pandas and PySpark. Look at that project first, because this
one may get merged into it.

This project aims to provide useful tools and algorithms for distributing
pandas objects on Spark.

- Apache Spark is a fast and general engine for large-scale data
  processing. It is written in Scala and also supports Python via
  PySpark.
- Pandas is a library providing high-performance, easy-to-use data
  structures and data analysis tools for the Python programming language.
Did you check out
[SparklingPandas](https://github.com/holdenk/sparklingpandas)?