Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/johnsesana/pyspark-better-show
A small function that displays Spark DataFrames in a notebook with a cleaner, pandas-like UI than `df.show()`
- Host: GitHub
- URL: https://github.com/johnsesana/pyspark-better-show
- Owner: JohnSesana
- Created: 2024-10-30T13:44:27.000Z (17 days ago)
- Default Branch: main
- Last Pushed: 2024-10-30T13:51:12.000Z (17 days ago)
- Last Synced: 2024-10-30T14:43:39.249Z (17 days ago)
- Topics: html, jupyter-notebook, pyspark
- Language: Python
- Size: 0 Bytes
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Better Show
A small function that renders the output of Spark's `.show()` in a cleaner, pandas-like UI in a notebook.
## [Code](https://github.com/JohnSesana/Better-Spark-Show/blob/main/better_show.py)
```python
from IPython.display import HTML


def better_show(df, num_rows=50):
    """
    Display a PySpark DataFrame as an HTML table in a Jupyter notebook.

    Parameters:
        df (DataFrame): The PySpark DataFrame to display.
        num_rows (int): Number of rows to display. Default is 50.
    """
    # Collect the specified number of rows as a list of Row objects
    rows = df.limit(num_rows).collect()

    # Create an HTML table string with column headers
    html = "<table><tr>" + "".join([f"<th>{col}</th>" for col in df.columns]) + "</tr>"

    # Add the rows to the table
    for row in rows:
        html += "<tr>" + "".join([f"<td>{value}</td>" for value in row]) + "</tr>"
    html += "</table>"

    # Return an HTML object, which the notebook renders as a table
    return HTML(html)
```

## Usage Example
```python
# Assumes an existing Spark session named `spark`
spark_df = spark.read.format("orc").load(your_table)  # Change the format as needed
better_show(spark_df, 15)  # Shows 15 rows; the default is 50
```
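
## Escaping Cell Values (Sketch)

Building the table by string concatenation means a cell containing characters such as `<` or `&` would be interpreted as markup by the notebook. The following is a minimal sketch of one way to handle that, not part of the repository: `rows_to_html_table` is a hypothetical helper that takes plain column names and row tuples, escapes every value with the standard library's `html.escape`, and renders `None` as an empty cell (an assumption made here for illustration).

```python
from html import escape


def rows_to_html_table(columns, rows):
    """Build an HTML table string from column names and row tuples.

    Hypothetical helper (not in the repository): it follows the same
    string-building approach as better_show but escapes every value.
    """
    header = "".join(f"<th>{escape(str(col))}</th>" for col in columns)
    parts = [f"<table><tr>{header}</tr>"]
    for row in rows:
        # Assumption: None becomes an empty cell; everything else is str()'d
        cells = "".join(
            f"<td>{escape('' if value is None else str(value))}</td>"
            for value in row
        )
        parts.append(f"<tr>{cells}</tr>")
    parts.append("</table>")
    return "".join(parts)


# Plain tuples stand in for collected Spark rows in this example
table = rows_to_html_table(["id", "note"], [(1, "a < b"), (2, None)])
print(table)
```

In a notebook, the string could be wrapped the same way as in `better_show`, for example `HTML(rows_to_html_table(df.columns, df.limit(15).collect()))`, since PySpark `Row` objects can be iterated like tuples.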