
https://github.com/dev-vivekkumarverma/pyspark-databricks

spark, databricks, kafka, batch and stream-processing

airflow batch-processing csv databricks delta-tables distributed-computing etl-pipeline file-formats json kafka medallion-architecture parquet pyspark python3 s3 spark stream-processing unity-catalog watermarking


# pyspark-databricks

Topics covered:
- PySpark
- Databricks
- Kafka
- file formats: CSV, JSON, Parquet, Delta, text
- batch data processing
- stream-processing
- S3, ADLS Gen2
- Unity Catalog
- watermarking
- query optimization
- AQE (adaptive query execution)
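
As a rough illustration of the AQE topic above, here is a minimal sketch of how adaptive query execution is typically switched on for a session. The three configuration keys are standard Spark SQL settings; the `AQE_SETTINGS` dict and the `apply_aqe_settings` helper are hypothetical names made up for this example and are not taken from this repository.

```python
# Standard Spark SQL configuration keys related to adaptive query execution.
# The dict name and the helper below are illustrative, not part of this repo.
AQE_SETTINGS = {
    # Re-optimize query plans at runtime using shuffle statistics.
    "spark.sql.adaptive.enabled": "true",
    # Merge small shuffle partitions after a stage finishes.
    "spark.sql.adaptive.coalescePartitions.enabled": "true",
    # Split heavily skewed partitions in sort-merge joins.
    "spark.sql.adaptive.skewJoin.enabled": "true",
}


def apply_aqe_settings(spark, settings=AQE_SETTINGS):
    """Set each AQE option on an object exposing `conf.set(key, value)`.

    A SparkSession provides this through `spark.conf.set`. The same
    object is returned so the call can be chained.
    """
    for key, value in settings.items():
        spark.conf.set(key, value)
    return spark
```

With a live session this would be called as `apply_aqe_settings(SparkSession.builder.getOrCreate())`. On Spark 3.2 and later, `spark.sql.adaptive.enabled` is already `true` by default, so the helper mainly makes the intent explicit in a notebook or job.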