An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with gcp-dataproc

A curated list of projects in awesome lists tagged with gcp-dataproc.

https://github.com/tansudasli/spark-sandbox

Apache Spark sandbox on GCP and Amazon EMR.

apache-spark aws-emr gcp-dataproc python

Last synced: 01 Mar 2025

https://github.com/snehadharne/bigdataanalytics-mvcollisions

Leveraging NYC Open Data, this repository contains Databricks notebooks for analyzing motor vehicle collisions. We perform EDA, spatial clustering, and predictive modeling on collision, vehicle, and person datasets to understand accident trends and predict potential risks.

data-wrangling eda gcp-dataproc mllib predictive-modeling pyspark pyspark-notebook scale-out scale-up spatial-data-analysis

Last synced: 02 Mar 2025