https://github.com/hsm207/grab-safety
My submission for Grab AI for S.E.A. challenge
- Host: GitHub
- URL: https://github.com/hsm207/grab-safety
- Owner: hsm207
- License: gpl-3.0
- Created: 2019-06-03T15:31:45.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2019-06-03T15:31:53.000Z (about 7 years ago)
- Last Synced: 2025-06-01T12:45:59.749Z (about 1 year ago)
- Topics: databricks, databricks-notebooks, pyspark, spark-ml, telematics
- Language: Jupyter Notebook
- Size: 649 KB
- Stars: 0
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Introduction
This repository contains my solution to the Grab AI for S.E.A. [Safety Challenge](https://www.aiforsea.com/safety).
# Prerequisites
The [notebooks](/notebooks) were developed on Databricks with Python 3 and PySpark, using the 5.4 ML
Beta runtime (includes Apache Spark 2.4.0, Scala 2.11).
# Usage
Execute the notebooks in the following order, modifying directory and file names to match your environment:
* [01_data.ipynb](/notebooks/01_data.ipynb): This notebook downloads the dataset and saves it in parquet format
* [02_EDA.ipynb](/notebooks/02_EDA.ipynb): This notebook does some exploratory data analysis and removes outliers
* [03_feature_engineering.ipynb](/notebooks/03_feature_engineering.ipynb): This notebook computes features that will be used to build
models to classify a trip
* [04_modelling.ipynb](/notebooks/04_modelling.ipynb): This notebook builds and validates a few models
* [05_evaluation.ipynb](/notebooks/05_evaluation.ipynb): This notebook is meant to evaluate the best model found in the previous
notebook against the hold-out dataset
More details are available in each notebook.
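
The execution order above can also be scripted. The following is a minimal sketch, not part of the repository: `run_pipeline`, `NOTEBOOKS`, and the `base_dir` default are hypothetical names chosen for illustration, and the notebook paths assume the notebooks were imported into a workspace folder without the `.ipynb` extension. The runner is passed in as a callable so the ordering logic stays independent of Databricks.

```python
from typing import Callable, List

# Notebook names in the order listed under Usage.
# Assumption: imported notebooks are addressed without the .ipynb extension.
NOTEBOOKS = [
    "01_data",
    "02_EDA",
    "03_feature_engineering",
    "04_modelling",
    "05_evaluation",
]


def run_pipeline(
    run_notebook: Callable[[str], object],
    base_dir: str = "/notebooks",  # hypothetical workspace folder
) -> List[str]:
    """Run every notebook in order and return the paths that completed.

    Any exception raised by run_notebook propagates, so later notebooks
    are not run after an earlier one fails.
    """
    completed = []
    for name in NOTEBOOKS:
        path = f"{base_dir.rstrip('/')}/{name}"
        run_notebook(path)
        completed.append(path)
    return completed
```

On Databricks, one possible runner is `lambda path: dbutils.notebook.run(path, 3600)`, where `3600` is an arbitrary timeout in seconds; adjust `base_dir` to wherever the notebooks live in your workspace.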