https://github.com/simeonhristov99/kickstarter
Course project for "Data Mining" university course.
- Host: GitHub
- URL: https://github.com/simeonhristov99/kickstarter
- Owner: SimeonHristov99
- License: MIT
- Created: 2023-06-19T14:50:05.000Z
- Default Branch: main
- Last Pushed: 2023-06-29T11:16:02.000Z
- Last Synced: 2025-06-18T17:50:44.763Z
- Topics: catboost, classification, machine-learning, pandas, regression
- Language: Jupyter Notebook
- Homepage:
- Size: 177 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Kickstarter
## Goal
Course project for "Data Mining" university course.
## Data
- [Kaggle link](https://www.kaggle.com/datasets/iamsajanbhagat/kickstarter)
## Plan of Attack
- [X] Load data.
- [X] Data audit and information value analysis.
- [X] Feature engineering: derive durations from the date columns (in the data-loading notebook).
- [X] Finish univariate analysis.
- [X] PyCaret without text features.
- [X] Move missing value handling to first notebook.
- [X] Restructure code.
- [X] Save PyCaret model plots with performance analysis.
- [X] Build a regressor for the number of backers.
- [X] Add chi-squared (chi-sq) test.
- [X] Build a regressor for the log-transformed number of backers.
- [X] Evaluate model.
- [X] Use regressor and classifier on test set.
- [X] Experiment with a shallow (non-recurrent) neural network.
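
Two of the steps above, deriving durations from date columns and log-transforming the number of backers, can be illustrated with a minimal pandas sketch. This is not the code from the notebooks: the column names (`created_at`, `launched_at`, `deadline`, `backers_count`) and the toy values are assumptions chosen for illustration and may not match the Kaggle dataset.

```python
import numpy as np
import pandas as pd

# Toy data. Column names and values are hypothetical, not taken from the dataset.
df = pd.DataFrame({
    "created_at": ["2015-01-01", "2015-02-10"],
    "launched_at": ["2015-01-11", "2015-02-15"],
    "deadline": ["2015-02-10", "2015-03-17"],
    "backers_count": [0, 99],
})

# Parse the date columns so they can be subtracted from each other.
date_cols = ["created_at", "launched_at", "deadline"]
df[date_cols] = df[date_cols].apply(pd.to_datetime)

# Durations in whole days: time spent preparing and time the campaign ran.
df["prep_days"] = (df["launched_at"] - df["created_at"]).dt.days
df["campaign_days"] = (df["deadline"] - df["launched_at"]).dt.days

# log1p keeps projects with zero backers well defined (log1p(0) == 0).
df["log_backers"] = np.log1p(df["backers_count"])

# expm1 inverts log1p, mapping predictions back to the original scale.
recovered = np.expm1(df["log_backers"])

print(df[["prep_days", "campaign_days", "log_backers"]])
```

A regressor trained on `log_backers` predicts on the log scale, so its outputs would need `np.expm1` applied before being compared with raw backer counts.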
## Resources
- [Predicting the success of crowdfunding](https://cs230.stanford.edu/projects_spring_2018/reports/8289614.pdf)
- [PyCaret Classification API](https://pycaret.readthedocs.io/en/stable/api/classification.html)
- [PyCaret Regression API](https://pycaret.readthedocs.io/en/stable/api/regression.html)
- [PyCaret Workflow](https://towardsdatascience.com/introduction-to-binary-classification-with-pycaret-a37b3e89ad8d)
- [Box-Cox Transform](https://towardsdatascience.com/top-3-methods-for-handling-skewed-data-1334e0debf45)