https://github.com/jianninapinto/bandersnatch

This project implements a machine learning model using Random Forest, XGBoost, and Support Vector Machines algorithms with oversampling and undersampling techniques to handle imbalanced classes for classification tasks in the context of predicting the rarity of monsters.
https://github.com/jianninapinto/bandersnatch

altair imbalanced-classification imblearn machine-learning mongodb oversampling pycharm-ide pymongo python random-forest-classifier scikit-learn smote support-vector-machines undersampling xgboost

Last synced: 3 months ago
JSON representation

Host: GitHub
URL: https://github.com/jianninapinto/bandersnatch
Owner: jianninapinto
Fork: true (BloomTech-Labs/BandersnatchStarter)
Created: 2023-06-06T15:28:28.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2023-08-22T19:28:05.000Z (over 1 year ago)
Last Synced: 2024-09-27T06:04:02.765Z (7 months ago)
Topics: altair, imbalanced-classification, imblearn, machine-learning, mongodb, oversampling, pycharm-ide, pymongo, python, random-forest-classifier, scikit-learn, smote, support-vector-machines, undersampling, xgboost
Language: Jupyter Notebook
Homepage:
Size: 581 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

# Bandersnatch Project

Read the Documentation for information on how to get started.

[Deployed App](https://bandersnatch.herokuapp.com)

### Tech Stack
- Logic: Python3
- API Framework: Flask
- Templates: Jinja2
- Structure: HTML5
- Styling: CSS3
- Database: MongoDB
- Graphs: Altair
- Machine Learning: Scikit
- Hosting: Heroku

### Provided Code
- HTML Templates
- CSS Styles
- API Framework
- Miscellaneous Helper Files
- Sprint Specific Documentation

### Primary Features by URL
- `/`: Splash Page
- `/data`: Tabular Data
- `/view`: Dynamic Visualizations
- `/model`: Interactive Machine Learning Model

### Primary Goals
For best results, complete each sprint in order, before going on to the next sprint.

1. Sprint 1: Database Operations
- Develop a database interface class
- Create random data
- Populate the database with at least 1000 datapoints
2. Sprint 2: Dynamic Visualizations
- Notebook exploration
- Chart function
- API integration
3. Sprint 3: Machine Learning Model
- Notebook exploration
- Machine Learning interface class
- Model serialization (save and open)
- API model integration

### Stretch Goals
- Use ElephantSQL instead of MongoDB
- Use Plotly instead of Altair
- Use PyTorch instead of Scikit
- Use FastAPI instead of Flask
- Add the ability for the user to reset & reseed the database
- Add the ability for the user to re-train the machine learning model
- Add the ability for the user to download a working serialized model and dataset
- Add authentication to sensitive pages
- Use a different set of features to train the model
- Use your own dataset entirely

### OS Specific Notes: Gunicorn is not Windows compatible!
- Windows users should not use the `run.sh` shell script, as it depends on gunicorn.
- Windows users should use `py -m app.main` to start the app with Flask acting as the server.
- Mac and Linux users can use `./run.sh` script or type the command directly `python3 -m gunicorn app.main:APP`.
- Feel free to modify the shell scripts to suit your needs, these are intended to run locally.
- In any case you should not modify the Procfile, this is the run script for the remote server.

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/jianninapinto/bandersnatch

Awesome Lists containing this project

README