Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/jianninapinto/bandersnatch
This project implements a machine learning model using Random Forest, XGBoost, and Support Vector Machines algorithms with oversampling and undersampling techniques to handle imbalanced classes for classification tasks in the context of predicting the rarity of monsters.
altair imbalanced-classification imblearn machine-learning mongodb oversampling pycharm-ide pymongo python random-forest-classifier scikit-learn smote support-vector-machines undersampling xgboost
- Host: GitHub
- URL: https://github.com/jianninapinto/bandersnatch
- Owner: jianninapinto
- Fork: true (BloomTech-Labs/BandersnatchStarter)
- Created: 2023-06-06T15:28:28.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-08-22T19:28:05.000Z (over 1 year ago)
- Last Synced: 2024-09-27T06:04:02.765Z (4 months ago)
- Topics: altair, imbalanced-classification, imblearn, machine-learning, mongodb, oversampling, pycharm-ide, pymongo, python, random-forest-classifier, scikit-learn, smote, support-vector-machines, undersampling, xgboost
- Language: Jupyter Notebook
- Homepage:
- Size: 581 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Bandersnatch Project
Read the Documentation for information on how to get started.
[Deployed App](https://bandersnatch.herokuapp.com)
### Tech Stack
- Logic: Python3
- API Framework: Flask
- Templates: Jinja2
- Structure: HTML5
- Styling: CSS3
- Database: MongoDB
- Graphs: Altair
- Machine Learning: scikit-learn
- Hosting: Heroku

### Provided Code
- HTML Templates
- CSS Styles
- API Framework
- Miscellaneous Helper Files
- Sprint Specific Documentation

### Primary Features by URL
- `/`: Splash Page
- `/data`: Tabular Data
- `/view`: Dynamic Visualizations
- `/model`: Interactive Machine Learning Model

### Primary Goals
For best results, complete each sprint in order before going on to the next sprint.

1. Sprint 1: Database Operations
   - Develop a database interface class
   - Create random data
   - Populate the database with at least 1000 datapoints
2. Sprint 2: Dynamic Visualizations
   - Notebook exploration
   - Chart function
   - API integration
3. Sprint 3: Machine Learning Model
   - Notebook exploration
   - Machine Learning interface class
   - Model serialization (save and open)
   - API model integration

### Stretch Goals
- Use ElephantSQL instead of MongoDB
- Use Plotly instead of Altair
- Use PyTorch instead of scikit-learn
- Use FastAPI instead of Flask
- Add the ability for the user to reset & reseed the database
- Add the ability for the user to re-train the machine learning model
- Add the ability for the user to download a working serialized model and dataset
- Add authentication to sensitive pages
- Use a different set of features to train the model
- Use your own dataset entirely

### OS Specific Notes: Gunicorn is not Windows compatible!
- Windows users should not use the `run.sh` shell script, as it depends on gunicorn.
- Windows users should use `py -m app.main` to start the app with Flask acting as the server.
- Mac and Linux users can use the `./run.sh` script or run the command directly: `python3 -m gunicorn app.main:APP`.
- Feel free to modify the shell scripts to suit your needs; they are intended to run locally.
- In any case, do not modify the Procfile; it is the run script for the remote server.
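
The OS-specific start commands above could also be wrapped in one small cross-platform launcher. The following is only a sketch and is not part of the provided code: the function names `launch_command` and `main` are hypothetical, and it assumes the script is run from the repository root so that the `app` package can be found.

```python
import platform
import subprocess
from typing import List


def launch_command(system: str) -> List[str]:
    """Return the start command for an OS name as reported by platform.system()."""
    if system == "Windows":
        # Gunicorn is not Windows compatible, so Flask acts as the server.
        return ["py", "-m", "app.main"]
    # Mac ("Darwin") and Linux: the direct gunicorn command from the notes above.
    return ["python3", "-m", "gunicorn", "app.main:APP"]


def main() -> int:
    """Start the app locally and return the server process's exit code."""
    return subprocess.call(launch_command(platform.system()))
```

Calling `main()` from the repository root would start the local server with whichever command fits the current OS. Like the shell scripts, this is meant for local use only and leaves the Procfile untouched.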