https://github.com/timgasser/acm_imbalanced_learning
Slides and code for the ACM Imbalanced Learning talk on 27th April 2016
- Host: GitHub
- URL: https://github.com/timgasser/acm_imbalanced_learning
- Owner: timgasser
- Created: 2016-04-27T03:18:30.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2016-04-27T23:47:04.000Z (over 8 years ago)
- Last Synced: 2024-02-13T05:05:29.493Z (11 months ago)
- Language: Jupyter Notebook
- Size: 22.8 MB
- Stars: 5
- Watchers: 1
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-imbalanced-learning - acm_imbalanced_learning - slides and code for the ACM Imbalanced Learning talk on 27th April 2016 in Austin, TX. (3.2 Github Repositories / 3.2.3 *Slides*)
README
# acm_imbalanced_learning
This repo contains slides and code for the [ACM Imbalanced Learning talk](http://www.meetup.com/Austin-ACM-SIGKDD/events/230200840/) on 27th April 2016 in Austin, TX.
## File listing
The files in the repo are listed below, with an explanation of what they're used for.
* ```acm_imbalance_algorithms.ipynb``` - Jupyter notebook that trains scikit-learn classifiers on the Kaggle dataset.
* ```acm_imbalance_datasets.{pdf, pptx}``` - PowerPoint presentation explaining the dataset processing and the algorithms.
* ```acm_imbalance_sampling.ipynb``` - Jupyter notebook with a set of routines to pre-process imbalanced data.
* ```acm_imbalanced_dataset.R``` - R script that uses the 'unbalanced' package to pre-process the data and reduce class imbalance.
* ```datasets.zip``` - A zip file containing the datasets used in the talk. These are listed below:
  * ```cs-training.csv``` - Training data from the Kaggle 'Give Me Some Credit' competition.
  * ```cs-test.csv``` - Test data from the Kaggle competition.
  * ```sampleEntry.csv``` - Sample entry format for the Kaggle competition.
  * ```cs-training-{CNN, OSS, smote, tomek}.csv``` - Processed training data (generated by ```acm_imbalanced_dataset.R```) using the algorithm named in each filename.

## Feedback
For any comments, questions, or feedback, please submit a pull request!
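
## Example: Tomek link removal
The file listing mentions training data processed with Tomek links. As a rough illustration of that idea, here is a minimal pure-Python sketch. It is not the code from this repo (the R script relies on the 'unbalanced' package), and the function names, the toy points, and the labels are all hypothetical. A Tomek link is taken here to be a pair of mutual nearest neighbours with different class labels; the sketch drops the majority-class member of each such pair.

```python
import math


def nearest_neighbour(i, points):
    """Return the index of the point closest to points[i], excluding itself."""
    best_j, best_d = None, math.inf
    for j, p in enumerate(points):
        if j == i:
            continue
        d = math.dist(points[i], p)  # Euclidean distance (Python 3.8+)
        if d < best_d:
            best_j, best_d = j, d
    return best_j


def remove_tomek_links(points, labels, majority_label):
    """Drop majority-class points that take part in a Tomek link.

    Hypothetical helper for illustration only: brute-force O(n^2) search,
    suitable for toy data rather than the full Kaggle training set.
    """
    nn = [nearest_neighbour(i, points) for i in range(len(points))]
    drop = set()
    for i, j in enumerate(nn):
        # Tomek link: i and j are each other's nearest neighbour
        # and carry different class labels.
        if j is not None and nn[j] == i and labels[i] != labels[j]:
            drop.add(i if labels[i] == majority_label else j)
    keep = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in keep], [labels[k] for k in keep]


# Toy data (made up): class 0 is the majority, class 1 the minority.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0), (5.0, 5.0), (5.0, 5.2)]
labels = [0, 0, 0, 1, 0, 0]

kept_points, kept_labels = remove_tomek_links(points, labels, majority_label=0)
print(kept_labels)  # [0, 0, 1, 0, 0]
```

In the toy data, the points `(1.0, 1.0)` and `(1.1, 1.0)` form the only Tomek link, so the majority-class point `(1.0, 1.0)` is removed and the minority point is kept.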