https://github.com/ritzvik/oversampling-techniques

data machine-learning machine-learning-algorithms oversampling python r smote smote-ipf smoteipf spider spider2


Host: GitHub
URL: https://github.com/ritzvik/oversampling-techniques
Owner: ritzvik
Created: 2018-10-04T17:25:57.000Z (almost 7 years ago)
Default Branch: master
Last Pushed: 2019-02-01T14:59:18.000Z (over 6 years ago)
Last Synced: 2024-12-29T12:43:36.586Z (7 months ago)
Topics: data, machine-learning, machine-learning-algorithms, oversampling, python, r, smote, smote-ipf, smoteipf, spider, spider2
Language: Python
Size: 64.5 KB
Stars: 4
Watchers: 0
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# oversampling-techniques

## SMOTE
The input file must be in CSV format, with a header row and a Yes/No class column. The header of the Yes/No column must be named 'Class', and its entries must be either 'Y' or 'N'.
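The format rules above can be checked before running any of the scripts. The following is a minimal sketch, not part of the repository: the function name `check_input`, the feature column names `f1` and `f2`, and the sample data are hypothetical. The `yn_column` argument is 1-based, like the 'YNcolumn' variable in the scripts.

```python
import csv
import io


def check_input(csv_text, yn_column):
    """Return a list of problems found in the CSV text (empty list if none).

    yn_column is the 1-based position of the Yes/No column.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    header, data = rows[0], rows[1:]
    if not 1 <= yn_column <= len(header):
        return [f"column {yn_column} does not exist"]
    problems = []
    idx = yn_column - 1  # convert to Python's 0-based index
    if header[idx] != "Class":
        problems.append(
            f"header of column {yn_column} is {header[idx]!r}, expected 'Class'"
        )
    # Data lines start at line 2 because line 1 is the header.
    for line_no, row in enumerate(data, start=2):
        if len(row) <= idx or row[idx] not in ("Y", "N"):
            problems.append(f"line {line_no}: class entry must be 'Y' or 'N'")
    return problems


# Hypothetical sample data, for illustration only.
good = "f1,f2,Class\n1.0,2.0,Y\n3.0,4.0,N\n"
bad = "f1,f2,Label\n1.0,2.0,yes\n"
print(check_input(good, 3))
print(check_input(bad, 3))
```

To check a real file, read its contents first, for example `check_input(open("sample.csv").read(), 3)`, adjusting the column number to match the file.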

To change k for the k-nearest-neighbours (KNN) step of SMOTE, edit the variable 'k_knn' in smote.r. Set the 'YNcolumn' variable to the column number of the Yes/No column. Column numbers start from 1.

File usage: Rscript smote.r
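For readers unfamiliar with how k affects SMOTE, here is a minimal Python sketch of the general technique. It is not the code in smote.r, and the function name `smote_samples` and its parameters `n_new` and `seed` are hypothetical. Each synthetic point is placed at a random position on the line segment between a minority-class point and one of its `k_knn` nearest minority-class neighbours.

```python
import math
import random


def smote_samples(minority, k_knn, n_new, seed=0):
    """Generate n_new synthetic points from a list of minority-class vectors.

    Requires at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k_knn]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class points, for illustration only.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
for point in smote_samples(minority, k_knn=2, n_new=3):
    print(point)
```

A larger `k_knn` lets synthetic points spread between more distant neighbours, while a smaller value keeps them close to existing minority points.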

## SMOTE-IPF
The input file must be in CSV format, with a header row and a Yes/No class column. The header of the Yes/No column must be named 'Class', and its entries must be either 'Y' or 'N'.

To change k for the k-nearest-neighbours (KNN) step of SMOTE, edit the variable 'k_knn' in smoteipf.r. Set the 'YNcolumn' variable to the column number of the Yes/No column. Column numbers start from 1.

The variables 'n', 'k', 'voting' and 'p' can also be changed. For information on these variables, see: https://www.sciencedirect.com/science/article/pii/S0020025514008561

File usage: Rscript smoteipf.r
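The 'voting' variable relates to the filtering step of SMOTE-IPF. As a rough illustration, assuming the scheme from the linked paper in which several classifiers each vote on whether an example is mislabelled, the sketch below flags noisy examples under a consensus or a majority rule. It is not the code in smoteipf.r: the function name `noisy_indices` and the string values 'consensus' and 'majority' are assumptions, so check the script for the values it actually accepts.

```python
def noisy_indices(true_labels, predictions, voting):
    """Return the indices of examples flagged as noisy.

    predictions[j][i] is the label that classifier j assigns to example i.
    """
    flagged = []
    n_classifiers = len(predictions)
    for i, label in enumerate(true_labels):
        # Count how many classifiers disagree with the recorded label.
        wrong = sum(1 for preds in predictions if preds[i] != label)
        if voting == "consensus":
            is_noisy = wrong == n_classifiers  # every classifier disagrees
        elif voting == "majority":
            is_noisy = wrong > n_classifiers / 2  # more than half disagree
        else:
            raise ValueError("voting must be 'consensus' or 'majority'")
        if is_noisy:
            flagged.append(i)
    return flagged


# Hypothetical labels and predictions from three classifiers.
labels = ["Y", "N", "Y"]
preds = [
    ["Y", "Y", "N"],
    ["N", "Y", "N"],
    ["Y", "Y", "Y"],
]
print(noisy_indices(labels, preds, "consensus"))
print(noisy_indices(labels, preds, "majority"))
```

Consensus voting is the more conservative choice, since it removes an example only when no classifier agrees with its label.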

## SPIDER2
The input file must be in CSV format, with a header row and a Yes/No class column. The header of the Yes/No column must be named 'Class', and its entries must be either 'Y' or 'N'.

To change k for the k-nearest-neighbours (KNN) step, edit the variable 'k' in spider.py. Set the 'YNcolumn' variable to the column number of the Yes/No column. Column numbers start from 1.

Other variables can be changed below the comment "#change below parameters according to requirment". For information on these variables, see: https://link.springer.com/chapter/10.1007/978-3-642-13529-3_18

File usage: python3 spider.py

Output: k-r-a0.csv, k-r-a1.csv and k-r-a2.csv, where a0, a1 and a2 represent no amplification, weak amplification and strong amplification respectively.

#### A sample file named 'sample.csv' is uploaded for reference.
