https://github.com/andreazoccatelli/tabular_data_augmentation_continuous
This repository contains the scripts used for my master's degree thesis project: "Augmentation of tabular data with continuous features for binary imbalanced classification problems"
- Host: GitHub
- URL: https://github.com/andreazoccatelli/tabular_data_augmentation_continuous
- Owner: AndreaZoccatelli
- Created: 2023-07-01T10:39:21.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-07-09T16:24:17.000Z (about 2 years ago)
- Last Synced: 2025-01-17T18:25:52.295Z (9 months ago)
- Topics: cgan, copula, data-augmentation, imbalanced-classification, imbalanced-data, imbalanced-learning
- Language: Jupyter Notebook
- Homepage:
- Size: 669 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Augmentation of tabular data with continuous features for binary imbalanced classification problems
The aim of this project is to augment the minority-class observations using copula sampling and conditional GANs, in order to improve classifier performance on binary imbalanced classification problems.
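As a rough illustration of the copula-sampling idea, the sketch below fits a Gaussian copula with empirical marginals to the minority-class rows and draws synthetic rows from it. It is not GenCopula's API and not necessarily the method used in the thesis: the function names, the choice of a Gaussian copula, and the standard-library-only implementation are all assumptions made for this example. It also assumes that no two features are perfectly collinear.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def to_normal_scores(column):
    """Map one feature to normal scores through its empirical ranks."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # rank / (n + 1) stays strictly inside (0, 1), as inv_cdf requires
        scores[i] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def correlation(a, b):
    """Pearson correlation of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def cholesky(m):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(m[i][i] - s)
            else:
                low[i][j] = (m[i][j] - s) / low[j][j]
    return low


def empirical_quantile(sorted_col, u):
    """Linearly interpolated empirical quantile for u in [0, 1]."""
    pos = u * (len(sorted_col) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_col) - 1)
    return sorted_col[lo] + (pos - lo) * (sorted_col[hi] - sorted_col[lo])


def sample_gaussian_copula(rows, n_new, seed=0):
    """Draw n_new synthetic rows that mimic the dependence structure of rows."""
    rng = random.Random(seed)
    cols = [list(c) for c in zip(*rows)]
    d = len(cols)
    # Dependence structure: correlation of the normal scores of each feature
    scores = [to_normal_scores(c) for c in cols]
    corr = [[correlation(scores[i], scores[j]) for j in range(d)] for i in range(d)]
    low = cholesky(corr)
    sorted_cols = [sorted(c) for c in cols]
    synthetic = []
    for _ in range(n_new):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Correlate the independent draws, then map them to uniforms
        corr_z = [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(d)]
        u = [STD_NORMAL.cdf(v) for v in corr_z]
        # Back-transform each uniform through the empirical marginal
        synthetic.append([empirical_quantile(sorted_cols[j], u[j]) for j in range(d)])
    return synthetic
```

Because the marginals are empirical quantiles, every synthetic value stays within the observed range of its feature. A parametric or kernel-smoothed marginal would be needed to extrapolate beyond that range.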
- For the augmentation based on copulas, my library, GenCopula, was used.
``` r
library(devtools)
install_github("AndreaZoccatelli/GenCopula")
```
- The library used for the augmentation based on cGAN is CTGAN.
- To re-create the datasets used in the project, run Create_data.ipynb
- These notebooks report the results obtained on the different datasets:
  - Best case
  - 20-30% Safe
  - Less 20% Safe
  - 10% Minority
  - 5% Minority
  - 4 Features
  - 8 Features
  - Default
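Whichever generator is used, the augmentation step comes down to deciding how many synthetic minority rows to add. The sketch below shows one possible way to do that, assuming the goal is a chosen minority share of the final data set. The helper names `synthetic_rows_needed` and `augment_minority`, the `target_share` parameter, and the `sampler` callable are hypothetical and not taken from the notebooks, which may balance the classes differently.

```python
import math


def synthetic_rows_needed(n_minority, n_majority, target_share=0.5):
    """Synthetic minority rows needed so the minority reaches target_share.

    Solves (n_minority + k) / (n_minority + k + n_majority) = target_share for k.
    """
    if not 0 < target_share < 1:
        raise ValueError("target_share must be strictly between 0 and 1")
    needed = target_share * n_majority / (1 - target_share) - n_minority
    return max(0, math.ceil(needed))


def augment_minority(minority_rows, n_majority, sampler, target_share=0.5):
    """Append synthetic rows produced by sampler(n) to the minority rows.

    sampler is any callable that returns n synthetic rows (hypothetical
    interface), for example a wrapper around a fitted generator.
    """
    k = synthetic_rows_needed(len(minority_rows), n_majority, target_share)
    return list(minority_rows) + list(sampler(k))
```

For example, with 50 minority and 950 majority rows (a 5% minority), a 50/50 split requires 900 synthetic rows. The `sampler` argument could wrap the `sample` method of a fitted CTGAN model or a copula-based sampler, so the same bookkeeping serves both augmentation strategies.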