https://github.com/raideno/data-augmentation

Python package to easily augment your existing datasets in memory by applying transformations.

Host: GitHub
URL: https://github.com/raideno/data-augmentation
Owner: raideno
Created: 2025-03-09T20:48:25.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-03-09T21:51:40.000Z (over 1 year ago)
Last Synced: 2025-08-03T06:25:39.398Z (10 months ago)
Language: Python
Homepage:
Size: 6.84 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

> 🚨 **WARNING: This package is still under development and NOT ready for production use!** 🚨

# AugmentDataset

`AugmentDataset` is a Python class that extends a dataset with augmented versions of selected samples. It applies random augmentations and lets you control the probability that a sample is augmented, the maximum number of transformations applied to each sample, and the number of augmented samples generated from each original sample.
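
As a rough illustration of the idea only (this is not the package's actual implementation, which the README does not show), a wrapper of this kind could be sketched as below. The class name `SketchAugmentedView`, the `seed` argument, the default values, and the indexing scheme (originals first, augmented copies appended after them) are all assumptions made for this example.

```python
import random


class SketchAugmentedView:
    """Hypothetical illustration; not the real AugmentDataset."""

    def __init__(self, dataset, transforms, probabilities,
                 probability_to_augment=0.5, max_transforms_per_sample=1,
                 augmentations_per_sample=1, seed=0):
        self.dataset = dataset
        self.transforms = transforms
        self.probabilities = probabilities
        self.max_transforms_per_sample = max_transforms_per_sample
        self.rng = random.Random(seed)  # assumed: seeded for reproducibility

        # Decide once which original samples receive augmented copies.
        # Each selected index is stored augmentations_per_sample times.
        self.augmented = []
        for index in range(len(dataset)):
            if self.rng.random() < probability_to_augment:
                self.augmented.extend([index] * augmentations_per_sample)

    def __len__(self):
        # Original samples plus all augmented copies.
        return len(self.dataset) + len(self.augmented)

    def __getitem__(self, index):
        # Indices below len(dataset) map to the untouched originals.
        if index < len(self.dataset):
            return self.dataset[index]
        # Remaining indices map to augmented copies of a selected original.
        sample = self.dataset[self.augmented[index - len(self.dataset)]]
        # Draw between 1 and max_transforms_per_sample transforms,
        # weighted by the given probabilities, and apply them in order.
        count = self.rng.randint(1, self.max_transforms_per_sample)
        chosen = self.rng.choices(
            self.transforms, weights=self.probabilities, k=count
        )
        for transform in chosen:
            sample = transform(sample)
        return sample
```

In this sketch the choice of which samples get augmented copies is made once in the constructor, while the transforms are drawn at access time, so repeated reads of the same augmented index can differ. Whether the package behaves this way is not stated in the README.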

## Installation

```bash
pip install git+https://github.com/raideno/data-augmentation.git
```

## Usage

### Example

```python
from data_augmentation import AugmentDataset


# Define your dataset (e.g., a PyTorch Dataset or a custom dataset)
class MyDataset:
    def __init__(self):
        # Toy in-memory samples for illustration; replace with your own data
        self.samples = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]

    def __getitem__(self, index):
        # Return your data sample here
        return self.samples[index]

    def __len__(self):
        # Return the length of your dataset
        return len(self.samples)


# Create your dataset object
dataset = MyDataset()


# Define your augmentations (e.g., simple functions like flipping, rotation, etc.)
def rotate(sample):
    # Placeholder: rotate the sample by 90 degrees here
    return sample


def flip(sample):
    # Placeholder: flip the sample horizontally here
    return sample


transforms = [rotate, flip]

# Define your AugmentDataset
augment_dataset = AugmentDataset(
    dataset=dataset,
    probability_to_augment=0.5,    # 50% chance of augmentation per sample
    transforms=transforms,         # List of transformations to apply
    probabilities=[0.7, 0.3],      # Probabilities for each transformation
    max_transforms_per_sample=2,   # Apply a maximum of 2 transformations per sample
    augmentations_per_sample=3,    # Create 3 augmented versions per sample
)

# Access original or augmented samples
sample = augment_dataset[0]  # Get the first sample (original or augmented)
```
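
The `rotate` and `flip` functions in the example above are placeholders that return the sample unchanged. Assuming each sample is a small 2D grid stored as a list of lists (an assumption for this sketch; the package does not prescribe a sample format), they could be filled in roughly like this:

```python
def rotate(sample):
    """Rotate a 2D list of lists by 90 degrees clockwise."""
    # Reversing the rows and then transposing gives a clockwise rotation.
    return [list(row) for row in zip(*sample[::-1])]


def flip(sample):
    """Flip a 2D list of lists horizontally (mirror each row)."""
    return [row[::-1] for row in sample]


transforms = [rotate, flip]
```

Both functions return a new list and leave the input untouched, which matters when the same original sample is used to produce several augmented versions.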
