https://github.com/raideno/data-augmentation
Python package to easily augment your existing datasets in memory by applying transformations.
- Host: GitHub
- URL: https://github.com/raideno/data-augmentation
- Owner: raideno
- Created: 2025-03-09T20:48:25.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-09T21:51:40.000Z (over 1 year ago)
- Last Synced: 2025-08-03T06:25:39.398Z (10 months ago)
- Language: Python
- Homepage:
- Size: 6.84 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
> 🚨 **WARNING: This package is still under development and NOT ready for production use!** 🚨
# AugmentDataset
`AugmentDataset` is a Python class that extends a dataset with augmented versions of selected samples. It applies random augmentations and lets you control the probability that a sample is augmented, the maximum number of transformations applied to it, and the number of augmented samples generated from each original sample.
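To make these controls concrete, here is a minimal sketch of one way such parameters could interact. It is not the package's implementation: the helper `augment_once` and its sampling scheme (a coin flip, then a weighted draw of up to `max_transforms_per_sample` transforms) are assumptions made purely for illustration.

```python
import random


def augment_once(sample, transforms, probabilities,
                 probability_to_augment, max_transforms_per_sample, rng):
    """Hypothetical helper: return `sample`, possibly transformed."""
    # Coin flip: leave the sample untouched with probability 1 - p.
    if rng.random() >= probability_to_augment:
        return sample
    # Choose how many transforms to chain (at least one, at most the cap).
    count = rng.randint(1, max_transforms_per_sample)
    # Weighted draw with replacement (an assumption; the package may differ).
    chosen = rng.choices(transforms, weights=probabilities, k=count)
    for transform in chosen:
        sample = transform(sample)
    return sample


def double(x):
    return x * 2


def negate(x):
    return -x


rng = random.Random(0)  # seeded so runs are reproducible
results = [
    augment_once(3, [double, negate], [0.7, 0.3], 0.5, 2, rng)
    for _ in range(5)
]
```

Under this reading, `probability_to_augment=0.0` always returns the original sample, while `1.0` always applies at least one transform.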
## Installation
```bash
pip install git+https://github.com/raideno/data-augmentation.git
```
## Usage
### Example
```python
from data_augmentation import AugmentDataset


# Define your dataset (e.g., a PyTorch Dataset or a custom dataset)
class MyDataset:
    def __getitem__(self, index):
        # Return your data sample here
        pass

    def __len__(self):
        # Return the length of your dataset
        pass


# Create your dataset object
dataset = MyDataset()


# Define your augmentations (e.g., simple functions like flipping, rotation, etc.)
def rotate(sample):
    # Rotate the sample by 90 degrees
    return sample


def flip(sample):
    # Flip the sample horizontally
    return sample


transforms = [rotate, flip]

# Define your AugmentDataset
augment_dataset = AugmentDataset(
    dataset=dataset,
    probability_to_augment=0.5,   # 50% chance of augmentation per sample
    transforms=transforms,        # List of transformations to apply
    probabilities=[0.7, 0.3],     # Probabilities for each transformation
    max_transforms_per_sample=2,  # Apply a maximum of 2 transformations per sample
    augmentations_per_sample=3,   # Create 3 augmented versions per sample
)

# Access original or augmented samples
sample = augment_dataset[0]  # Get the first sample (original or augmented)
```
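The placeholders in the example above can be filled in many ways. As a hedged illustration, the sketch below uses small 2D grids stored as nested lists. `GridDataset` and these bodies for `rotate` and `flip` are hypothetical and not part of the package; that `AugmentDataset` accepts any object with `__getitem__` and `__len__` is an assumption based on the "custom dataset" comment in the example.

```python
class GridDataset:
    """Hypothetical in-memory dataset of 2D grids (nested lists)."""

    def __init__(self, grids):
        self.grids = list(grids)

    def __getitem__(self, index):
        return self.grids[index]

    def __len__(self):
        return len(self.grids)


def rotate(sample):
    """Rotate a 2D grid by 90 degrees clockwise, returning a new grid."""
    # Reverse the row order, then transpose.
    return [list(row) for row in zip(*sample[::-1])]


def flip(sample):
    """Flip a 2D grid horizontally by mirroring each row."""
    return [row[::-1] for row in sample]


dataset = GridDataset([
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
])
```

Both transforms return new lists rather than modifying their input, so the original samples stay intact. They can be passed to the constructor as `dataset=dataset` and `transforms=[rotate, flip]`.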