https://github.com/alireza-py/yolodatahelper
A Python tool for managing YOLO datasets for YOLOv5, YOLOv8, YOLOv11, and other Ultralytics-supported models. It streamlines dataset combination, data augmentation, class removal, and annotation visualization, and supports both bounding box and segmentation formats, which makes it useful for developers and researchers.
annotation-visualization data-augmentation dataset-combinations opencv ultralytics yolo-dataset yolov11 yolov5 yolov8
- Host: GitHub
- URL: https://github.com/alireza-py/yolodatahelper
- Owner: alireza-py
- Created: 2024-12-29T17:13:30.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2025-01-22T19:35:05.000Z (4 months ago)
- Last Synced: 2025-01-22T20:27:59.976Z (4 months ago)
- Topics: annotation-visualization, data-augmentation, dataset-combinations, opencv, ultralytics, yolo-dataset, yolov11, yolov5, yolov8
- Language: Python
- Homepage:
- Size: 5.92 MB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Welcome to YoloDataHelper
**YoloDataHelper** is a small Python utility for processing YOLO (You Only Look Once) datasets. It merges datasets, augments data, removes classes, visualizes annotations, and performs other operations that make YOLO datasets easier for developers and researchers to work with.
## Features
### 1. **Dataset Combination**
- Combine multiple YOLO datasets while properly aligning classes and adjusting label IDs.
- Retain the original structure of the datasets and generate a unified `data.yaml` file.
### 2. **Data Augmentation**
- Apply various transformations to YOLO dataset images, such as:
  - Hue, saturation, and brightness adjustments.
  - Contrast enhancement.
  - Adding random noise.
  - Color jittering.
- Generate augmented images with updated labels.
### 3. **Class Removal**
- Remove specific classes from the dataset and their associated images and labels.
- Automatically adjust class IDs and update the `data.yaml` file accordingly.
### 4. **Annotation Visualization**
- Display bounding boxes or segmentation masks over images for easy verification.
- Save annotated images to a specified output directory.
### 5. **Classes Equalization**
- Balance the number of images per class to ensure a uniform distribution.
- Adjust the dataset to prevent class imbalance issues.
### 6. **Dataset Validation**
- Ensure the presence of the necessary directories (`train`, `valid`, `test`) and their subfolders (`images`, `labels`).
- Automatically create any missing directories.
### 7. **Resize Options**
- Compression: resize images by compressing them to the target size.
- Advanced_compression: resize images using advanced compression.
- Crop: resize images by cropping them to the target size.
- Advanced_crop: resize images using advanced cropping.
---
## Installation
### Clone the Repository
To get started, first clone the repository:
```bash
git clone https://github.com/alireza-py/YoloDataHelper.git
cd YoloDataHelper
```
### Install Dependencies
Install the necessary dependencies using pip:
```bash
pip install -r requirements.txt
```
## How to Use
**Run Directly**
To use the tool as a standalone application, simply run the `main.py` file:
```bash
python main.py
```
### 1. Dataset Combination
Combine multiple YOLO datasets into a unified dataset:
```python
from YoloDatasetsTools import DatasetProcessor
datasets = ["path/to/dataset1", "path/to/dataset2"]
output_path = "path/to/combined_dataset"
processor = DatasetProcessor(output_path)
processor.combine_datasets(datasets)
```
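To illustrate what aligning classes and adjusting label IDs involves, here is a minimal standalone sketch. It is not the tool's own implementation: `build_class_mapping` and `remap_label_line` are hypothetical helper names, and the sketch assumes classes are matched purely by name.

```python
def build_class_mapping(class_lists):
    """Merge per-dataset class name lists into one unified list.

    Returns the unified list and, for each dataset, a dict that maps
    its old class IDs to IDs in the unified list.
    (Hypothetical helper, matching classes by name only.)
    """
    unified = []
    mappings = []
    for names in class_lists:
        mapping = {}
        for old_id, name in enumerate(names):
            if name not in unified:
                unified.append(name)
            mapping[old_id] = unified.index(name)
        mappings.append(mapping)
    return unified, mappings


def remap_label_line(line, mapping):
    """Rewrite the leading class ID of one YOLO label line."""
    parts = line.split()
    parts[0] = str(mapping[int(parts[0])])
    return " ".join(parts)


unified, mappings = build_class_mapping([["cat", "dog"], ["dog", "bird"]])
print(unified)  # ['cat', 'dog', 'bird']
print(remap_label_line("1 0.5 0.5 0.2 0.3", mappings[1]))  # 2 0.5 0.5 0.2 0.3
```

In the second dataset, `bird` had ID 1; after merging it becomes ID 2, so every label line from that dataset needs its first field rewritten.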
### 2. Data Augmentation
Apply data augmentation to a dataset:
```python
from YoloDatasetsTools import DatasetProcessor

output_path = "path/to/augmented_dataset"

# Value ranges for each transformation
augmentation_params = {
    'hue': (-10, 10),
    'saturation': (0.7, 1.3),
    'brightness': (0.7, 1.3),
    'contrast': (0.8, 1.2),
    'noise': (10, 50),
    'color_jitter': (0.9, 1.1)
}

processor = DatasetProcessor(output_path, augmentation_params=augmentation_params, multiplier=3)
processor.process_folder(input_folder="path/to/dataset")
```
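One plausible reading of these parameters is that each tuple is a `(low, high)` range from which a value is drawn for every augmented copy. The sketch below shows that interpretation only; how the tool actually uses the ranges and `multiplier` is an assumption here, and `sample_augmentation` is a hypothetical helper.

```python
import random


def sample_augmentation(params, rng):
    """Draw one concrete value from each (low, high) range.

    Hypothetical helper: assumes every parameter is a uniform range.
    """
    return {name: rng.uniform(low, high) for name, (low, high) in params.items()}


params = {"hue": (-10, 10), "brightness": (0.7, 1.3), "contrast": (0.8, 1.2)}
rng = random.Random(0)  # fixed seed so repeated runs give the same values
multiplier = 3  # assumed here to mean "augmented copies per source image"

variants = [sample_augmentation(params, rng) for _ in range(multiplier)]
for variant in variants:
    print(variant)
```

Because these transformations change only pixel colors, bounding boxes and segmentation polygons would keep the same coordinates in each augmented copy.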
### 4. Annotation Visualization
Visualize bounding boxes or segmentation masks:
```python
from YoloDatasetsTools import DatasetProcessor

output_path = "path/to/visualized_dataset"
processor = DatasetProcessor(output_path)
processor.visualize_annotations(dataset_folder="path/to/dataset")
```
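Drawing a box requires converting the normalized YOLO label (`class cx cy w h`, all relative to the image size) into pixel corners. The following is a minimal sketch of that conversion; `yolo_to_pixel_box` is a hypothetical helper and not part of the tool's API.

```python
def yolo_to_pixel_box(line, img_w, img_h):
    """Convert one YOLO bounding-box label line to pixel corner coordinates.

    Returns (class_id, (x1, y1, x2, y2)). Hypothetical helper for illustration.
    """
    class_id, cx, cy, w, h = line.split()[:5]
    # Scale normalized center and size to pixels
    cx, w = float(cx) * img_w, float(w) * img_w
    cy, h = float(cy) * img_h, float(h) * img_h
    x1, y1 = round(cx - w / 2), round(cy - h / 2)
    x2, y2 = round(cx + w / 2), round(cy + h / 2)
    return int(class_id), (x1, y1, x2, y2)


print(yolo_to_pixel_box("0 0.5 0.5 0.2 0.4", 640, 480))
# (0, (256, 144, 384, 336))
```

The resulting corners can be passed to a drawing call such as OpenCV's `cv2.rectangle`. Segmentation labels instead list polygon points as `x y` pairs after the class ID, which are scaled the same way.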
### 5. Classes Equalization
```python
from YoloDatasetsTools import DatasetCleaner

cleaner = DatasetCleaner(dataset_folder="path/to/dataset")
cleaner.classes_equalization(subset=["train", "valid", "test"])
```
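Before equalizing, it can help to see how unbalanced a dataset is. The sketch below counts how many images contain each class, using in-memory label lines instead of files; `images_per_class` is a hypothetical helper, and whether the tool balances by removing or adding images is not something this sketch assumes.

```python
from collections import Counter


def images_per_class(labels):
    """Count how many images contain each class ID.

    `labels` maps an image name to its list of YOLO label lines.
    Hypothetical helper for illustration.
    """
    counts = Counter()
    for lines in labels.values():
        # A class counts once per image, however many objects it has
        classes = {int(line.split()[0]) for line in lines if line.strip()}
        counts.update(classes)
    return counts


labels = {
    "a.jpg": ["0 0.5 0.5 0.2 0.2", "1 0.3 0.3 0.1 0.1"],
    "b.jpg": ["0 0.4 0.4 0.2 0.2"],
    "c.jpg": ["0 0.6 0.6 0.2 0.2"],
}
counts = images_per_class(labels)
print(dict(counts))          # {0: 3, 1: 1}
print(min(counts.values()))  # 1, the size of the rarest class
```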
### 6. Directory Validation
Ensure required directories (train, valid, test) and their subfolders exist:
```python
from YoloDatasetsTools import DatasetProcessor

dataset_path = "path/to/dataset"
processor = DatasetProcessor(dataset_path)
processor.ensure_dataset(dataset_path)
```
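The same check can be sketched with the standard library alone. This is an illustration of the idea, not the code behind `ensure_dataset`; `ensure_yolo_layout` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def ensure_yolo_layout(root, splits=("train", "valid", "test"),
                       subfolders=("images", "labels")):
    """Create any missing split/subfolder directories under `root`.

    Returns the list of directories that had to be created.
    Hypothetical helper for illustration.
    """
    created = []
    for split in splits:
        for sub in subfolders:
            folder = Path(root) / split / sub
            if not folder.exists():
                folder.mkdir(parents=True)
                created.append(folder)
    return created


# Demonstrate on a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    print(len(ensure_yolo_layout(tmp)))  # 6 directories created
    print(len(ensure_yolo_layout(tmp)))  # 0, everything already exists
```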
### 7. Resize Options
```python
from YoloDatasetsTools import DatasetProcessor

dataset_path = "path/to/dataset"
output_path = "path/to/output_dataset"
size = (720, 340)
mode = "advance_crop"

processor = DatasetProcessor(dataset_path)
processor.process_resize_and_crop(
dataset_path,
output_path,
size,
mode
)
```
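Resizing affects labels differently depending on the mode. Stretching an image to a new size leaves normalized YOLO coordinates unchanged, while cropping requires recomputing them relative to the crop window. The sketch below shows the crop case for one bounding box; `crop_box` is a hypothetical helper, and the tool's own crop modes may handle boxes differently.

```python
def crop_box(box, img_size, crop):
    """Re-express a normalized YOLO box relative to a crop window.

    box:      (cx, cy, w, h), normalized to the original image
    img_size: (width, height) of the original image in pixels
    crop:     (left, top, crop_width, crop_height) in pixels
    Returns the clipped, normalized box, or None if it lies outside the crop.
    Hypothetical helper for illustration.
    """
    img_w, img_h = img_size
    left, top, crop_w, crop_h = crop
    cx, cy, w, h = box
    # Box corners in pixels, shifted into the crop's coordinate frame
    x1 = (cx - w / 2) * img_w - left
    y1 = (cy - h / 2) * img_h - top
    x2 = (cx + w / 2) * img_w - left
    y2 = (cy + h / 2) * img_h - top
    # Clip to the crop window
    x1, y1 = max(x1, 0.0), max(y1, 0.0)
    x2, y2 = min(x2, crop_w), min(y2, crop_h)
    if x2 <= x1 or y2 <= y1:
        return None
    return ((x1 + x2) / 2 / crop_w, (y1 + y2) / 2 / crop_h,
            (x2 - x1) / crop_w, (y2 - y1) / crop_h)


# Center crop of an 800x400 image down to 400x400
print(crop_box((0.5, 0.5, 0.25, 0.5), (800, 400), (200, 0, 400, 400)))
# (0.5, 0.5, 0.5, 0.5)
```

The box keeps its pixel size but covers a larger share of the cropped image, so its normalized width doubles from 0.25 to 0.5.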
## Directory Structure
This tool assumes the following directory structure for YOLO datasets:
```
dataset/
├── train/
│   ├── images/
│   └── labels/
├── valid/
│   ├── images/
│   └── labels/
├── test/
│   ├── images/
│   └── labels/
└── data.yaml
```
The `data.yaml` file should include:
- `train`, `val`, and `test`: paths to the respective dataset splits.
- `nc`: the number of classes.
- `names`: a list of class names.
## Contributing
Contributions are welcome! If you'd like to contribute to YoloDataHelper, you can:
- Fork the repository.
- Create a new branch for your feature or bug fix.
- Commit your changes and push the branch.
- Open a pull request with a description of the changes.
- If you encounter any issues, feel free to open an issue in the repository.