https://github.com/farhad-here/image-dataset-splitter

A simple Streamlit app that splits a zipped image dataset into train, validation, and test folders automatically.

# 📁 Image Dataset Splitter

This Streamlit app splits any labeled image dataset into `train`, `validation`, and `test` sets with a single click. Upload a `.zip` file in which each folder represents a class, and the app generates a downloadable zip with the organized dataset structure.

---

## 🚀 Features

- ✅ Upload a `.zip` file with folders of images (each folder is treated as a class)
- ✅ Automatically split into `train`, `val`, and `test` folders
- ✅ Random shuffling of images for fair distribution
- ✅ Download the final dataset as a ready-to-use `.zip`
- ✅ Clean UI with Streamlit
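
The shuffle-and-split step listed above can be sketched as follows. This is a minimal illustration, not the code in `app.py`: the function name `split_files`, the `seed` parameter, and the rounding rule are assumptions made for the example. Only the default ratios `(0.7, 0.15, 0.15)` come from the Configuration section of this README.

```python
import random


def split_files(files, ratios=(0.7, 0.15, 0.15), seed=None):
    """Shuffle one class's files and cut them into train/val/test lists.

    Hypothetical helper for illustration; not taken from app.py.
    """
    files = list(files)                  # work on a copy, leave the input untouched
    random.Random(seed).shuffle(files)   # seeded shuffle so the split is repeatable
    n_train = round(len(files) * ratios[0])
    n_val = round(len(files) * ratios[1])
    return {
        "train": files[:n_train],
        "val": files[n_train:n_train + n_val],
        "test": files[n_train + n_val:],  # the remainder absorbs any rounding
    }


names = [f"img_{i:03d}.jpg" for i in range(100)]
parts = split_files(names, seed=0)
print({split: len(items) for split, items in parts.items()})
# {'train': 70, 'val': 15, 'test': 15}
```

Applying a helper like this to each class folder separately would keep every class represented in all three sets. Whether the app splits per class or over the whole dataset is not stated in this README.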

---

## 📂 Expected Input Format

Your input `.zip` file should contain one folder per class, with that class's images inside it.

Each folder is interpreted as a separate class label.
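
To check an archive against this layout before uploading, something like the sketch below may help. It is not part of the app: the function name `classes_in_zip`, the `IMAGE_EXTS` set, and the rule that an image's class is its immediate parent folder are all assumptions made for the example.

```python
import io
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath

# Assumed list of image extensions; the app may accept a different set.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}


def classes_in_zip(zip_bytes):
    """Map each class folder name to the image members found inside it."""
    classes = defaultdict(list)
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            path = PurePosixPath(name)
            if name.endswith("/") or path.suffix.lower() not in IMAGE_EXTS:
                continue  # skip directory entries and non-image files
            if len(path.parts) < 2:
                continue  # an image at the archive root has no class folder
            classes[path.parent.name].append(name)
    return dict(classes)


# Build a tiny in-memory archive to demonstrate the expected layout.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for member in ("cats/a.jpg", "cats/b.png", "dogs/c.jpg", "notes.txt"):
        zf.writestr(member, b"")

print(classes_in_zip(buf.getvalue()))
# {'cats': ['cats/a.jpg', 'cats/b.png'], 'dogs': ['dogs/c.jpg']}
```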

---

## 🧾 Output Format

After splitting, the app generates a `.zip` file containing `train`, `val`, and `test` folders.
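
One way such an archive could be assembled is sketched below. The `<split>/<class>/<file>` layout is an assumption (it is the layout most image-folder loaders expect, but this README does not confirm it), and `build_split_zip` and its `assignment` argument are hypothetical names, not the app's own code.

```python
import io
import zipfile
from pathlib import PurePosixPath


def build_split_zip(src_bytes, assignment):
    """Copy members of a source zip into <split>/<class>/<file> paths.

    `assignment` maps a split name ("train", "val", "test") to a list of
    member names in the source archive. Hypothetical helper for illustration.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(src_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for split, members in assignment.items():
            for member in members:
                path = PurePosixPath(member)
                # The class label is taken from the immediate parent folder.
                target = f"{split}/{path.parent.name}/{path.name}"
                dst.writestr(target, src.read(member))
    return out.getvalue()


# Demonstration with a small in-memory source archive.
source = io.BytesIO()
with zipfile.ZipFile(source, "w") as zf:
    for member in ("cats/a.jpg", "cats/b.jpg", "dogs/c.jpg"):
        zf.writestr(member, b"fake image bytes")

result = build_split_zip(
    source.getvalue(),
    {"train": ["cats/a.jpg", "dogs/c.jpg"], "val": ["cats/b.jpg"], "test": []},
)
with zipfile.ZipFile(io.BytesIO(result)) as zf:
    print(sorted(zf.namelist()))
# ['train/cats/a.jpg', 'train/dogs/c.jpg', 'val/cats/b.jpg']
```

The returned bytes could then be offered to the user as a download.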

---

## ⚙️ Configuration

You can change the train/val/test split ratios by editing this line in `app.py`:

```python
# (train, val, test) fractions
SPLIT_RATIO = (0.7, 0.15, 0.15)
```

To run the app locally, clone the repository, install Streamlit, and start the app:

```bash
git clone https://github.com/your-username/image-dataset-splitter.git
cd image-dataset-splitter
pip install streamlit
streamlit run app.py
```
---

## 📌 Use Cases
- Preparing image datasets for machine learning / deep learning

- Organizing animal or object classification datasets

- Creating train/val/test splits without writing custom scripts