https://github.com/poacosta/image-bulk-tinyfier

A high-performance Python utility for parallel image processing at scale. Efficiently resizes and optimizes thousands of images while preserving directory structures, with real-time progress tracking and comprehensive analytics.

Host: GitHub
URL: https://github.com/poacosta/image-bulk-tinyfier
Owner: poacosta
License: mit
Created: 2025-03-12T18:29:18.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-03-12T18:35:06.000Z (over 1 year ago)
Last Synced: 2025-03-12T19:33:27.978Z (over 1 year ago)
Topics: csv-reading, files-folders, image-processing, pillow, python
Language: Python
Homepage:
Size: 8.79 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.MD
- License: LICENSE


# 🖼️ image-bulk-tinyfier

## A Nimble Solution for Mass Image Processing

This Python script tackles the surprisingly complex challenge of batch processing thousands of images while preserving
directory structures. With parallel processing capabilities and comprehensive logging, it transforms what would be days
of manual work into a streamlined, automated workflow.

## Core Functionality

At its essence, image-bulk-tinyfier delivers four crucial capabilities:

- Consumes a CSV file containing relative image paths

- Processes each image with customizable parameters (resize, optimize, format conversion)

- Faithfully recreates directory hierarchies at the destination

- Provides detailed analytics on processing outcomes and efficiency gains

What makes it particularly valuable is the balance between simplicity and power – a single-dependency tool that scales
from modest batches to enterprise-level image libraries.
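
The path-mirroring and resize steps listed above could look roughly like this. It is a minimal sketch, not the project's actual code: `destination_for` and `process_image` are hypothetical names, and the 600-pixel bounds and quality of 80 simply echo the defaults listed under Command Options.

```python
from pathlib import Path


def destination_for(relative_path: str, dest_root: Path) -> Path:
    """Map a relative path from the CSV onto the destination root."""
    return dest_root / Path(relative_path)


def process_image(relative_path: str, source_root: Path, dest_root: Path,
                  max_size: tuple = (600, 600), quality: int = 80) -> Path:
    """Resize one image and save it under the mirrored destination path."""
    # Imported lazily so the path helper above works without Pillow installed.
    from PIL import Image

    src = source_root / relative_path
    dst = destination_for(relative_path, dest_root)
    dst.parent.mkdir(parents=True, exist_ok=True)  # recreate the hierarchy

    with Image.open(src) as img:
        # thumbnail() shrinks in place, keeps the aspect ratio, never enlarges.
        img.thumbnail(max_size)
        img.save(dst, quality=quality, optimize=True)
    return dst
```

Because `thumbnail()` never enlarges, images already smaller than the bounds keep their original dimensions in this sketch.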

## The Feedback Experience

What sets this tool apart is its communication layer. Rather than the traditional black-box approach of many processing
utilities, image-bulk-tinyfier keeps you informed with:

```
[=================>                   ] 42% 420/1000 | 35.2 img/s | 12m elapsed | ETA: 16m
```
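
A progress line in this style can be built from three numbers: images done, total images, and elapsed seconds. The function below is an illustrative sketch only; `format_progress` is a hypothetical name, and it derives the rate and ETA from the overall average, which may not match how the tool computes them.

```python
def format_progress(done: int, total: int, elapsed_s: float, width: int = 36) -> str:
    """Render a one-line progress bar with average rate and a naive ETA."""
    fraction = done / total if total else 1.0
    filled = int(fraction * width)
    # Draw '=' for completed work, a '>' head, and pad the rest with spaces.
    bar = ("=" * filled + ">")[:width].ljust(width)
    rate = done / elapsed_s if elapsed_s > 0 else 0.0
    eta_s = (total - done) / rate if rate > 0 else 0.0
    return (f"[{bar}] {fraction:.0%} {done}/{total} | {rate:.1f} img/s | "
            f"{elapsed_s / 60:.0f}m elapsed | ETA: {eta_s / 60:.0f}m")
```

Printing the result with `end="\r"` would redraw the same terminal line on each update.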

After completion, you're presented with an actionable summary:

```
================================================================================
PROCESSING SUMMARY
================================================================================
Total images processed:     1000
Successfully processed:     998 (99.8%)
Failed:                     2 (0.2%)
Total processing time:      432.1 seconds
Average processing speed:   2.31 images/second
Original size:              1256.32 MB
Processed size:             312.57 MB
Storage saved:              943.75 MB (75.1%)
FAILED IMAGES:
--------------------------------------------------------------------------------
1. path/to/corrupted_image.jpg: Error processing path/to/corrupted_image.jpg: cannot identify image file
2. another/path/broken.png: Error processing another/path/broken.png: broken data stream when reading image file
Check the log file for complete details: processing_log.json
```
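
The percentages and rates in this summary follow from a handful of raw counters. As a sketch of that arithmetic (the `summarize` function and its dictionary keys are hypothetical, and it assumes a non-zero total, duration, and original size):

```python
def summarize(total: int, failed: int, seconds: float,
              original_mb: float, processed_mb: float) -> dict:
    """Derive the summary figures from raw counters."""
    succeeded = total - failed
    saved_mb = original_mb - processed_mb
    return {
        "success_pct": round(100 * succeeded / total, 1),
        "failed_pct": round(100 * failed / total, 1),
        "images_per_second": round(total / seconds, 2),
        "saved_mb": round(saved_mb, 2),
        "saved_pct": round(100 * saved_mb / original_mb, 1),
    }
```

Feeding in the sample run's counters (1000 images, 2 failures, 432.1 seconds, 1256.32 MB in, 312.57 MB out) reproduces the figures shown in the summary.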

## Technical Requirements & Setup

- Python 3.8+

- Pillow library (the only dependency)

```bash
# Clone the repository
git clone https://github.com/poacosta/image-bulk-tinyfier.git
cd image-bulk-tinyfier

# Set up a virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependency
pip install pillow
```

## Command Options

The tool offers a range of parameters to customize your processing pipeline:

| Option          | Purpose                        | Default             |
|-----------------|--------------------------------|---------------------|
| `--csv`         | CSV with image paths           | *Required*          |
| `--source`      | Original image location        | *Required*          |
| `--dest`        | Processed image destination    | *Required*          |
| `--workers`     | Parallel processing threads    | 8                   |
| `--max-width`   | Maximum width after resizing   | 600                 |
| `--max-height`  | Maximum height after resizing  | 600                 |
| `--quality`     | JPEG quality (0-100)           | 80                  |
| `--log-file`    | Processing log path            | processing_log.json |
| `--no-progress` | Disable progress visualization | False               |
| `--debug`       | Enable verbose logging         | False               |
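
For readers who want to see how such options map onto code, here is one way the table could be declared with `argparse`. This is a sketch under assumptions: the flag names and defaults come from the table, but `build_parser` is a hypothetical helper and the script's real parser may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the options from the table above (illustrative only)."""
    parser = argparse.ArgumentParser(description="Bulk image resizer (sketch)")
    parser.add_argument("--csv", required=True, help="CSV with image paths")
    parser.add_argument("--source", required=True, help="Original image location")
    parser.add_argument("--dest", required=True, help="Processed image destination")
    parser.add_argument("--workers", type=int, default=8, help="Parallel threads")
    parser.add_argument("--max-width", type=int, default=600)
    parser.add_argument("--max-height", type=int, default=600)
    parser.add_argument("--quality", type=int, default=80, help="JPEG quality")
    parser.add_argument("--log-file", default="processing_log.json")
    parser.add_argument("--no-progress", action="store_true")
    parser.add_argument("--debug", action="store_true")
    return parser
```

`argparse` turns dashes into underscores, so `--max-width` is read back as `args.max_width`.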

## Performance Considerations

The tool's architecture is designed for flexibility across varied computing environments:

- **CPU Optimization**: Thread count automatically scales to your system's capabilities

- **Memory Efficiency**: Processes images in controlled batches to avoid memory pressure

- **Storage Awareness**: Optimized for both HDD and SSD configurations

- **Scale Testing**: Successfully deployed on datasets exceeding 400,000 images
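
How the tool scales its thread count and sizes its batches is not shown in this README. As a rough illustration of both ideas, the sketch below uses two hypothetical helpers: `default_workers` mirrors the `min(32, cpu_count + 4)` heuristic that `ThreadPoolExecutor` applies to its own default in Python 3.8, and `batched` splits a stream of paths into fixed-size chunks.

```python
import os
from itertools import islice
from typing import Iterable, Iterator, List


def default_workers() -> int:
    """Pick a thread count from the number of CPUs reported by the OS."""
    return min(32, (os.cpu_count() or 1) + 4)


def batched(paths: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield lists of at most `size` paths, consuming the input lazily."""
    iterator = iter(paths)
    while batch := list(islice(iterator, size)):
        yield batch
```

Handing one batch at a time to the worker pool bounds how many images are in flight, regardless of how long the CSV is.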

## Technical Implementation

What might appear simple on the surface is backed by thoughtful engineering:

- Concurrent processing via ThreadPoolExecutor

- Comprehensive exception handling for robust operation

- Type-annotated codebase maintaining 9.5+ Pylint score

- Modular architecture with clear separation of concerns

The JSON log output provides a complete audit trail that can be parsed for integration with other tools or reporting
systems.
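
The combination of `ThreadPoolExecutor`, per-image exception handling, and a JSON log can be sketched as follows. The function names and the log layout (`succeeded` and `failed` keys) are assumptions made for illustration; the actual schema of `processing_log.json` is not documented here.

```python
import json
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, List


def run_all(paths: Iterable[str], worker: Callable[[str], None],
            max_workers: int = 8) -> Dict[str, List]:
    """Run `worker` on every path concurrently, recording each outcome."""
    log: Dict[str, List] = {"succeeded": [], "failed": []}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                future.result()  # re-raises any exception from the worker
                log["succeeded"].append(path)
            except Exception as exc:  # one bad image must not stop the run
                log["failed"].append({"path": path, "error": str(exc)})
    return log


def write_log(log: Dict[str, List], log_file: str = "processing_log.json") -> None:
    """Persist the outcome log as indented JSON."""
    with open(log_file, "w", encoding="utf-8") as handle:
        json.dump(log, handle, indent=2)
```

Catching exceptions at the point where each future is collected keeps a single corrupted file from aborting the whole run, which matches the failed-image list in the summary above.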

## CSV Input Format

The CSV format is intentionally minimalist – one relative path per line:

```
product/primary/item5562.jpg
marketing/banners/summer_promo.jpg
user/avatars/default.png
```
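
Reading a file in this shape takes only the standard `csv` module. The sketch below assumes there is no header row and that only the first column matters; `read_paths` is a hypothetical helper name.

```python
import csv
from typing import Iterable, List


def read_paths(lines: Iterable[str]) -> List[str]:
    """Return the first column of each non-empty row, stripped of whitespace."""
    return [row[0].strip() for row in csv.reader(lines) if row and row[0].strip()]
```

With a real file, the call would look like `read_paths(open("images.csv", newline="", encoding="utf-8"))`, ideally inside a `with` block so the file is closed afterwards.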

## License

MIT License - See [LICENSE](LICENSE) for details.

---

*Built by someone who briefly contemplated manually resizing 400,000 images before coming to their senses. The
structured logging emerged from the realization that knowing what happened is often as important as making it happen.
Sometimes the most practical tools come from staring at a mountain of work and thinking "there must be a better way."*
