{"id":26376569,"url":"https://github.com/poacosta/image-bulk-tinyfier","last_synced_at":"2025-03-17T03:18:05.256Z","repository":{"id":282094078,"uuid":"947469832","full_name":"poacosta/image-bulk-tinyfier","owner":"poacosta","description":"A high-performance Python utility for parallel image processing at scale. Efficiently resizes and optimizes thousands of images while preserving directory structures, with real-time progress tracking and comprehensive analytics.","archived":false,"fork":false,"pushed_at":"2025-03-12T18:35:06.000Z","size":9,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-12T19:33:27.978Z","etag":null,"topics":["csv-reading","files-folders","image-processing","pillow","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/poacosta.png","metadata":{"files":{"readme":"README.MD","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-12T18:29:18.000Z","updated_at":"2025-03-12T18:36:50.000Z","dependencies_parsed_at":"2025-03-12T19:33:31.039Z","dependency_job_id":"fcd63c35-d0a8-474d-8985-2114f9242364","html_url":"https://github.com/poacosta/image-bulk-tinyfier","commit_stats":null,"previous_names":["poacosta/image-bulk-tinyfier"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poacosta%2Fimage-bulk-tinyfier","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poacosta%2Fimage-bulk-tinyfier/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poacosta%2Fimage-bulk-tinyfier/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/poacosta%2Fimage-bulk-tinyfier/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/poacosta","download_url":"https://codeload.github.com/poacosta/image-bulk-tinyfier/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":243965764,"owners_count":20375920,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv-reading","files-folders","image-processing","pillow","python"],"created_at":"2025-03-17T03:18:04.757Z","updated_at":"2025-03-17T03:18:05.251Z","avatar_url":"https://github.com/poacosta.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🖼️ image-bulk-tinyfier\n\n## A Nimble Solution for Mass Image Processing\n\nThis Python script tackles the surprisingly complex challenge of batch processing thousands of images while preserving\ndirectory structures. With parallel processing capabilities and comprehensive logging, it transforms what would be days\nof manual work into a streamlined, automated workflow.\n\n## Core Functionality\n\nAt its essence, image-bulk-tinyfier delivers four crucial capabilities:\n\n- Consumes a CSV file containing relative image paths\n- Processes each image with customizable parameters (resize, optimize, format conversion)\n- Faithfully recreates directory hierarchies at the destination\n- Provides detailed analytics on processing outcomes and efficiency gains\n\nWhat makes it particularly valuable is the balance between simplicity and power – a single-dependency tool that scales\nfrom modest batches to enterprise-level image libraries.\n\n## The Feedback Experience\n\nWhat sets this tool apart is its communication layer. Rather than the traditional black-box approach of many processing\nutilities, image-bulk-tinyfier keeps you informed with:\n\n```\n[=================\u003e                   ] 42% 420/1000 | 35.2 img/s | 12m elapsed | ETA: 16m\n```\n\nAfter completion, you're presented with an actionable summary:\n\n```\n================================================================================\nPROCESSING SUMMARY\n================================================================================\nTotal images processed:     1000\nSuccessfully processed:     998 (99.8%)\nFailed:                     2 (0.2%)\nTotal processing time:      432.1 seconds\nAverage processing speed:   2.31 images/second\n\nOriginal size:              1256.32 MB\nProcessed size:             312.57 MB\nStorage saved:              943.75 MB (75.1%)\n\nFAILED IMAGES:\n--------------------------------------------------------------------------------\n1. path/to/corrupted_image.jpg: Error processing path/to/corrupted_image.jpg: cannot identify image file\n2. another/path/broken.png: Error processing another/path/broken.png: broken data stream when reading image file\n\nCheck the log file for complete details: processing_log.json\n```\n\n## Technical Requirements \u0026 Setup\n\n- Python 3.8+\n- Pillow library (the only dependency)\n\n```bash\n# Clone the repository\ngit clone https://github.com/poacosta/image-bulk-tinyfier.git\ncd image-bulk-tinyfier\n\n# Set up a virtual environment (recommended)\npython -m venv venv\nsource venv/bin/activate  # On Windows: venv\\Scripts\\activate\n\n# Install dependency\npip install pillow\n```\n\n## Command Options\n\nThe tool offers a range of parameters to customize your processing pipeline:\n\n| Option          | Purpose                        | Default             |\n|-----------------|--------------------------------|---------------------|\n| `--csv`         | CSV with image paths           | *Required*          |\n| `--source`      | Original image location        | *Required*          |\n| `--dest`        | Processed image destination    | *Required*          |\n| `--workers`     | Parallel processing threads    | 8                   |\n| `--max-width`   | Maximum width after resizing   | 600                 |\n| `--max-height`  | Maximum height after resizing  | 600                 |\n| `--quality`     | JPEG quality (0-100)           | 80                  |\n| `--log-file`    | Processing log path            | processing_log.json |\n| `--no-progress` | Disable progress visualization | False               |\n| `--debug`       | Enable verbose logging         | False               |\n\n## Performance Considerations\n\nThe tool's architecture is designed for flexibility across varied computing environments:\n\n- **CPU Optimization**: Thread count automatically scales to your system's capabilities\n- **Memory Efficiency**: Processes images in controlled batches to avoid memory pressure\n- **Storage Awareness**: Optimized for both HDD and SSD configurations\n- **Scale Testing**: Successfully deployed on datasets exceeding 400,000 images\n\n## Technical Implementation\n\nWhat might appear simple on the surface is backed by thoughtful engineering:\n\n- Concurrent processing via ThreadPoolExecutor\n- Comprehensive exception handling for robust operation\n- Type-annotated codebase maintaining 9.5+ Pylint score\n- Modular architecture with clear separation of concerns\n\nThe JSON log output provides a complete audit trail that can be parsed for integration with other tools or reporting\nsystems.\n\n## CSV Input Format\n\nThe CSV format is intentionally minimalist – one relative path per line:\n\n```\nproduct/primary/item5562.jpg\nmarketing/banners/summer_promo.jpg\nuser/avatars/default.png\n```\n\n## License\n\nMIT License - See [LICENSE](LICENSE) for details.\n\n---\n\n*Built by someone who briefly contemplated manually resizing 400,000 images before coming to their senses. The\nstructured logging emerged from the realization that knowing what happened is often as important as making it happen.\nSometimes the most practical tools come from staring at a mountain of work and thinking \"there must be a better way.\"*","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpoacosta%2Fimage-bulk-tinyfier","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpoacosta%2Fimage-bulk-tinyfier","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpoacosta%2Fimage-bulk-tinyfier/lists"}