Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/danmayer/s3_image_converter
a project to learn some Clojure: it does conversion on large volumes of S3 images
https://github.com/danmayer/s3_image_converter
Last synced: 5 days ago
JSON representation
a project to learn some Clojure: it does conversion on large volumes of S3 images
- Host: GitHub
- URL: https://github.com/danmayer/s3_image_converter
- Owner: danmayer
- License: epl-1.0
- Created: 2016-11-13T17:18:31.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2018-05-17T02:46:40.000Z (over 6 years ago)
- Last Synced: 2025-01-05T23:16:40.993Z (19 days ago)
- Language: Clojure
- Homepage:
- Size: 906 KB
- Stars: 1
- Watchers: 2
- Forks: 1
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
Awesome Lists containing this project
README
# S3 Image Converter
A Clojure library designed to grab all images in S3 bucket and apply a transformation to them.
## Usage
Make sure you have your S3 credentials in `~/.aws/credentials`
`lein run`
with options:
`BUCKETNAME=mybucket lein run [cleanup, convert]`
clean: `BUCKETNAME=mybucket lein run --function clean --matcher _dan_test.jpg`
convert: `BUCKETNAME=mybucket lein run --function convert --matcher \.jpg`
convert pngs in directory: `BUCKETNAME=mybucket lein run --function convert --matcher \.png --prefix products`
## Running In Docker
* add your aws files to a gitignored directory
* copy from `~/.aws/` to `./deploy`
* `docker build -t danmayer/clojure-image-mapper:latest .`
* `docker run -it --rm --name running-image-mapper danmayer/clojure-image-mapper:latest`
To run with arguments and env variables:
`docker run -e BUCKETNAME=mybucket -it --rm --name running-image-mapper danmayer/clojure-image-mapper:latest lein run --function convert --matcher \.jpg`### dockerhub
* `docker login`
* `docker push danmayer/clojure-image-mapper`How to make your `Dockerfile` wrap this one with your credentials and do something useful, like say run the converter every 5min in a cron.
```
#~/my_project/depoy/image-cron
PATH=/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
*/5 * * * * cd /usr/src/app && BUCKETNAME=my_bucker lein run --function convert --matcher \.jpg --prefix products >> /var/log/cron.log 2>&1
``````
#~/my_project/Dockerfile
FROM danmayer/clojure-image-mapper:latest
MAINTAINER Dan Mayer# add credentials
ADD deploy/aws_credentials /root/.aws/credentials# Add crontab file in the cron directory
ADD deploy/image-cron /var/spool/cron/crontabs/root# Give execution rights on the cron job
RUN chmod 0644 /var/spool/cron/crontabs/root
# Create the log file to be able to run tail
RUN touch /var/log/cron.log
# Run the command on container startup
CMD crond -l 2 -f
```## Using webp-imageio
The Jar as well as macOS & Linux native library is included in the repo, if it is necessary to rebuild it, do the following:
```
brew install hg gradle cmake
hg clone https://bitbucket.org/luciad/webp-imageio
cd webp-imageio
gradle build -x test
cp ./build/libs/webp-imageio--SNAPSHOT.jar /resources/
# (update `:resource-paths` in project.clj)mkdir build
cd build
cmake ..
cmake --build .
cp src/main/c/* /native
```## Todo
* oneoff script to remove all webp or fix public read on existing ones?
* better way to handle AWS timeouts / throttling
* time based filter options
* more efficient S3 filter query (via CLI opposed to in app)
* don't write files work directly with IO-streams
* https://github.com/karls/collage/issues/8
* https://github.com/karls/collage/blob/master/src/fivetonine/collage/util.clj#L82
* https://github.com/weavejester/clj-aws-s3/blob/master/src/aws/sdk/s3.clj#L134
* contribute patch back to collage to make WebP work again
* https://github.com/karls/collage/blob/master/src/fivetonine/collage/util.clj#L35-L80
* https://bitbucket.org/luciad/webp-imageio/src/fde3644e6aa610f6a8d97c3d982a7c3926324ecf/src/javase/java/com/luciad/imageio/webp/WebPWriteParam.java?at=default&fileviewer=file-view-default#WebPWriteParam.java-28
* keep a count of how many images you converted
* see about maximizing perf
* increasing the buffers a lot to maybe 500 or 1000.
* I also misremembered the semantics of “pipeline-async” and I think you should try replacing them all with “pipeline-blocking” instead since all the IO is blocking.## License
Copyright © 2016 Dan Mayer and Ben Brinckerhoff
Distributed under the Eclipse Public License either version 1.0 or (at
your option) any later version.