{"id":19687144,"url":"https://github.com/trinitronx/redshift-mega-maid","last_synced_at":"2026-02-07T16:32:03.906Z","repository":{"id":136206338,"uuid":"52896702","full_name":"trinitronx/redshift-mega-maid","owner":"trinitronx","description":"Docker container \u0026 wrapper script for analyze-vacuum-schema from awslabs/amazon-redshift-utils","archived":false,"fork":false,"pushed_at":"2017-01-25T21:56:52.000Z","size":2194,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-07-23T02:46:08.768Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Makefile","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/trinitronx.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-03-01T17:40:55.000Z","updated_at":"2020-02-02T21:55:06.000Z","dependencies_parsed_at":null,"dependency_job_id":"8874c74a-04e0-48a5-983f-4e97508460a6","html_url":"https://github.com/trinitronx/redshift-mega-maid","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/trinitronx/redshift-mega-maid","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trinitronx%2Fredshift-mega-maid","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trinitronx%2Fredshift-mega-maid/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trinitronx%2Fredshift-mega-maid/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/trinitronx%2Fredshift-mega-maid/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/trinitronx","download_url":"https://codeload.github.com/trinitronx/redshift-mega-maid/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/trinitronx%2Fredshift-mega-maid/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29199770,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-07T16:28:23.579Z","status":"ssl_error","status_checked_at":"2026-02-07T16:28:22.566Z","response_time":63,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-11T18:33:14.762Z","updated_at":"2026-02-07T16:32:03.884Z","avatar_url":"https://github.com/trinitronx.png","language":"Makefile","funding_links":[],"categories":[],"sub_categories":[],"readme":"RedShift Mega Maid\n==================\n[![GitHub stars](https://img.shields.io/github/stars/ReturnPath/redshift-mega-maid.svg?style=social\u0026label=Star)](https://github.com/ReturnPath/redshift-mega-maid)\n[![GitHub watchers](https://img.shields.io/github/watchers/ReturnPath/redshift-mega-maid.svg?style=social\u0026label=Watch)](https://github.com/ReturnPath/redshift-mega-maid)\n[![GitHub 
forks](https://img.shields.io/github/forks/ReturnPath/redshift-mega-maid.svg?style=social\u0026label=Fork)](https://github.com/ReturnPath/redshift-mega-maid)\n[![GitHub issues](https://img.shields.io/github/issues/ReturnPath/redshift-mega-maid.svg)](https://github.com/ReturnPath/redshift-mega-maid/issues)\n[![Docker Pulls](https://img.shields.io/docker/pulls/returnpath/redshift-mega-maid.svg)](https://hub.docker.com/r/returnpath/redshift-mega-maid)\n[![Docker Stars](https://img.shields.io/docker/stars/returnpath/redshift-mega-maid.svg)](https://hub.docker.com/r/returnpath/redshift-mega-maid)\n[![Image Layers](https://badge.imagelayers.io/returnpath/redshift-mega-maid:latest.svg)](https://imagelayers.io/?images=returnpath/redshift-mega-maid:latest 'Get your own badge on imagelayers.io')\n\n![RedShift Mega Maid](https://raw.githubusercontent.com/ReturnPath/redshift-mega-maid/master/mega-maid.gif)\n\nThis project is a simple Docker container built from [awslabs/amazon-redshift-utils][1].  Its purpose is to `VACUUM` and `ANALYZE` a RedShift Cluster as a background job.\n\n## Development\n\nThis project really depends on [awslabs/amazon-redshift-utils][1].  If you want to fix bugs \u0026 make changes to the script, fork that repo \u0026 send Pull Requests upstream!\n\n## Building\n\nThis container uses a single `Makefile` for ease of building \u0026 deployment.  It also uses a basic \"Build Tools\" container that includes the standard gcc + make toolchain.  
This should make it easy to run in any CI environment that supports Docker.\n\n## `make container-build`\n\nThis spins up a clean Docker container, mounting in the project directory, and calls `make clean build`.\n\n## `make build`\n\nThis runs `docker build` to install the script's dependencies \u0026 build the container.\n\n## `make package`\n\nThis uses `Dockerfile` to build the releasable Docker container.\n\n## `make deploy`\n\nThis will execute `aws ecr get-login`, then push the container to the [EC2 Container Services Registry (ECR)](https://console.aws.amazon.com/ecs/home?region=us-east-1#/repositories).  You are responsible for setting up ECR in your own AWS account.\n\n# Deploy\n\nWe're tagging our images based on date and git hash, and also with `latest`.  The container should now be in the ECR (you can get the list of images in ECR with `aws ecr list-images --repository-name redshift-mega-maid`).\n\n# Running\n\nTODO: Kubernetes Deployment option\n\nThe container takes environment variables to pass to the `analyze-vacuum-schema.py` script as arguments.  **Most** are specified in `Dockerfile` as defaults but can be overridden.  As a general rule, the environment variable names follow the convention: `$MM_ARGUMENT_NAME`, where the corresponding command line argument would be `--argument-name`.  
Prefix `MM_` to the argument name, while replacing all dashes (`-`) with underscores (`_`).\n\nRequired environment variables are:\n\n - `MM_DB_NAME`\n - `MM_DB_USER`\n - `MM_DB_PASS`\n - `MM_DB_HOST`\n - `MM_DB_SCHEMA`\n - `MM_DB_TABLE`\n\nYou may run the docker container with:\n\n    docker run --rm -e MM_DB_NAME=your_db_name -e MM_DB_USER=your_db_user \\\n                    -e MM_DB_PASS=your_db_pass -e MM_DB_HOST=aaa.us-west-2.redshift.amazonaws.com \\\n                    -e MM_DB_SCHEMA=your_db_schema -e MM_DB_TABLE=your_db_table \\\n               returnpath/redshift-mega-maid:latest\n\n# Working with the Container\n\n## Logs\n\nWithin the container the logs are simply printed to stdout via `/dev/stdout`.  To override this, set `MM_OUTPUT_FILE` to something else.  If you wish to persist logs, you will want to pass in a [volume mount][2] for the container to write the file to with `-v`.\n\n## Configuration\n\nBy default `docker run` will run with `ENTRYPOINT`:\n\n    /bin/sh -c /opt/mega-maid/bin/analyze-vacuum-schema.sh\n\nWhich runs the `analyze-vacuum-schema` Python script with args:\n\n    /bin/sh -c python /opt/amazon-redshift-utils/src/AnalyzeVacuumUtility/analyze-vacuum-schema.py \\\n                      --db $MM_DB_NAME --db-user $MM_DB_USER --db-pwd $MM_DB_PASS \\\n                      --db-port $MM_DB_PORT --db-host $MM_DB_HOST \\\n                      --schema-name $MM_DB_SCHEMA  --table-name $MM_DB_TABLE \\\n                      --output-file $MM_OUTPUT_FILE --debug $MM_DEBUG \\\n                      --ignore-errors $MM_IGNORE_ERRORS  --slot-count $MM_SLOT_COUNT \\\n                      --min-unsorted-pct $MM_MIN_UNSORTED_PCT --max-unsorted-pct $MM_MAX_UNSORTED_PCT \\\n                      --deleted-pct $MM_DELETED_PCT --stats-off-pct $MM_STATS_OFF_PCT --max-table-size-mb $MM_MAX_TABLE_SIZE_MB\n\nThe default environment variables are set in the `Dockerfile` via `ENV`:\n\n    ENV MM_DB_SCHEMA public\n    ENV MM_DB_PORT 5439\n    ENV 
MM_OUTPUT_FILE /dev/stdout\n    ENV MM_DEBUG True\n    ENV MM_IGNORE_ERRORS False\n    ENV MM_SLOT_COUNT 2\n    ENV MM_MIN_UNSORTED_PCT 5\n    ENV MM_MAX_UNSORTED_PCT 50\n    ENV MM_DELETED_PCT 15\n    ENV MM_STATS_OFF_PCT 10\n    ENV MM_MAX_TABLE_SIZE_MB 700*1024\n\n\nSince we're using an `ENTRYPOINT` rather than `CMD` in `Dockerfile`, any flags you pass to `docker run` after the image will be appended after the `ENTRYPOINT`. This means that if you want to run other commands inside the container (e.g.: spawn a shell to interactively use other tools in [awslabs/amazon-redshift-utils][1]), you'll have to override the `ENTRYPOINT` with:\n\n    docker run --entrypoint='/path/to/your-entrypoint-here'\n\n## Getting into the Container\n\nIf for some reason you need to run a container and get a shell, you need to override the entrypoint: `docker run -it --entrypoint=/bin/bash \u003cIMAGE\u003e`\n\n# License\n\nThis project is mostly packaging and wrapper scripts for the [`amazon-redshift-utils`][amazon-redshift-utils-license] tools.  As such, nothing in this repository is \"novel\", or \"non-obvious\". 
This repo is therefore released under the [Apache 2.0 License][apache-2-license].\n\nHowever, the upstream tools are released under other licenses:\n\n - [\"`amazon-redshift-utils`\"][amazon-redshift-utils-license] is released under the [Amazon Software License][asl].\n\nThe text of these tools' licenses is included here to avoid confusion.\n\n\n[1]: https://github.com/awslabs/amazon-redshift-utils.git\n[2]: https://docs.docker.com/engine/userguide/containers/dockervolumes/\n[apache-2-license]: https://choosealicense.com/licenses/apache-2.0/\n[asl]: http://aws.amazon.com/asl/\n[amazon-redshift-utils-license]: https://github.com/awslabs/amazon-redshift-utils/blob/master/LICENSE.txt\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftrinitronx%2Fredshift-mega-maid","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftrinitronx%2Fredshift-mega-maid","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftrinitronx%2Fredshift-mega-maid/lists"}