{"id":17383105,"url":"https://github.com/numblr/glaciertools","last_synced_at":"2025-04-15T09:52:29.204Z","repository":{"id":45306465,"uuid":"140686147","full_name":"numblr/glaciertools","owner":"numblr","description":"Command line (bash) scripts to upload large files to AWS glacier using multipart upload and to calculate the required tree hash","archived":false,"fork":false,"pushed_at":"2021-12-22T17:37:19.000Z","size":3984,"stargazers_count":67,"open_issues_count":3,"forks_count":20,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-03-28T19:06:55.376Z","etag":null,"topics":["amazon-glacier","aws","aws-cli","aws-glacier","bash-script","command-line","command-line-tool","glacier","hash-functions","merkel-tree","script","shell-scripts","treehash"],"latest_commit_sha":null,"homepage":"","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/numblr.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-07-12T08:42:24.000Z","updated_at":"2025-01-25T16:09:04.000Z","dependencies_parsed_at":"2022-08-20T09:00:30.455Z","dependency_job_id":null,"html_url":"https://github.com/numblr/glaciertools","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/numblr%2Fglaciertools","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/numblr%2Fglaciertools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/numblr%2Fglaciertools/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/numblr%2Fglaciertools/manifests","ow
ner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/numblr","download_url":"https://codeload.github.com/numblr/glaciertools/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249048712,"owners_count":21204305,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["amazon-glacier","aws","aws-cli","aws-glacier","bash-script","command-line","command-line-tool","glacier","hash-functions","merkel-tree","script","shell-scripts","treehash"],"created_at":"2024-10-16T07:40:36.886Z","updated_at":"2025-04-15T09:52:29.180Z","avatar_url":"https://github.com/numblr.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Command line tools (Bash scripts) to upload large files to AWS Glacier\n\nAn archive containing only the scripts can be downloaded from the [releases](https://github.com/numblr/glaciertools/releases) page. Some of the scripts depend on others and assume that they are in the same directory.\n\n## Commands\n**[glacierupload](#glacierupload)**\u003cbr\u003e\n**[glacierabort](#glacierabort)**\u003cbr\u003e\n**[treehash](#treehash)**\n\n## glacierupload\n\nThe script orchestrates the multipart upload of a large file to AWS Glacier.\n\n**Prerequisites**\n\nThis script depends on **openssl** and **parallel**. 
If you are\non Mac OS X and using Homebrew, then run the following:\n\n    brew install parallel\n    brew install openssl\n\nThe script assumes you have an AWS account, and have signed up for the glacier\nservice and have created a vault already.\n\nIt also assumes that you have the\n[AWS Command Line Interface](http://docs.aws.amazon.com/cli/latest/userguide/installing.html)\ninstalled on your machine, e.g. by:\n\n    pip install awscli\n\nThe script also requires that the aws cli is configured with your AWS credentials.\nOptionally it supports profiles set up in the aws cli by\n\n    aws --profile myprofile configure\n\nYou can verify that your connection works by describing the vault you have created:\n\n    aws --profile myprofile glacier describe-vault --vault-name myvault --account-id -\n\n\n**Script Usage**\n\n    glacierupload [-p|--profile \u003cprofile\u003e] [-d|--description \u003cdescription\u003e] [-s|--split-size \u003clevel\u003e]\n                   \u003c-v|--vault vault\u003e \u003cfile...\u003e\n\n    -v --vault        name of the vault to which the file should be uploaded  \n    -p --profile      optional profile name to use for the upload. The profile\n                      name must be configured with the aws cli client.\n    -d --description  optional description of the file\n    -s --split-size   level that determines the size of the parts used for\n                      uploading the file. The level can be a number between\n                      0 and 12 and results in part size of (2^level) MBytes.\n                      If not specified the default is 0, i.e. 
the file is\n                      uploaded in 1MByte parts.\n    -h --help         print help message\n\nThe script prints the information about the upload to the shell and\nadditionally stores it in a file in the directory where the script is executed.\nThe file name equals the original file name postfixed with the first 8 characters\nof the archive id and '.upload.json'.\n\nThe script splits the file to upload on the fly and only stores parts that are\ncurrently uploaded temporarily on disk, i.e. the amount of required free disk\nspace is low and depends on the used chunk size and number of parallel uploads.\nThe size of the individual chunks can be controlled by the *--split-size* option.\nThe number of parallel uploads is determined by parallel based on the number of\navailable CPUs.\n\n**Be aware of the [constraints](https://docs.aws.amazon.com/amazonglacier/latest/dev/uploading-archive-mpu.html#qfacts)\non the number and size of the chunks in the AWS Glacier specifications!**\n\nIn case the upload of a part fails, the script performs a number of retries. If\nthe upload of a part ultimately fails after the maximum number of retries, the\nscript aborts the upload and terminates.\n\n**Examples**\n\nTo simply upload */path/to/my/archive* to *myvault* use\n\n    \u003e ./glacierupload -v myvault /path/to/my/archive\n\nThis will upload the archive in 1MByte chunks using the standard credentials\nthat are configured for the aws cli.\n\nThe following command\n\n    \u003e ./glacierupload -p my_aws_cli_profile -v myvault -s 5 -d \"My favorite archive\" /path/to/my/archive\n\nwill upload */path/to/my/archive* to *myvault* on AWS glacier with a short\ndescription. The credentials that were configured in the *my_aws_cli_profile*\nin the aws cli will be used. 
Instead of the default part size of 1MB the\narchive is uploaded in 2^5=32MByte chunks.\n\n\n## glacierabort\n\nAbort (close) all unfinished uploads to a vault on AWS Glacier.\n\n**Script Usage**\n\n    glacierabort -v|--vault \u003cvault\u003e [-p|--profile \u003cprofile\u003e]\n\n    -v --vault        name of the vault for which uploads should be aborted  \n    -p --profile      optional profile name to use. The profile name must be\n                      configured with the aws cli client.\n    -h --help         print help message\n\n**Examples**\n\nTo abort all currently unfinished uploads run\n\n    \u003e ./glacierabort -v myvault\n\n\n## treehash\n\nThe script calculates the top level hash of a Merkle tree (tree hash) built from\nequal sized chunks of a file.\n\nIf possible, i.e. if multiple CPUs are available on your system, the script\nparallelizes the computation of the tree hash.\n\nThe script does not depend on any of the other scripts in this repository and can\nbe used stand-alone.\n\n**Prerequisites**\n\nThis script depends on **parallel** and **openssl**. If you are on Mac OS X\nand are using Homebrew, then run the following:\n\n    brew install openssl\n    brew install parallel\n\n**Script Usage**\n\n    treehash [-b|--block \u003csize\u003e] [-a|--alg \u003calg\u003e] [-v|--verbose \u003clevel\u003e] \u003cfile\u003e\n\n\n    -b --block       size of the leaf data blocks in bytes, defaults to 1M.\n                     can be postfixed with K, M, G, T, P, E, k, m, g, t, p, or e,\n                     see the '--block' option of the 'parallel' command for details.\n    -a --alg         hash algorithm to use, defaults to 'sha256'. 
Supported\n                     algorithms are the ones supported by 'openssl dgst'\n    -v  --verbosity  print diagnostic messages to stderr if level is larger than 0:\n                      * level 1: Print the entire tree\n                      * level 2: Print debug information\n    -h --help        print help message\n\nThe script does not create any temporary files nor does it require that the chunks\nof the file are present as files on the disk.\n\n**Examples**\n\nTo calculate the tree hash of */path/to/my/archive* with a chunk size of 1MB and\nthe *sha-256* hash algorithm use\n\n    \u003e ./treehash /path/to/my/archive\n\n\n## References\n\n* O. Tange (2011): GNU Parallel - The Command-Line Power Tool, ;login: The USENIX Magazine, February 2011:42-47.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnumblr%2Fglaciertools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnumblr%2Fglaciertools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnumblr%2Fglaciertools/lists"}