{"id":44575788,"url":"https://github.com/quodlibetor/s3glob","last_synced_at":"2026-05-15T02:02:08.855Z","repository":{"id":268479906,"uuid":"903568493","full_name":"quodlibetor/s3glob","owner":"quodlibetor","description":"A fast aws s3 ls and download cli that supports glob patterns","archived":false,"fork":false,"pushed_at":"2026-05-14T23:55:30.000Z","size":861,"stargazers_count":15,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-05-15T01:39:10.354Z","etag":null,"topics":["aws","cli","ls","s3"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/quodlibetor.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE-APACHE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-12-14T23:51:39.000Z","updated_at":"2026-05-14T23:55:34.000Z","dependencies_parsed_at":"2024-12-17T03:20:22.903Z","dependency_job_id":"04d4b21f-0aeb-41da-831e-a16364c63cb0","html_url":"https://github.com/quodlibetor/s3glob","commit_stats":null,"previous_names":["quodlibetor/s3glob"],"tags_count":31,"template":false,"template_full_name":null,"purl":"pkg:github/quodlibetor/s3glob","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quodlibetor%2Fs3glob","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quodlibetor%2Fs3glob/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quodlibetor%2Fs3glob/releases","manifests_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quodlibetor%2Fs3glob/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/quodlibetor","download_url":"https://codeload.github.com/quodlibetor/s3glob/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/quodlibetor%2Fs3glob/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33050705,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-13T13:14:54.681Z","status":"online","status_checked_at":"2026-05-15T02:00:06.351Z","response_time":103,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws","cli","ls","s3"],"created_at":"2026-02-14T05:02:11.652Z","updated_at":"2026-05-15T02:02:08.846Z","avatar_url":"https://github.com/quodlibetor.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"# s3glob\n\ns3glob is a fast aws s3 list implementation that basically obeys standard unix\nglob patterns.\n\nIn my experience (on an ec2 instance) s3glob can list 10s of millions of files\nin about 5 seconds, where I gave up on `aws s3 ls` after 5 minutes.\n\n![s3glob in action](./static/s3glob.gif)\n\n## Status\n\ns3glob is basically complete. It does all the things I need. 
If you have any\nfeature requests or bug reports please open an issue.\n\n## Usage\n\nThese two commands are equivalent:\n\n```bash\ns3glob ls \"s3://my-bucket/a*/something/1*/other/*\"\ns3glob ls      \"my-bucket/a*/something/1*/other/*\"\n```\n\nOutput is in the same format as `aws s3 ls`, but you can change it with the `--format` flag.\nFor example, this will output just the `s3://\u003cbucket\u003e/\u003ckey\u003e` for each object:\n\n```bash\ns3glob ls -f \"{uri}\" \"s3://my-bucket/a*/something/1*/other/*\"\n```\n\nYou can also download objects:\n\n```bash\ns3glob dl \"s3://my-bucket/a*/something/1*/other/*\" my-local-dir\n```\n\nLocal files will always be unique (two objects with the same filename won't stomp on each other).\nSee `s3glob dl --help` to configure exactly how local paths are created.\n\n### Installation\n\n#### Install prebuilt binaries via shell script\n\n```bash\ncurl --proto '=https' --tlsv1.2 -LsSf https://github.com/quodlibetor/s3glob/releases/latest/download/s3glob-installer.sh | sh\n```\n\n#### Install prebuilt binaries via powershell script\n\n```powershell\npowershell -ExecutionPolicy ByPass -c \"irm https://github.com/quodlibetor/s3glob/releases/latest/download/s3glob-installer.ps1 | iex\"\n```\n\n#### Install prebuilt binaries via Homebrew\n\n```bash\nbrew install quodlibetor/tap/s3glob\n```\n\n### Syntax\n\nGlob syntax supported:\n\n- `*` matches any number of non-delimiter characters. The default delimiter is `/`.\n- `?` matches any single character. By default this includes the\n  delimiter; pass `--no-cross-delim` to restrict `?` to a single\n  segment.\n- `[abc]`/`[!abc]` matches any single character in/not in the set. 
By\n  default the negated form `[!abc]` may also match the delimiter; pass\n  `--no-cross-delim` to keep it single-segment.\n- `[a-z]`/`[!a-z]` matches any single character in/not in the range,\n  with the same `--no-cross-delim` rule for the negated form.\n- `{a,b,c}` matches any of the comma-separated options (but nested globs are not\n  supported). Empty alternatives are allowed: `{a,}` matches either `a` or\n  the empty string.\n- `**` matches any number of characters, including the delimiter. At a `**`,\n  `s3glob` discovers sub-prefixes via a bounded breadth-first walk so it can\n  list them in parallel; if your bucket shape isn't suited to that, pass\n  `--no-recursive-auto-parallel` to skip the walk.\n- A pattern (or any brace alternative) ending in `/` implicitly matches\n  everything inside that directory: `s3glob ls 'foo/'` lists every object\n  under `foo/`.\n\n### Differences from standard glob and globset\n\n`s3glob`'s syntax overlaps with traditional Unix glob, but with a few\nintentional deviations driven by the S3-listing model:\n\n- **`**` works anywhere, not only as a path component.** Most glob\n  implementations require `**` to stand alone between delimiters (e.g.\n  `a/**/b`). In `s3glob`, `**` compiles to \"any chars including the\n  delimiter\" wherever it appears: `a**b` matches `a/x/y/b`, and `{x,y}**`\n  matches anything starting with `x` or `y`.\n- **Negated character classes and `?` can be made single-segment.** By default\n  `?` matches any character, and `[!a]` matches any non-`a` character including\n  the delimiter. Pass `--no-cross-delim` (or set `S3GLOB_CROSS_DELIM=false`) to\n  restrict `[!a]` to a single segment. A future major version will flip this\n  default to single-segment.\n- **Empty brace alternatives are first-class.** `{a,}`, `{,a}`, and\n  `{a,,b}` are all valid; the empty alt matches the empty string. 
Many\n  glob implementations reject these.\n- **Trailing `/` is \"match everything inside this directory\".** Pattern\n  `foo/` is internally rewritten to `foo/*`-equivalent.\n\n### Algorithm and performance implications\n\nThe tl;dr is that, up until the point a pattern has a `**` in it, `s3glob` will\nsearch within directories filtering by any constants in the pattern to reduce\nthe number of objects that need to be scanned:\n\n- fastest: `bucket/a*/b*/**`\n- fast: `bucket/*a*/*b*/**`\n- full scan: `bucket/**a**/b**`\n\nAWS S3 allows us to enumerate objects within a prefix, but it does not natively\nallow any filtering. `s3glob` works around this by enumerating prefixes and\nmatching them recursively against the provided glob pattern.\n\nI have observed s3glob to be able to list hundreds of thousands of objects in a\ncouple of seconds from within an ec2 instance.\n\nA `**` is where prefix-narrowing ends; segments after it can't be turned\ninto prefix filters. But `s3glob` still tries to parallelize the recursive\nlisting itself: at the `**`, it walks one directory level at a time with\ndelimiter-aware `LIST` calls and then scans the discovered sub-prefixes\nconcurrently. For buckets with a broad subtree under `**` this turns a\nsingle-stream recursive list into a parallel scan. For buckets where a single\nlevel is extremely wide it can cost extra `LIST` calls without much payoff. Pass\n`--no-recursive-auto-parallel` to force `**` to immediately become a serial\nlist.\n\nWhat this means in general is that, if you have a keyspace that looks like:\n\n```\n2000_01_01-2024_12_31/a-z/0-999/OBJECT_ID.txt\n```\n\nwhere each `-` represents the values in between, then you can roughly determine\nhow many objects `s3glob` will need to list by multiplying the number of\nvalues in each range. 
Adding a filter can reduce that number.\n\nSome example approximate numbers:\n\n| Pattern | Approximate number of objects | Reason |\n|---------|--------------------------------|--------|\n| `s3glob ls 2000_01_01/a/*/OBJECT_ID.txt` | 1,000 | 0-999 = 1000 |\n| `s3glob ls 2000_01_01/[abc]/*/OBJECT_ID.txt` | 3,000 | (a + b + c) * 0-999 = 3 * 1000 |\n| `s3glob ls 2000_01_01/*/*/OBJECT_ID.txt` | 26,000 | a-z * 0-999 = 26 * 1000 |\n| `s3glob ls 2000_01_01/[!xyz]/*/OBJECT_ID.txt` | 23,026 | (list all of a-z) = 26 =\u003e (filter out x,y,z) =\u003e 23 * 1,000 = 23,000 |\n| `s3glob ls 2000_01_*/*/*/OBJECT_ID.txt` | 806,000 | 01-31 * a-z * 0-999 = 31 * 26 * 1000 |\n\n## Copying\n\nAll code is available under the MIT or Apache 2.0 license, at your option.\n\n## Development\n\n### Performing a release\n\nEnsure git-cliff and cargo-release are both installed (run `mise install` to get them)\nand run `cargo release [patch|minor]`.\n\nIf things look good, run again with `--execute`.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquodlibetor%2Fs3glob","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fquodlibetor%2Fs3glob","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquodlibetor%2Fs3glob/lists"}