{"id":20858750,"url":"https://github.com/thesephist/hfm","last_synced_at":"2026-03-16T11:34:08.641Z","repository":{"id":54639446,"uuid":"522157080","full_name":"thesephist/hfm","owner":"thesephist","description":"Hugging Face Download (Cache) Manager","archived":false,"fork":false,"pushed_at":"2022-08-07T10:15:27.000Z","size":6,"stargazers_count":21,"open_issues_count":1,"forks_count":0,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-01-19T07:24:19.929Z","etag":null,"topics":["cli","huggingface-transformers","oaklang"],"latest_commit_sha":null,"homepage":"","language":"Makefile","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/thesephist.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-08-07T08:28:24.000Z","updated_at":"2024-01-04T17:10:57.000Z","dependencies_parsed_at":"2022-08-13T22:30:53.077Z","dependency_job_id":null,"html_url":"https://github.com/thesephist/hfm","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thesephist%2Fhfm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thesephist%2Fhfm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thesephist%2Fhfm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thesephist%2Fhfm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/thesephist","download_url":"https://codeload.github.com/thesephist/hfm/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":2432
30101,"owners_count":20257644,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli","huggingface-transformers","oaklang"],"created_at":"2024-11-18T04:47:12.848Z","updated_at":"2025-12-26T11:56:24.582Z","avatar_url":"https://github.com/thesephist.png","language":"Makefile","funding_links":[],"categories":[],"sub_categories":[],"readme":"# hfm 🤗\n\n**HFM** is the _Hugging Face Download (Cache) Manager_.\n\n_⚠NOTE️⚠️: HFM is not an official Hugging Face project, and not endorsed by the HF team. I just use `transformers` a lot, and made a little CLI to make the most of my disk space._\n\n[Hugging Face Transformers](https://huggingface.co/docs/transformers/index) is a great library, but in the course of working with many different models, your `.cache/huggingface/transformers` can fill up quickly with data. If you don't keep an eye on it (or if you, like me, don't have a lot of disk space), this folder can grow to dozens of GBs and eat up your disk space. 
HFM is a little command-line utility that helps keep an eye on this download cache folder, and easily remove/inspect cached Hugging Face Transformers downloads.\n\n## How to use `hfm`\n\nIn the most basic use case, just running `hfm` (or `hfm ls`, for which `hfm` is an alias) shows a table of all the downloads Hugging Face has put in `~/.cache/huggingface/transformers/`, sorted by file size.\n\nThe table shows you the download size, the first 8 characters of the download ID (used in file names), download date and time, and the URL (within `https://huggingface.co/`) from which that file was downloaded.\n\n```\n$ hfm\n548.12MB 752929ac 2021-11-20T07:05:57Z gpt2/resolve/main/pytorch_model.bin\n267.84MB 8d04c767 2021-11-20T07:03:48Z distilbert-base-uncased/resolve/main/pytorch_model.bin\n  1.37MB 6a79d94e 2022-04-21T23:32:37Z EleutherAI/gpt-j-6B/resolve/main/tokenizer.json\n798.16KB 3138d7eb 2022-04-21T23:32:34Z EleutherAI/gpt-j-6B/resolve/main/vocab.json\n456.36KB 2f9c5228 2022-04-21T23:32:36Z EleutherAI/gpt-j-6B/resolve/main/merges.txt\n  4.04KB 9b1d5815 2022-04-21T23:32:40Z EleutherAI/gpt-j-6B/resolve/main/added_tokens.json\n  1.35KB 42252c22 2021-12-29T19:13:56Z EleutherAI/gpt-neo-1.3B/resolve/main/config.json\n    762B f985248d 2022-04-21T22:37:09Z distilgpt2/resolve/main/config.json\n    200B 5fe35a59 2021-12-29T19:13:56Z EleutherAI/gpt-neo-1.3B/resolve/main/tokenizer_config.json\n     90B 953b5ce4 2021-12-29T19:35:21Z EleutherAI/gpt-neo-2.7B/resolve/main/special_tokens_map.json\n832.76MB total\n```\n\nTo remove any specific cached download, just run `hfm rm \u003cid\u003e`. This command can take many IDs at once, and will let you know if any ID you gave doesn't exist, or matches more than one download. 
Like `git checkout`, you don't have to type the full ID -- just the first few characters will do.\n\n```\n$ hfm rm 7529 8d04 13d2 9\nNo download with id \"13d2\".\nMore than one download matching id \"9\".\n```\n\nYou can pass `--dry-run` or `-d` at the end to see which files will be deleted, without actually deleting them.\n\n```\n$ hfm rm 7529 --dry-run\n[dry-run] rm \"752929ace039baa8ef70fe21cdf9ab9445773d20e733cf693d667982e210837e.323c769945a351daa25546176f8208b3004b6f563438a7603e7932bae9025925\"\n[dry-run] rm \"752929ace039baa8ef70fe21cdf9ab9445773d20e733cf693d667982e210837e.323c769945a351daa25546176f8208b3004b6f563438a7603e7932bae9025925.lock\"\n[dry-run] rm \"752929ace039baa8ef70fe21cdf9ab9445773d20e733cf693d667982e210837e.323c769945a351daa25546176f8208b3004b6f563438a7603e7932bae9025925.json\"\n```\n\nFor composing commands with other UNIX tools, it's often nice to be able to get the full path to a downloaded file. `hfm which \u003cid\u003e` prints the full path to a file. With this, you can run `jq` on a downloaded JSON file, as an example:\n\n```\n$ cat $(hfm which 6a79) | jq '.added_tokens | map(.content)'\n[\n  \"\u003c|endoftext|\u003e\",\n  \"\u003c|extratoken_1|\u003e\",\n  \"\u003c|extratoken_2|\u003e\",\n  \"\u003c|extratoken_3|\u003e\",\n  \"\u003c|extratoken_4|\u003e\",\n  \"\u003c|extratoken_5|\u003e\",\n  \"\u003c|extratoken_6|\u003e\",\n  \"\u003c|extratoken_7|\u003e\",\n[...]\n```\n\nOf course, the output of `hfm ls` itself is made to be greppable and interoperate with standard UNIX utilities like `awk`. 
To see all files about the `gpt-j` model...\n\n```\n$ hfm | grep gpt-j\n  1.37MB 6a79d94e 2022-04-21T23:32:37Z EleutherAI/gpt-j-6B/resolve/main/tokenizer.json\n798.16KB 3138d7eb 2022-04-21T23:32:34Z EleutherAI/gpt-j-6B/resolve/main/vocab.json\n456.36KB 2f9c5228 2022-04-21T23:32:36Z EleutherAI/gpt-j-6B/resolve/main/merges.txt\n  4.04KB 9b1d5815 2022-04-21T23:32:40Z EleutherAI/gpt-j-6B/resolve/main/added_tokens.json\n    619B ec7f9c4f 2022-04-21T23:32:31Z EleutherAI/gpt-j-6B/resolve/main/tokenizer_config.json\n    357B 3cd6a981 2022-04-21T23:32:41Z EleutherAI/gpt-j-6B/resolve/main/special_tokens_map.json\n```\n\n`hfm` features two flags, `--no-total` and `--no-humanize`, that specifically make the `hfm ls` output more machine-readable.\n\nYou can read more about all the commands and flags from the help message, at `hfm help` or `hfm -h` / `hfm --help`.\n\n## Install\n\nIf you have [Oak](https://oaklang.org) installed, you can build from source (see below). Otherwise, I provide pre-built binaries for macOS (x86 and arm64) and Linux (x86) on the [releases page](https://github.com/thesephist/hfm/releases). Just drop those into your `$PATH` and you should be good to go.\n\n## Build and development\n\nHFM is built with my [Oak programming language](https://oaklang.org), and I manage build tasks with a Makefile.\n\n- `make` or `make build` builds a version of HFM at `./hfm`\n- `make install` installs HFM to `/usr/local/bin`, in case that's where you like to keep your bins\n- `make fmt` or `make f` formats all Oak source files tracked by Git\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthesephist%2Fhfm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthesephist%2Fhfm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthesephist%2Fhfm/lists"}