{"id":16849612,"url":"https://github.com/navytux/git-backup","last_synced_at":"2026-04-15T22:32:46.865Z","repository":{"id":146677729,"uuid":"134834219","full_name":"navytux/git-backup","owner":"navytux","description":"Backup set of Git repositories \u0026 just files; efficiently.  (mirror of https://lab.nexedi.com/kirr/git-backup)","archived":false,"fork":false,"pushed_at":"2025-02-14T15:23:11.000Z","size":233,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-18T07:45:30.451Z","etag":null,"topics":["backup","git","gitlab"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/navytux.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"COPYING","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-05-25T09:23:07.000Z","updated_at":"2025-02-14T15:23:16.000Z","dependencies_parsed_at":"2024-06-14T15:38:43.354Z","dependency_job_id":"434b9ef3-2314-4448-9ba3-a41de19d370c","html_url":"https://github.com/navytux/git-backup","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/navytux/git-backup","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/navytux%2Fgit-backup","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/navytux%2Fgit-backup/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/navytux%2Fgit-backup/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/navytux%2Fgit-backup/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/navytux","download_url":"https://codeload.github.com/navytux/git-backup/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/navytux%2Fgit-backup/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279002105,"owners_count":26083307,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-09T02:00:07.460Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["backup","git","gitlab"],"created_at":"2024-10-13T13:16:34.895Z","updated_at":"2025-10-09T22:40:32.714Z","avatar_url":"https://github.com/navytux.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"=======================================================================\n Git-backup - Backup set of Git repositories \u0026 just files; efficiently\n=======================================================================\n\n:author: Kirill Smelkov \u003ckirr@nexedi.com\u003e\n:date:   2015 Aug 31\n\n\nThis program backups files and set of bare Git repositories into one Git repository.\nFiles are copied to blobs and then added to tree under certain place, and for\nGit repositories, all reachable objects are pulled in with maintaining index\nwhich remembers reference -\u003e sha1 for all pulled repositories.\n\nThis allows to leverage Git's good data deduplication ability, especially for\ncases when there are many hosted repositories which are forks of each other,\nand for backup to have history and be otherwise managed as a usual Git\nrepository.  In particular it is possible to use standard git pull/push to\nsynchronize backups in several places.\n\nThe original motivation for git-backup was to manage backups of `lab.nexedi.com`__\nwith being able to deduplicate content of forks, and to be able to track the\nwhole history of the site. The last property is similar to ZODB where Nexedi\nused to \"never pack\" and keep the whole history of the whole site. Please see\nthe Appendix for more details.\n\n__ https://lab.nexedi.com\n\n\nBackup workflow is:\n\n1. create backup repository::\n\n     $ mkdir backup\n     $ cd backup\n     $ git init         # both bare and non-bare possible\n\n2. pull files and Git repositories into backup repository::\n\n     $ git-backup pull dir1:prefix1 dir2:prefix2 ...\n\n   This will pull bare Git repositories \u0026 just files from `dir1` into backup\n   under `prefix1`, from `dir2` into backup prefix `prefix2`, etc...\n\n3. restore files and Git repositories from backup::\n\n     $ git-backup restore \u003cbackup-state-sha1\u003e prefix1:dir1\n\n   Restore Git repositories \u0026 just files from backup `prefix1` into `dir1`,\n   from backup `prefix2` into `dir2`, etc...\n\n   Backup state to restore is taken from \u003cbackup-state-sha1\u003e which is sha1 or\n   ref pointing to backup repository state.\n\n4. backup repository itself can be managed with Git. In particular it can be\n   synchronized between several places with standard git pull/push, be\n   repacked, etc::\n\n     $ git push ...\n     $ git pull ...\n\n\nPlease see `git-backup.go`__ source with technical overview on how it works.\n\nWe also provide convenience program to pull/restore backup data for a GitLab\ninstance into/from git-backup managed repository. See `contrib/gitlab-backup`__\nfor details.\n\n\n__ git-backup.go\n__ contrib/gitlab-backup\n\n\n--------\n\nAppendix. Original announcement\n===============================\n\n:Subject: [Nexedi] [ANNOUNCE] Program to backup several Git repositories into 1\n:From: Kirill Smelkov \u003ckirr@nexedi.com\u003e\n:Date: Mon, 31 Aug 2015 22:36:31 +0300\n\nHi All,\n\nRecently we had discussion with Kazuhiko on current GitLab backup state.\nGitLab approach is to create tarball for every repository and then\ncreate one big tar file containing everything. In presence of forks this\nresults in waste of disk space which gets worse the more forks and\npersonal repositories we have.\n\nEven today, when a lot of development happens not yet on GitLab, 1\nstandard GitLab backup takes ~ 3GB, which creates pressure for storage\nand consequently forces admin to make compromises wrt how long to keep\nbackup history. Again, this will become more heavy as we move more and\nmore to GitLab.\n\nSo clearly something has to be done.\n\nWith this email I propose the idea to backup Git hosting via Git itself.\nFor this we need to pull all hosted objects (from all git repositories)\ninto 1 git database and then leverage Git's good ability to deduplicate\nand pack content. Plus we need to carefully remember which refs from\nwhich repositories point to which objects so we can properly restore.\n\nThat's basically all. I've tried to do a POC which is available here:\n\n    https://lab.nexedi.cn/kirr/git-backup\n\nand contains more details. The main program[1] is generic + there is\nconcrete driver to backup GitLab repositories together with database\ndump and everything else[2].\n\nIt has been tested by me on our GitLab instance manually for some time\nalready and preliminarily results are::\n\n                                    GitLab          POC\n\n    time of 1st run                 2m25s           7m41s\n    backup size after 1st run       3013MB          363MB\n\n    time of 2nd run                 1m28s           1m52s\n    (with small commit)\n\n    backup size increase            +3013MB         +4MB (*)\n    after 2nd run\n\n    (*) I've tracked this +4MB to the fact that git leaves empty directory\n        refs/backup/\u003cdir\u003e/ if e.g. refs/backup/\u003cdir\u003e/some-ref was deleted and\n        \u003cdir\u003e becomes empty. This can be improved in git itself or worked around\n        in the tool. Actual data growth in db objects is few kilobytes.\n\n\nIn other words backup size is already ~10 times smaller compared to\nGitLab default and because size increase on incremental runs is small on\naverage, it creates practical ability to store backup history forever,\njust like we do with histories in usual Git repositories.\n\nRestoration process has been also verified manually, and besides that, on\neach restore run, the program verifies extracted git repositories for\nconnectivity correctness. So in my view this should be safe to use.\n\n...\n\nI welcome feedback, questions and review of the tool. If all goes well\nand we use it on our GitLab instance for some time ok, my idea is to\nmake the announcement to a wider audience.\n\n...\n\n| Thanks,\n| Kirill\n\n| [1] https://lab.nexedi.cn/kirr/git-backup/blob/master/git-backup\n| [2] https://lab.nexedi.cn/kirr/git-backup/blob/master/contrib/gitlab-backup\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnavytux%2Fgit-backup","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnavytux%2Fgit-backup","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnavytux%2Fgit-backup/lists"}