{"id":20331151,"url":"https://github.com/comcast/svn-to-github","last_synced_at":"2025-04-11T21:07:32.749Z","repository":{"id":88391724,"uuid":"93768987","full_name":"Comcast/svn-to-github","owner":"Comcast","description":"Comprehensive Tool for Converting SVN to Git in Bulk","archived":false,"fork":false,"pushed_at":"2017-06-13T20:44:15.000Z","size":83,"stargazers_count":13,"open_issues_count":0,"forks_count":8,"subscribers_count":12,"default_branch":"master","last_synced_at":"2025-04-11T21:07:16.084Z","etag":null,"topics":["git","github","github-enterprise","scm","svn"],"latest_commit_sha":null,"homepage":"https://github.com/Comcast/svn-to-github","language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Comcast.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-06-08T16:13:05.000Z","updated_at":"2024-07-24T18:44:49.000Z","dependencies_parsed_at":"2023-03-13T18:24:57.516Z","dependency_job_id":null,"html_url":"https://github.com/Comcast/svn-to-github","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Comcast%2Fsvn-to-github","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Comcast%2Fsvn-to-github/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Comcast%2Fsvn-to-github/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Comcast%2Fsvn-to-github/manifests","owner_url":"https://repos.ecosyste.ms/ap
i/v1/hosts/GitHub/owners/Comcast","download_url":"https://codeload.github.com/Comcast/svn-to-github/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248480434,"owners_count":21110937,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["git","github","github-enterprise","scm","svn"],"created_at":"2024-11-14T20:18:58.751Z","updated_at":"2025-04-11T21:07:32.732Z","avatar_url":"https://github.com/Comcast.png","language":"Shell","readme":"# svn-to-github\nSVN to Github Tool for Public and Enterprise\n[![GitHub release](https://img.shields.io/github/release/Comcast/svn-to-github.svg)](https://github.com/Comcast/svn-to-github/releases/download/1.16.0/svn-to-github-1.16.0-0.noarch.rpm) [![GitHub issues](https://img.shields.io/github/issues/comcast/svn-to-github.svg)](https://github.com/Comcast/svn-to-github/issues) [![GitHub contributors](https://img.shields.io/github/contributors/comcast/svn-to-github.svg)](https://github.com/Comcast/svn-to-github/graphs/contributors) [![license](https://img.shields.io/github/license/comcast/svn-to-github.svg)](https://github.com/Comcast/svn-to-github/blob/master/LICENSE)\n\n## Overview\n\nThis routine will convert an SVN repo to github. 
If it has nested projects, those projects will become git submodules linked to their nearest ancestor.\nThe git master branch is created from trunk, and a .gitignore file is added from svn properties.\nTrunk is left intact and untouched as a branch, unless `--no-branches` or `--no-preserve-trunk` is used, in which case only master will exist, which will be the original trunk converted.\nThere is a .gitignore file that is created from the original repo, and used as the first commit on the new master branch.\nFiles and folders above the trunk are considered part of the repository and are added to the new master branch as the final conversion commit.\nTrunk directories are created in each master branch and point to the trunk branch to maintain the trunk directory tree for trunks, but can be disabled through option `--no-preserve-trunk`.\nNested repos under trunk only exist in the master branch, and not in the trunk branch.\nLDAP is required to be configured in order for svn Author lookups to work when building the \"SVN Authors File\" for full names and email addresses, otherwise svn commits will not be linked to their users in github.\n\n__Conversion time could vary greatly__\nThe factors affecting conversion time are directory size, the commit length of branches/tags and trunk(s), and, greatest of all, missing branches or tags. When a branch is deleted from svn permanently, the conversion process must start over from the beginning of the commit chain for every occurrence. It is highly advisable not to prune unwanted branches or tags prior to conversion, but rather afterwards, because doing so beforehand will result in longer conversion time and errors in the logs.\n\n## Features\n\n1. Preserves the 'svn trunk' directory structure using git submodules [see caveats and limitations]\n2. 
Handles naming conflicts in both the source repo and in github\n  * Duplicate named projects in svn will be prefixed with their nearest ancestor's repo name\n  * If a repo exists in github with the same name in the same namespace for a submodule, that submodule will be suffixed with the first non-conflicting integer\n3. The program will fail immediately if the source repo name already exists in github in the namespace specified\n4. Converts files of a specified size to git-lfs for tracking independently of git compression\n5. Will remove deleted files of a specified size from git history\n6. The original svn tree is saved in the parent repo's master branch in a hidden file called `.svn_tree`\n7. All logs are preserved in the archive directory for each job\n8. Ability to remove repos from github that were converted as a batch using the `--undo` flag along with the `--svn`, `--ghuser`, and `--token` flags and also the `--org` flag if written to a particular Org's namespace. Useful for sandboxing and redoing jobs with different parameters. Only the `*.json` files need to be preserved, which are also saved in the archive dir.\n9. Removes the ability to make revisions back to SVN: this is a one-way conversion. Repo syncing is disabled intentionally, and revs made during a conversion will not be reflected, so prepare to wait. 
Conversion time is a function of `revs * branches||tags`\n\n#### Options\n\nAll options from the command line must be given in the form of `--option=value` including the csv list ie: `--tags=tags,tag,releases`\n\n* __ghuser__            `[default: prompt ]` [Join Github github.com/join](https://github.com/join)\n* __token__             `[default: prompt ]` [github.com/settings/tokens/new](https://github.com/settings/tokens/new)\n* __svn-url__           `[default: prompt ]` The SVN Source URL\n* __svn-user__          `[default: null ]`   The SVN Source Username\n* __svn-pass__          `[default: null ]`   The SVN Source Password\n* __org__               `[default: prompt ]` [Organization Docs help.github.com/articles/about-organizations](https://help.github.com/articles/about-organizations/)\n* __no-org__            `[default: false ]` - don't create repos under an org, but rather under a user instead, caveat is repo(s) cannot already exist in user's primary Organization\n* __no-branches__       `[default: false ]` - do not convert branches, ignore all branches, should NOT be combined with `--no-preserve-trunk`\n* __no-tags__           `[default: false ]` - do not convert tags, ignore all tags\n* __no-tag-branches__   `[default: false ]` - do not allow tags to be branches, convert tags to releases only\n* __no-preserve-trunk__ `[default: false ]` - do not keep the trunk directory, so trunk becomes the root of master, should NOT be combined with `--no-branches`\n* __no-other-repos__    `[default: false ]` - do not create submodule repos, only create the primary repo but nest them if they already exist\n* __private__           `[default: false ]` - make all repo(s) private in github\n* __check-size__        `[default: false ]` - estimate the size on disk needed for conversion of non-standard-layouts, can be very time consuming\n* __safe__              `[default: false ]` - be prompted before all github repository creations or deletions when combined with the undo 
option\n* __install__           `[default: false ]` - dependency installation, Warning: May Fail if dependencies are not met.\n* __force__             `[default: false ]` - force the creation, will delete all data from previous run and possibly create duplicate submodule repos with suffixes in github\n* __undo__              `[default: false ]` - delete the github repos from a previous run, but preserve all log data\n* __new-name__          `[default: null ]` - rename your new repo name in github, required if name already exists in github under user or organization\n* __trunk__             `[default: null ]` - a csv list of alternative trunk dir names or specify just one\n* __tags__              `[default: null ]` - a csv list of alternative tag dir names or specify just one\n* __branches__          `[default: null ]` - a csv list of alternative branch dir names or specify just one\n* __svn-prefix__        `[default: null ]` - specify a prefix given to an otherwise standard-layout of svn directories\n* __authors__           `[default: null ]` - path of authors file rather than generate one, may not provide correct email address for users who do not conform to first_last@yourdomain.com\n* __ignore__            `[default: null  ]` - path of import file for gitignore file rather than convert it from the existing svn ignore attributes\n* __lfs-limit__         `[default: 150 ]` - specify the large-file-size limit, 2 ~ 2 Megabytes, 100K ~ 100 Kilobytes, maximum in github is 50, [2-5] is optimal\n* __blob-limit__        `[default: 150 ]` - specify the max size of blobs allowed, 50 ~ 50 Megabytes, maximum in github is 50, \u003c 50 is optimal\n* __work-dir__          `[default: /opt/svn-to-github ]` - to specify a working directory for creating files\n\n#### Requirements\n\n  * Must be used on an Enterprise Linux 7 system (RHEL/CentOS/Fedora/Oracle) or with the prerequisites met use `--skip-install` __NOTE:__ git \u003e 1.8.2 package on el6 does not exist thus requires src 
build, which is why EL7 is the requirement for the package install, early builds ran on Debian LTS but have since been discontinued\n  * Must have account in github with permissions to create repos in organization\n  * Must have root access or the packages `git-svn` and `git-lfs` already installed\n  * __FOR ENTERPRISE ONLY__ without LDAP access, the script cannot query the directory for real names and email addresses of authors found in the svn history, therefore all historical commits will have generic \u0026 invalid names and email addresses Unknown_ghuser Missing_ghuser@yourdomain.com\n\n#### Software\n\nThe software used to perform the tasks consists of common tools loaded from any bash 4.x shell; the following are required and installed on EL systems:\n\n * git \u003e= 1.8.2\n * bash \u003e= 4.0\n * subversion\n * git-lfs (installed by first run)\n * git-svn\n * java-1.7.0-openjdk\n * expect\n * curl\n * gawk\n * findutils\n * coreutils\n * sed\n * yum\n * grep\n * procps\n * glibc-common\n * which\n * initscripts\n * tree\n\n## Setup\n\n1. Create a personal access API token with full permissions in github. [github.com/settings/tokens/new](https://github.com/settings/tokens/new)\n\n__TOKEN IS REQUIRED__ a password is not sufficient when using --org=organization\n\n  * Install the git-lfs\n`\nyum -y install https://packagecloud.io/github/git-lfs/packages/el/7/git-lfs-2.1.1-1.el7.x86_64.rpm/download\n`\n  * Install from RPM on EL7\n`\nyum -y install https://github.com/comcast/svn-to-github/releases/download/v1.16.0/svn-to-github-1.16.0.el7.noarch.rpm\n`\n\n## Usage\n\n__USE NOHUP__ because your session could likely time out before conversion completes. 
Or Feel free to send a pull request\n\n__Logs:__ `/opt/svn-to-git/$REPO/$REPO.log`\n__Jobs:__ `/opt/svn-to-git/archive`\n\n__Convert a Single Repo to Org:__\n`\nnohup svn-to-github --no-other-repos --org=svn2git --svn-url=http://svn.yourdomain.net/repos/svn_repo --ghuser=swizzley --token=d938a09236f449c5ea9f6bcfbcb64bacda55e620 \u0026\n`\n\n\n__Convert a Repo to Org:__\n`\nnohup svn-to-github --org=svn2git --svn-url=http://svn.yourdomain.net/repos/svn_repo --ghuser=swizzley --token=d938a09236f449c5ea9f6bcfbcb64bacda55e620 \u0026\n`\n\n__Convert a Repo to User:__\n`\nnohup svn-to-github --no-org --svn-url=http://svn.yourdomain.net/repos/svn_repo --ghuser=swizzley --token=d938a09236f449c5ea9f6bcfbcb64bacda55e620 \u0026\n`\n\n__Convert a Secure SVN Repo:__\n`\nnohup svn-to-github --no-org --svn-url=http://svn.yourdomain.net/repos/svn_repo --svn-user=svn2gituser --svn-pass='s3cret!' --ghuser=swizzley --token=d938a09236f449c5ea9f6bcfbcb64bacda55e620 \u0026\n`\nThe password must be quoted in single quotes if a standard escape character is part of the string\n\n__Convert a SVN Repo to a Private Repo:__\n`\nnohup svn-to-github --private --svn-url=http://svn.yourdomain.net/repos/svn_repo --ghuser=swizzley --token=d938a09236f449c5ea9f6bcfbcb64bacda55e620 --org=aps \u0026\n`\n\n__Convert with lots of options:__\n`\nnohup svn-to-github --private --svn-url=http://svn.yourdomain.net/repos/svn_repo --ghuser=swizzley --token=d938a09236f449c5ea9f6bcfbcb64bacda55e620 --org=aps --check-size --lfs-limit=50 --blob-limit=50 --work-dir=/home/swizzley --new-name=myrepo  --svn-user=svn2gituser --svn-pass=s3cret --no-tag-branches --no-preserve-trunk --no-other-repos --install --force \u0026\n`\ncheck-size can take a very long time, but a drop in the bucket for large repos with long chains.\nforcing will delete a previous checkout\n\n__Convert but provide pre-formatted Authors file:__\n`\nnohup svn-to-github --private --svn-url=http://svn.yourdomain.net/repos/svn_repo --ghuser=swizzley 
--token=d938a09236f449c5ea9f6bcfbcb64bacda55e620 --no-org --blob-limit=50 --work-dir=/home/swizzley --authors=/home/swizzley/svn_authors.txt \u0026\n`\n\n__Convert but provide Ignore file:__\n`\nnohup svn-to-github --private --svn-url=http://svn.yourdomain.net/repos/svn_repo --ghuser=swizzley --token=d938a09236f449c5ea9f6bcfbcb64bacda55e620 --no-org --blob-limit=50 --work-dir=/home/swizzley --ignore=/home/swizzley/svn_ignore.txt \u0026\n`\nThe ignore file is global to the repo and all sub repos. If a conflict exists in dir levels for nested repos, modify the ignore attributes in SVN beforehand or provide a file to use instead and resolve the conflicts after the conversion. Anything ignored will not be added to the final git repo(s)\n\n__Convert but provide Unique Names for SVN Dirs:__\n`\nnohup svn-to-github --private --svn-url=http://svn.yourdomain.net/repos/svn_repo --ghuser=swizzley --token=d938a09236f449c5ea9f6bcfbcb64bacda55e620 --no-org --branches=patches,releases,Patch,Revs --tags=10-14-2003,Jun_2014 --svn-prefix=_svn- \u0026\n`\nThis is in addition to, not in replacement of, svn standard-layout directories (trunk,tags,branches)\n\n__Convert and manually approve every GHE repo name one by one:__\n`\nnohup svn-to-github --private --svn-url=http://svn.yourdomain.net/repos/svn_repo --ghuser=swizzley --token=d938a09236f449c5ea9f6bcfbcb64bacda55e620 --no-org --safe \u0026\n`\nRequires user to type 'yes' before every repo is created, and requires the checkout to be complete (NOT advisable for long commit chains or large repos over ssh, due to timeouts)\n\n__Convert and rename the GHE Repo:__\n`\nnohup svn-to-github --private --svn-url=http://svn.yourdomain.net/repos/svn_repo --ghuser=swizzley --token=d938a09236f449c5ea9f6bcfbcb64bacda55e620 --no-org --new-name=myrepo \u0026\n`\nThis will also rename the prefix of all submodules nested inside\n\n__Undo Conversion by DELETING ALL REPOS converted in GHE under Org:__\n`\nsvn-to-github 
--svn-url=http://svn.yourdomain.net/repos/svn_repo --org=aps --ghuser=swizzley --token=d938a09236f449c5ea9f6bcfbcb64bacda55e620 --undo\n`\n\n__Undo Conversion by DELETING ALL REPOS converted in GHE under User:__\n`\nsvn-to-github --svn-url=http://svn.yourdomain.net/repos/svn_repo --no-org --ghuser=swizzley --token=d938a09236f449c5ea9f6bcfbcb64bacda55e620 --undo\n`\n\n## Log-Analysis\n\nUnder each conversion job is a log directory, inside that directory shows the real-time status of any particular stage of the job. These logs along with the json files used to create and delete repos with --undo in GHE are archived in the working-dir's archive folder with the name and date that the job finished. The `report.log` file will show any failures, conversion failures typically occur when the commit chain in svn is broken and cannot be resolved automatically through the git svn clone process. These failures must be resolved by hand afterwards, typically by specifying the Rev number in SVN or by ignoring what was broken. There is a `retry.log` that shows when a process failed for some reason, and that process is given three attempts to work before being logged as a failure.\n\n## Post-Conversion-Git-Usage\n\n__Clone the Full Repo:__\n`\ngit clone --recursive $REPO_URL\n`\n\n__Clone a particular branch Only:__\n`\ngit clone -b $BRANCH_NAME $REPO_URL\n`\n\n__Update all submodules:__\n`\ngit submodule update --recursive\n`\n\n## Caveats and Limitations for non-standard svn repo layouts\n\n1. 
Branches only get converted if they exist in a case-insensitive dir `*/branches` by default\n  * alternative branch directories can be specified as a csv list using the argument `--branches`, eg: `--branches=patches,releases`\n  * or specify a single alternative with the name of the label, like this:  `--branches=some_label`\n  * skip branches entirely with `--no-branches`; any alternative branch label must be specified with `--branches` or the contents will be added as files under master\n  * original branch directories are __NOT PRESERVED__\n2. Tags get converted to tags and branches by default\n  * skip converting tags to branches with `--no-tag-branches`\n  * skip tags entirely with `--no-tags`\n  * original tag directories are __NOT PRESERVED__\n3. Spaces in repo names will be replaced with underscores to avoid ugly unicode in github\n4. Trunks nested under tags or branches are __NOT PRESERVED__\n5. Empty directories are __NOT PRESERVED__\n6. Large files are tracked using `git lfs`, which uses patterns. Those patterns are based on the file extension of any file equal to or over the `--lfs-limit`; if that file does not have an extension, the full name of the file is used.\n7. The directory tree overall is preserved, except directories for tags and branches, which become git tags and git branches. The trunk directory is only preserved if it was named trunk; otherwise one can use the `--no-preserve-trunk` option to accept the converted directory structure without any trunk directories\n\n### Routine\n\n1. Collect svn source repo location; use the svn argument to skip prompt eg: `--svn=https://server/svn/myrepo`\n2. Creates a work dir in `/opt/svn-to-github/$myrepo`; but one can be specified using `--work-dir=/opt/myrepo`\n3. Collect and or validate authentication to github; specify the org, ghuser and token arguments to skip prompt eg: `--org=myorg --ghuser=username --token=password`\n4. 
Checks to see if svn repo name exists in github; if it does it fails and prompts to run again using the argument `--new-name=my-new-repo-name`\n5. Checks for `--undo` flag to see if user wants to reverse the process from a previous run, in which case it will remove any repos that it created in github and exit afterwards\n6. Checks that the user running is `root`; if not it will check if requirements exist; otherwise it will exit\n7. Checks to see if requirements are installed and permissions in work-dir are met\n8. Optionally checks disk space and repo's size to estimate if adequate disk space exists in work-dir mount (off by default because checking a remote location is slow)\n9. Checkout local copy of svn repo URL\n10. Generates authors file from svn log to maintain commit history in github, [optionally] using `--authors=some_file` skips this step\n11. Checks for nested repos\n12. Creates all repo(s) in github, nested repos are created with a prefix of their parent repo's name in github, any duplicates will have a suffix number\n13. Begins cloning any nested repos individually and concurrently\n14. Nested repos are converted completely, with branches and tags, concurrently and as soon as each has finished cloning\n15. Any parent repo that has finished must wait for all child repos to finish converting before it can begin converting so that all submodules will be added prior to completion\n16. 
The top scope/primary parent repo is then created or converted and all nested repos are added to it as submodules\n\n## Useful_Links\n\n  * [git svn crash course](http://git.or.cz/course/svn.html)\n  * [git-svn Manual](https://git-scm.com/docs/git-svn)\n  * [git blob cleaner](https://rtyley.github.io/bfg-repo-cleaner)\n  * [git-lfs extension](https://github.com/github/git-lfs)\n  * [git submodules](https://git-scm.com/book/en/v2/Git-Tools-Submodules)\n  * [github API](https://developer.github.com/v3/)\n\n\n## TODO\n\n  * Add full directory tree mirroring support for branches and tags\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcomcast%2Fsvn-to-github","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcomcast%2Fsvn-to-github","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcomcast%2Fsvn-to-github/lists"}