{"id":14065952,"url":"https://github.com/ctjacobs/git-rdm","last_synced_at":"2026-01-21T17:30:54.051Z","repository":{"id":75538363,"uuid":"61330602","full_name":"ctjacobs/git-rdm","owner":"ctjacobs","description":"A research data management plugin for the Git version control system.","archived":false,"fork":false,"pushed_at":"2017-10-24T08:58:07.000Z","size":43,"stargazers_count":33,"open_issues_count":0,"forks_count":1,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-06-27T02:38:46.724Z","etag":null,"topics":["curation","data","datasets","git","open-data","open-science","publishing","research-data-management","version-control"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ctjacobs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS.md","dei":null,"publiccode":null,"codemeta":"codemeta.json","zenodo":null}},"created_at":"2016-06-16T22:48:36.000Z","updated_at":"2025-03-22T08:14:48.000Z","dependencies_parsed_at":"2023-06-06T20:15:23.904Z","dependency_job_id":null,"html_url":"https://github.com/ctjacobs/git-rdm","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/ctjacobs/git-rdm","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ctjacobs%2Fgit-rdm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ctjacobs%2Fgit-rdm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ctjacobs%2Fgit-rdm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories
/ctjacobs%2Fgit-rdm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ctjacobs","download_url":"https://codeload.github.com/ctjacobs/git-rdm/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ctjacobs%2Fgit-rdm/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264184707,"owners_count":23569942,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["curation","data","datasets","git","open-data","open-science","publishing","research-data-management","version-control"],"created_at":"2024-08-13T07:04:52.097Z","updated_at":"2026-01-21T17:30:49.030Z","avatar_url":"https://github.com/ctjacobs.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Git-RDM\n\nGit-RDM is a Research Data Management (RDM) plugin for the [Git](https://git-scm.com/) version control system. It interfaces Git with data hosting services to manage the curation of version controlled files using persistent, citable repositories. This facilitates the sharing of research outputs and encourages a more open workflow within the research community.\n\nMuch like the standard Git commands, Git-RDM allows users to add/remove files within a 'publication staging area'. When ready, users can readily publish these staged files to a data repository hosted either by Figshare or Zenodo via the command line. 
Details of the files and their associated publication(s) are then recorded in a local SQLite database, including the specific Git revision (in the form of a SHA-1 hash), publication date/time, and the DOI, such that a full history of data publication is maintained.\n\n## Dependencies\n\nGit-RDM mostly relies on the standard Python modules and, of course, Git. However, two extra modules are needed:\n\n* [GitPython](https://gitpython.readthedocs.io), to access the Git repository's information.\n* [PyRDM](https://github.com/pyrdm/pyrdm), to handle the publishing of files.\n\nBoth of these dependencies can be installed via `pip` using\n\n```\nsudo pip install -r requirements.txt\n```\n\nNote that once PyRDM is installed, you will need to set up Figshare/Zenodo authentication tokens and copy them into the PyRDM configuration file in order to publish your data. See the [PyRDM documentation](https://pyrdm.readthedocs.io/en/latest/getting_started.html) for instructions on how to do this.\n\n## Installing\n\nAfter downloading or cloning this software using\n\n```\ngit clone https://github.com/ctjacobs/git-rdm.git\n```\n\na system-wide installation can be achieved by navigating to the git-rdm directory\n\n```\ncd git-rdm\n```\n\nand running\n\n```\nsudo python setup.py install\n```\n\nAlternatively, a local user installation can be achieved using\n\n```\npython setup.py install --prefix=/path/to/custom/install/directory\n```\n\nand adding `/path/to/custom/install/directory/bin` to the `PATH` environment variable:\n\n```\nexport PATH=$PATH:/path/to/custom/install/directory/bin\n```\n\nOnce Git-RDM is installed, Git should automatically detect the plugin and recognise the `rdm` command; for example, run `git rdm -h` to list the RDM-related subcommands described in the Usage section below.\n\n## Usage\n\nThe Git-RDM plugin comes with several subcommands. The following subsections demonstrate, with examples, how to use each of them. 
\n\n### git rdm init\n\nIn order to start using Git-RDM, the command `git rdm init` must first be run within the Git repository containing the data files to be published. This creates a new directory called `.rdm` containing a database file `publications.db`. All data publication details are stored within this file. Note that this command is similar to `git init` which initialises a new Git repository and creates the `.git` control directory. As an example, consider the `test` directory below, containing files `test1.txt`, `test2.txt` and `test3.png`:\n\n```\n~/test $ git rdm init\n~/test $ ls -lrta\ntotal 68\ndrwx------ 60 christian christian 20480 Jun 12 23:39 ..\n-rw-r--r--  1 christian christian     5 Jun 12 23:39 test1.txt\n-rw-r--r--  1 christian christian     5 Jun 12 23:39 test2.txt\n-rw-r--r--  1 christian christian     5 Jun 12 23:39 test3.png\ndrwxr-xr-x  7 christian christian  4096 Jun 12 23:40 .git\ndrwxr-xr-x  4 christian christian  4096 Jun 12 23:40 .\ndrwxr-xr-x  2 christian christian  4096 Jun 12 23:40 .rdm\n```\n\n### git rdm add\n\nOnce the RDM database has been initialised, data files may be added to the 'publication staging area' using `git rdm add` as follows:\n\n```\n~/test $ git rdm add test*\n~/test $ git rdm ls\ngit-rdm INFO: Files staged for publishing:\ngit-rdm INFO: \t/home/christian/test/test1.txt\ngit-rdm INFO: \t/home/christian/test/test3.png\ngit-rdm INFO: \t/home/christian/test/test2.txt\n```\n\nThe file being added for publication must first have been committed within the Git repository, otherwise Git-RDM will refuse to add it.\n\n### git rdm rm\n\nFiles can also be removed from the publication staging area using `git rdm rm`:\n\n```\n~/test $ git rdm rm test*\n```\n\n### git rdm publish\n\nOnce all the files are ready to be published, the `git rdm publish` command can be used to publish the files to a data repository hosted by a particular service. 
The hosting service must be specified as an argument, and can be either `figshare` or `zenodo`. Support for new services can be readily added by extending the [PyRDM library](https://pyrdm.readthedocs.io). Some basic publication information is obtained from the user, for example the title, description, and keyword metadata. PyRDM then interfaces with the hosting service and publishes the data files:\n\n```\n~/test $ git rdm publish figshare\nPrivate publication? (y/n): y\ngit-rdm INFO: Publishing as a private repository...\nTitle: Test Article\nDescription: Testing \nTags/keywords (in list format [\"a\", \"b\", \"c\"]): [\"hello\", \"world\"]\npyrdm.figshare INFO: Testing Figshare authentication...\npyrdm.figshare DEBUG: Server returned response 200\npyrdm.figshare INFO: Authentication test successful.\n\npyrdm.publisher INFO: Publishing data...\npyrdm.publisher INFO: Creating new fileset...\npyrdm.publisher INFO: Adding category...\npyrdm.publisher INFO: Fileset created with ID: 3428222 and DOI: 10.6084/m9.figshare.3428222\npyrdm.publisher DEBUG: The following files have been marked for uploading: ['/home/christian/test/test1.txt', '/home/christian/test/test3.png', '/home/christian/test/test2.txt']\npyrdm.publisher INFO: Uploading /home/christian/test/test1.txt...\npyrdm.publisher INFO: Uploading /home/christian/test/test3.png...\npyrdm.publisher INFO: Uploading /home/christian/test/test2.txt...\npyrdm.publisher INFO: All files successfully uploaded.\n```\n\nThe publication information is stored in the local database, and can be viewed using `git rdm ls`. Note that Git-RDM currently publishes the files using the current `HEAD` revision of the Git repository, and not the revision at which the files were first added using `git rdm add`.\n\n### git rdm ls\n\n`git rdm ls` is used to list and keep track of which data files have been published, and which files are still in the staging area. 
Users can choose to list each file, followed by any DOIs associated with it (by default) as follows:\n\n```\n~/test $ git rdm ls\ngit-rdm INFO: Published files:\ngit-rdm INFO: \t/home/christian/test/test1.txt\ngit-rdm INFO: \t\t10.6084/m9.figshare.3428222 (2016-06-13 @ 00:29:03, revision '1eeccabba810b8c91eef82e692713fdb05ca4a32')\ngit-rdm INFO: \t/home/christian/test/test2.txt\ngit-rdm INFO: \t\t10.6084/m9.figshare.3428222 (2016-06-13 @ 00:29:03, revision '1eeccabba810b8c91eef82e692713fdb05ca4a32')\ngit-rdm INFO: \t/home/christian/test/test3.png\ngit-rdm INFO: \t\t10.6084/m9.figshare.3428222 (2016-06-13 @ 00:29:03, revision '1eeccabba810b8c91eef82e692713fdb05ca4a32')\n```\n\nUsers can also choose to list the DOIs first and the files associated with it afterwards:\n\n```\n~/test $ git rdm ls --by-doi\ngit-rdm INFO: Published files:\ngit-rdm INFO: \t10.6084/m9.figshare.3428222\ngit-rdm INFO: \t\t/home/christian/test/test1.txt (2016-06-13 @ 00:29:03, revision '1eeccabba810b8c91eef82e692713fdb05ca4a32')\ngit-rdm INFO: \t\t/home/christian/test/test3.png (2016-06-13 @ 00:29:03, revision '1eeccabba810b8c91eef82e692713fdb05ca4a32')\ngit-rdm INFO: \t\t/home/christian/test/test2.txt (2016-06-13 @ 00:29:03, revision '1eeccabba810b8c91eef82e692713fdb05ca4a32')\n```\n\nTo check the raw, unformatted contents of the entire publications database, use the `--raw` flag:\n\n```\n~/test $ git rdm ls --raw\ngit-rdm INFO: Database dump:\ngit-rdm INFO: id, path, date, time, sha, pid, doi\ngit-rdm INFO: 13, /home/christian/test/test1.txt, 2016-06-13, 00:29:03.016951, 1eeccabba810b8c91eef82e692713fdb05ca4a32, 3428222, 10.6084/m9.figshare.3428222\ngit-rdm INFO: 14, /home/christian/test/test3.png, 2016-06-13, 00:29:03.016951, 1eeccabba810b8c91eef82e692713fdb05ca4a32, 3428222, 10.6084/m9.figshare.3428222\ngit-rdm INFO: 15, /home/christian/test/test2.txt, 2016-06-13, 00:29:03.016951, 1eeccabba810b8c91eef82e692713fdb05ca4a32, 3428222, 10.6084/m9.figshare.3428222\n```\n\n### git rdm show\n\nThe 
full publication record maintained by the data repository service can be shown using `git rdm show`. It expects two arguments: the name of the hosting service (`figshare` or `zenodo`) and the publication ID. For example, for the publication whose Figshare publication ID is 3428222 (and DOI is `10.6084/m9.figshare.3428222`), the (truncated) output is:\n\n```\n~/test $ git rdm show figshare 3428222\npyrdm.figshare INFO: Testing Figshare authentication...\npyrdm.figshare DEBUG: Server returned response 200\npyrdm.figshare INFO: Authentication test successful.\n\ngit-rdm INFO: {\n    \"authors\": [\n        {\n            \"full_name\": \"Christian T. Jacobs\",\n            \"id\": 554577,\n            \"is_active\": true,\n            \"orcid_id\": \"0000-0002-0034-4650\",\n            \"url_name\": \"Christian_T_Jacobs\"\n        }\n    ],\n    \"categories\": [\n        {\n            \"id\": 2,\n            \"title\": \"Uncategorized\"\n        }\n    ],\n    \"citation\": \"Jacobs, Christian T. (): Test Article. figshare.\\n 10.6084/m9.figshare.3428222\\n Retrieved: 23 32, Jun 12, 2016 (GMT)\",\n    \"confidential_reason\": \"\",\n    \"created_date\": \"2016-06-12T23:28:54Z\",\n    \"custom_fields\": [],\n    \"defined_type\": 4,\n    \"description\": \"Testing\",\n    \"doi\": \"10.6084/m9.figshare.3428222\",\n```\n\n## License\nThis software is released under the MIT license. See the file called `LICENSE` for more information.\n\n## Citing\n\nIf you use Git-RDM during the course of your research, please consider citing the following paper:\n\n* C. T. Jacobs, A. Avdis (2016). Git-RDM: A research data management plugin for the Git version control system. 
*The Journal of Open Source Software*, 1(2), DOI: [10.21105/joss.00029](http://dx.doi.org/10.21105/joss.00029)\n\n## Contact\nPlease send any questions or comments about Git-RDM via email to \u003cC.T.Jacobs@soton.ac.uk\u003e.\n\nAny bugs should be reported using the project's [issue tracker](http://github.com/ctjacobs/git-rdm/issues). If possible, please run Git-RDM with debugging enabled using the `-d` flag after `git rdm` (e.g. `git rdm -d publish figshare`) and provide the full output.\n\nContributions are welcome and should be made via a pull request.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fctjacobs%2Fgit-rdm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fctjacobs%2Fgit-rdm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fctjacobs%2Fgit-rdm/lists"}