{"id":23473099,"url":"https://github.com/caltechlibrary/inveniordm-migrate","last_synced_at":"2025-04-13T05:50:09.073Z","repository":{"id":58810754,"uuid":"245236681","full_name":"caltechlibrary/inveniordm-migrate","owner":"caltechlibrary","description":"Scripts to migrate content into Invenio RDM","archived":false,"fork":false,"pushed_at":"2023-04-11T22:45:57.000Z","size":934,"stargazers_count":2,"open_issues_count":1,"forks_count":0,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-04-13T05:50:04.217Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/caltechlibrary.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGES.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-03-05T18:18:16.000Z","updated_at":"2022-11-08T17:11:08.000Z","dependencies_parsed_at":"2023-01-21T09:02:26.468Z","dependency_job_id":null,"html_url":"https://github.com/caltechlibrary/inveniordm-migrate","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":"caltechlibrary/template","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caltechlibrary%2Finveniordm-migrate","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caltechlibrary%2Finveniordm-migrate/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caltechlibrary%2Finveniordm-migrate/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caltechlibrary%2Finveniordm-migrate/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/caltechlibrary","down
load_url":"https://codeload.github.com/caltechlibrary/inveniordm-migrate/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248670501,"owners_count":21142901,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-24T17:15:14.289Z","updated_at":"2025-04-13T05:50:09.051Z","avatar_url":"https://github.com/caltechlibrary.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"Assorted scripts to migrate content to InvenioRDM and S3 data sources\n=====================================================\n\nThis repo holds scripts used to migrate content into InvenioRDM. 
These have\ngenerally been used for one-time migration activities, but may be useful in the\nfuture.\n\n[![License](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg?style=flat-square)](https://choosealicense.com/licenses/bsd-3-clause)\n[![Latest release](https://img.shields.io/github/v/release/caltechlibrary/inveniordm-migrate.svg?style=flat-square\u0026color=b44e88)](https://github.com/caltechlibrary/inveniordm-migrate/releases)\n\n\nTable of contents\n-----------------\n\n* [Usage](#usage)\n* [Getting help](#getting-help)\n* [License](#license)\n* [Authors and history](#authors-and-history)\n* [Acknowledgments](#acknowledgments)\n\n\nUsage\n-----\n\n\n### CaltechDATA\n\n`migrate_caltechdata.py` was used to move records from the TIND-managed\nInvenio instance to InvenioRDM.\n\n### CaltechTHESIS\n\n`migrate_caltechthesis.py` was used to create some minimal test records in\nInvenioRDM. It is not complete.\n\n### OSN Migration\n\nFor large collections of data we sometimes need to move the data first, and\nthen create InvenioRDM records. An S3 object store like the Open Storage\nNetwork is a great option. You can bulk move records efficiently with\n[s5cmd](https://github.com/peak/s5cmd) and the management scripts.\n\nRun `python make_command.py` to generate a list of files to sync. You'll need\nto set environment variables with\n\n```\nAWS_ACCESS_KEY_ID\nAWS_SECRET_ACCESS_KEY\nS3_ENDPOINT_URL https://renc.osn.xsede.org\nAWS_REGION us-east-1\n```\n\nThen run the command with \n`nohup ./s5cmd -numworkers 100 run commands.txt \u003e\u003e \u0026 log2017.txt ; echo Done \u003e\u003e \u0026 log2017.txt \u0026`.\nYou may be able to adjust the numworkers component depending on the OS.\n\n\nGetting help\n------------\n\nRaise an issue on the issue tracker.\n\n\nLicense\n-------\n\nSoftware produced by the Caltech Library is Copyright (C) 2023, Caltech.  This software is freely distributed under a BSD/MIT type license.  
Please see the [LICENSE](LICENSE) file for more information.\n\n\nAuthors and history\n---------------------------\n\nThese scripts were written by Tom Morrell.\n\nAcknowledgments\n---------------\n\nThis work was funded by the California Institute of Technology Library.\n\n\n\u003cdiv align=\"center\"\u003e\n  \u003cbr\u003e\n  \u003ca href=\"https://www.caltech.edu\"\u003e\n    \u003cimg width=\"100\" height=\"100\" src=\".graphics/caltech-round.png\"\u003e\n  \u003c/a\u003e\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcaltechlibrary%2Finveniordm-migrate","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcaltechlibrary%2Finveniordm-migrate","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcaltechlibrary%2Finveniordm-migrate/lists"}