{"id":21375381,"url":"https://github.com/lsst/rucio_register","last_synced_at":"2025-10-25T16:41:09.543Z","repository":{"id":210272986,"uuid":"726175454","full_name":"lsst/rucio_register","owner":"lsst","description":"Routines and commands to add Butler specific information to Rucio metadata","archived":false,"fork":false,"pushed_at":"2024-10-24T09:07:19.000Z","size":218,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":10,"default_branch":"main","last_synced_at":"2024-10-25T01:51:04.559Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lsst.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-12-01T17:39:04.000Z","updated_at":"2024-10-14T18:26:54.000Z","dependencies_parsed_at":"2023-12-12T17:52:24.599Z","dependency_job_id":"c68123bf-d7d1-466b-ad46-d9cadf8575ec","html_url":"https://github.com/lsst/rucio_register","commit_stats":null,"previous_names":["lsst-dm/dm_replica","lsst/rucio_register"],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lsst%2Frucio_register","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lsst%2Frucio_register/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lsst%2Frucio_register/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lsst%2Frucio_register/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
owners/lsst","download_url":"https://codeload.github.com/lsst/rucio_register/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225871094,"owners_count":17537173,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-22T09:10:11.636Z","updated_at":"2025-10-25T16:41:04.504Z","avatar_url":"https://github.com/lsst.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# rucio-register\nCommand and API to add Butler specific information to Rucio metadata.\n\nThis is a guide to using the rucio-register command for registering\nButler files with Rucio.\n\nButler files are expected to be located in a Rucio directory structure,\nbelow a directory named for a Rucio scope. For example, if the root of\nthe Rucio directory is \"/rucio/disks/xrd1/rucio\" and the Rucio scope\nis \"test\", the files should be located below \"/rucio/disks/xrd1/rucio/test\".\n\n\n## Example\n\nThe command  \"rucio-register\" registers files with Rucio. This\ncommand requires a YAML configuration file which specifies the Rucio rse and\nscope, as well as the root of the directory where files are deposited,\nand the external reference to the Rucio RSE. 
This configuration file\ncan be specified on the command line, or in the environment\nvariable **RUCIO_REGISTER_CONFIG**.\n\nThe command can register data products, raws, zip files, or dimension record YAML files:\n\nfor data products:\n```\nrucio-register data-products --log-level INFO -r /rucio/disks/xrd1/rucio/test -c HSC/runs/RC2/w_2023_32/DM-40356/20230814T170253Z -t visitSummary -d rubin_dataset -C register_config.yaml\n```\n\nfor raws:\n```\nrucio-register raws --log-level INFO -r /rucio/disks/xrd1/rucio/test -d rubin_dataset --collections LATISS/raw/all -C register_config.yaml \\*\n```\nNote that for raws, this is similar to how one uses the butler command.\n\nThis command looks for files registered in the butler repo \"/repo/main\"\nusing the \"dataset-type\" and \"collections\" arguments to query the butler. Note\nthat the repo name's suffix is the Rucio \"scope\". In this example, that scope\nis \"main\".\n\nThe resulting datasets' files are registered with Rucio, as specified in\nthe \"config.yaml\" file. Additionally, those files are registered with the\nRucio dataset specified by the \"rucio-dataset\" argument.\n\nfor zip files:\n```\nrucio-register zips -d rubin_dataset --log-level INFO -C /home/lsst/rucio_register/examples/register_config.yaml --zip-file file:///rucio/disks/xrd1/rucio/test/something/2c8f9e54-9757-54c0-9119-4c3ac812a2da.zip\n```\nNote for zip files, register a single zip file at a time.\n\nfor dimension record YAML files:\n```\nrucio-register dimensions -d rubin_dataset --log-level INFO -C /home/lsst/rucio_register/examples/register_config.yaml --dimension-file file:///rucio/disks/xrd1/rucio/test/something/dimensions.yaml\n```\n\n\n\n## config.yaml\n\nThe config.yaml file includes information which specifies the Rucio RSE\nto use, the Rucio scope, the local root of the RSE, and the URL prefix\nof the location where Rucio stores the files.\n\n\n```\nrucio_rse: \"XRD1\"\nscope: \"main\"\nrse_root: 
\"/rucio/disks/xrd1/rucio\"\ndtn_url: \"root://xrd1:1094//rucio\"\n```\n\n\n# export-datasets\nCommand to dump Butler dataset, dimension, and calibration validity range data to a YAML file.\n\nThis command works alongside \"rucio-register\".\nIt can be used to record all the files registered into Rucio so that their transfer and ingestion at the destination can be confirmed.\nIn addition, it preserves dimension data and calibration validity range data that is not otherwise transferred via Rucio.\nThis additional data can be useful for repeated ingests of raw and calibration data into Butler repositories.\n\n## Examples\n\nTo record the dimension values (notably _not_ including the visit dimension, which would have to be regenerated) for a set of raw images:\n\n```\nexport-datasets \\\n    --root /sdf/group/rubin/lsstdata/offline/instrument/ \\\n    --filename Dataset-LSSTCam-NoTract-20250101-0000.yaml \\\n    --collections LSSTCam/raw/all \\\n    --where \"instrument='LSSTCam' and day_obs=20250101 and exposure.seq_num IN (1..99)\" \\\n    --limit 30000 \\\n    /repo/main raw\n```\n`--root` is needed here since the original files are ingested as full URLs with `direct`.\n\nTo record the datasets created by a multi-site processing workflow:\n\n```\nexport-datasets \\\n    --filename Dataset-LSSTCam-Tract2024-Step3-Group5-metadata.yaml \\\n    --collections step3/group5 \\\n    --where \"tract=2024\" \\\n    $LOCAL_REPO '*_metadata'\n```\nNote the use of a glob pattern to select dataset types of interest.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flsst%2Frucio_register","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flsst%2Frucio_register","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flsst%2Frucio_register/lists"}