{"id":23442297,"url":"https://github.com/repronim/reproin","last_synced_at":"2025-04-03T03:11:08.070Z","repository":{"id":70236734,"uuid":"120343858","full_name":"ReproNim/reproin","owner":"ReproNim","description":"A setup for automatic generation of shareable, version-controlled BIDS datasets from MR scanners","archived":false,"fork":false,"pushed_at":"2025-03-21T15:35:45.000Z","size":1730,"stargazers_count":49,"open_issues_count":40,"forks_count":15,"subscribers_count":10,"default_branch":"master","last_synced_at":"2025-03-24T08:44:15.632Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ReproNim.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-02-05T18:24:04.000Z","updated_at":"2025-03-21T15:35:49.000Z","dependencies_parsed_at":"2024-12-23T17:30:14.705Z","dependency_job_id":"c343c3f3-4c2b-4a8b-b7c6-add37ba724df","html_url":"https://github.com/ReproNim/reproin","commit_stats":{"total_commits":115,"total_committers":4,"mean_commits":28.75,"dds":"0.31304347826086953","last_synced_commit":"b3bbe0e21919e6d3eb407dbc15d66d2821ee2b0e"},"previous_names":[],"tags_count":16,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ReproNim%2Freproin","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ReproNim%2Freproin/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ReproNim%2Freproin/releases","manifests_url":"https://repos.ecosyste.ms/api
/v1/hosts/GitHub/repositories/ReproNim%2Freproin/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ReproNim","download_url":"https://codeload.github.com/ReproNim/reproin/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246927835,"owners_count":20856198,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-23T17:29:18.153Z","updated_at":"2025-04-03T03:11:08.044Z","avatar_url":"https://github.com/ReproNim.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![DOI](https://zenodo.org/badge/120343858.svg)](https://zenodo.org/badge/latestdoi/120343858)\n\n# ReproIn\n\nThis project is a part of the [ReproNim Center](http://ReproNim.org)\nsuite of tools and frameworks.  Its goal is to provide a\nturnkey flexible setup for automatic generation of shareable,\nversion-controlled BIDS datasets from MR scanners.  To not reinvent the wheel,\nall actual software development is largely done through contribution to\nexisting software projects:\n\n- [HeuDiConv]:\n  a flexible DICOM converter for organizing brain imaging data into structured\n  directory layouts.\n  ReproIn [heuristic] was developed and now is shipped within HeuDiConv,\n  so it could be used independently of the ReproIn setup on any HeuDiConv\n  installation (specify `-f reproin` to heudiconv call).\n- [DataLad]:\n  a modular version control platform and distribution for both code and\n  data.  
DataLad support was contributed to HeuDiConv, and can be\n  enabled by adding the `--datalad` option to the `heudiconv` call.\n\n## Specification\n\nThe header of the [heuristic] file describes details of the\nspecification on how to organize and name study sequences at the MR console.\n\nIf you would like to use a GUI for crafting the names, consider using [@NPACore's](https://github.com/NPACore) [ReproIn namer](https://npacore.github.io/reproin-namer/#) website.\n\n## Overall workflow\n\nSchematic description of the overall setup:\n\n![Setup](docs/source/images/dbic-flow.png)\n\n**Note:** for your own setup, [dcm2niix](https://github.com/rordenlab/dcm2niix)\n[author](https://github.com/neurolabusc)\n[recommends](https://github.com/neurolabusc/dcm_qa_agfa) avoiding dcm4che and\nchoosing another PACS.\n\n![Setup](docs/source/images/dbic-conversions.png)\n\n## Tutorial/HOWTO\n\n### Data collection\n\n#### Making your sequence compatible with ReproIn heuristic\n\n- [Walkthrough #1](docs/walkthrough-1.md): guides you through\nthe ReproIn approach to organizing exam cards and managing canceled runs/sessions\non Siemens scanner(s)\n\n#### Renaming sequences to conform to the specification needed by ReproIn\n\nTODO: Describe how sequences could be renamed per study by creating a derived\nheuristic\n\n### Conversion\n\n1. 
Install [HeuDiConv] and [DataLad]: e.g.\n   `apt-get update; apt-get install heudiconv datalad` in any NeuroDebian environment.\n   If you do not have one, you could use any of\n   - [NeuroDebian Virtual Machine](http://neuro.debian.net/vm.html)\n   - ReproIn Docker image: `docker run -it --rm -v $PWD:$PWD repronim/reproin`\n   - ReproIn Singularity image: you can either\n     - convert from the docker image: `singularity pull docker://repronim/reproin`\n     - download the most recent version from\n       http://datasets.datalad.org/?dir=/repronim/containers/images/repronim\n       which is a DataLad dataset that you can install via `datalad install ///repronim/containers`\n       (see/subscribe https://github.com/ReproNim/reproin/issues/64\n       for a HOWTO on setting up a YODA-style dataset)\n2. Collect a subject/session (or multiple of them) while placing and\n   naming sequences in the scanner following the [specification].\n   But for now we will assume that you have no such dataset yet, and\n   want to try on phantom data:\n\n        datalad install -J3 -r -g ///dicoms/dartmouth-phantoms/bids_test4-20161014\n\n   to get all subdatasets recursively, while getting the data as well\n   in 3 parallel streams.\n   This dataset is a sample of multi-session acquisition with anatomicals and\n   functional sequences on a friendly phantom impersonating two different\n   subjects (note: fieldmaps were deficient, without magnitude images).\n   You could also try other datasets such as [///dbic/QA].\n\n3. We are ready to convert all the data at once (heudiconv will sort\n   into accessions) or one accession at a time.\n   The recommended invocation of heudiconv is\n\n        heudiconv -f reproin --bids --datalad -o OUTPUT --files INPUT\n\n   to convert all DICOMs found in `INPUT` and place them within the\n   hierarchy of DataLad datasets rooted at `OUTPUT`.  
So we will start\n   with a single accession of `phantom-1/`\n\n        heudiconv -f reproin --bids --datalad -o OUTPUT --files bids_test4-20161014/phantom-1\n\n   and inspect the result under OUTPUT, probably best with `datalad ls`\n   command:\n\n        ... WiP ...\n\n\n\n#### HeuDiConv options to overload autodetected variables:\n\n- `--subject`\n- `--session`\n- `--locator`\n\n\n\n## Sample converted datasets\n\nYou could find sample datasets with original DICOMs\n\n- [///dbic/QA] is a publicly\n  available DataLad dataset with historical data on QA scans from DBIC.\n  You could use DICOM tarballs under `sourcedata/` for your sample\n  conversions.\n  TODO: add information from which date it is with scout DICOMs having\n  session identifier\n- [///dicoms/dartmouth-phantoms](http://datasets.datalad.org/?dir=/dicoms/dartmouth-phantoms)\n  provides a collection of datasets acquired at [DBIC] to establish\n  ReproIn specification.  Some earlier accessions might not be following\n  the specification.\n  [bids_test4-20161014](http://datasets.datalad.org/?dir=/dicoms/dartmouth-phantoms/bids_test4-20161014)\n  provides a basic example of multi-subject and multi-session acquisition.\n\n## Containers/Images etc\n\nThis repository provides a [Singularity](./Singularity) environment\ndefinition file used to generate a complete environment needed to run\na conversion.  But also, since all work is integrated within the\ntools, any environment providing them would suffice, such as\n[NeuroDebian](https://neuro.debian.net) docker or Singularity images, virtual appliances, and\nother Debian-based systems with NeuroDebian repositories configured,\nwhich would provide all necessary for ReproIn setup components.\n\n## Getting started from scratch\n\n### Setup environment\n\nreproin script relies on having datalad, datalad-containers, and singularity\navailable.  The simplest way to get them all is to install a conda\ndistribution, e.g. 
miniforge ([link for\namd64](https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh)),\nand setup the environment with all components installed:\n\n    mamba create -n reproin -y datalad datalad-container singularity\n\nNote that in future sessions you will need to activate this environment:\n\n    mamba activate reproin\n    \nThen make sure you have your git configured. If `git config --list` does not \ninclude these entries, add (adjust to fit your persona)\n\n    git config --global user.name  \"My Name\"\n    git config --global user.email  \"MyName@example.com\"\n\nand install the ReproNim/containers\n\n    datalad clone https://github.com/ReproNim/containers repronim-containers\n    cd repronim-containers\n\nwhich would clone the dataset from GitHub and auto-enable datasets.datalad.org\nremote to actually get annexed content of the images.\nNow fetch the image for the most recent version of reproin from under images/repronim, e.g.\n\n    datalad get images/repronim/repronim-reproin--0.13.1.sing\n    cd ..\n\n### \"Install\" reproin script\n\nThe singularity image we fetched already comes with reproin installed inside,\nbut to \"drive\" conversion we need to have `reproin` available in the base\nenvironment.  Because we do not have it (yet) packaged for conda\ndistribution, we will just clone this repository and gain access to the script:\n\n    git clone https://github.com/ReproNim/reproin\n\nTo avoid typing the full path to the `reproin` script, can do \n\n    export \"PATH=$PWD/reproin/bin/:$PATH\"\n\nto place it in the PATH.\n\nNB.  
It is important ATM to not just `cp` that `reproin` script elsewhere\nbecause it relies on being able to find other resources made available in that\nrepository (e.g., `cfg_reproin_bids.py`).\n\n### \"Configure\" the reproin setup\n\nCurrently `reproin` script hardcodes the path to DICOMS to reside under\n`/inbox/DICOM` and extracted lists and converted data to reside under\n`/inbox/BIDS`.\nIt is possible to overload location for BIDS via `BIDS_DIR` env variable, so\nwe can do e.g.\n\n    export BIDS_DIR=$HOME/BIDS-demo\n\nand then let's create the top-level datalad dataset to contain all converted\ndata, configuring to store text files in git rather than git-annex,\n\n    datalad create -c text2git \"$BIDS_DIR\"\n\n### Collect DICOMs listing\n\nATM reproin container has an older version of the script, so to use newer version we would just bind mount our cloned script inside,\n\n    singularity run -e -c \\\n       --env BIDS_DIR=$BIDS_DIR \\\n       -B $HOME/reproin/bin/reproin:/usr/local/bin/reproin \\\n       -B /inbox/DICOM:/inbox/DICOM:ro \\\n       -B $BIDS_DIR:$BIDS_DIR \\\n       ~/repronim-containers/images/repronim/repronim-reproin--0.13.1.sing lists-update-study-shows\n\nwhich should output summary over the studies it found under /inbox/DICOM, e.g.\n\n    dbic/QA: new=16 no studydir yet\n    PI/Researcher/1110_SuperCool: new=12 no studydir yet\n\nand you should see a file appeared for the current  year and month under `$BIDS_DIR/reproin/lists`.\n\n### Create target dataset\n\nNow we can create \"studydir\" for the study of interest, e.g.\n\n    reproin study-create dbic/QA\n\nwhich would\n\n- create target BIDS dataset within the hierarchy\n- install repronim/containers borrowing the image from the `~/repronim-containers`\n- rerun `study-show` to output summary over the current state like\n\n    todo=4 done=0 /afs/.dbic.dartmouth.edu/usr/haxby/yoh/BIDS-demo/dbic/QA/.git/study-show.sh 2024-11-11\n\n### Convert the dataset\n\nGo to the folder of the 
dataset, e.g. \n\n    cd \"$BIDS_DIR/dbic/QA\"\n\nto see that `reproin` has pre-configured everything needed to run conversion (`cat .datalad/config`).\nAnd now you should be able to run conversion for your study via the \"datalad-container\"\nextension:\n\n    datalad containers-run -n repronim-reproin study-convert dbic/QA\n\n\n## Gotchas\n\n\n## Complete setup at DBIC\n\nIt relies on the locations and organization of DICOMs, and on the location of\nconverted BIDS datasets, which are hardcoded ATM in `reproin`.\n\n- `/inbox/DICOM/{YEAR}/{MONTH}/{DAY}/A00{ACCESSION}`\n- `/inbox/BIDS/{PI}/{RESEARCHER}/{ID}_{name}/`\n\n### CRON job\n\n```\n# m h  dom mon dow   command\n55 */12 * * * $HOME/reproin-env-0.9.0 -c '~/proj/reproin/bin/reproin lists-update-study-shows' \u0026\u0026 curl -fsS -m 10 --retry 5 -o /dev/null https://hc-ping.com/61dfdedd-SENSORED\n```\n\nNB: the `curl` at the end makes use of https://healthchecks.io\nto ensure that the CRON job ran as expected.\n\nATM we reuse a singularity environment based on reproin 0.9.0 produced from this repo and shipped within ReproNim/containers. For completeness' sake\n\n```shell\n(reproin-3.8) [bids@rolando lists] \u003e cat $HOME/reproin-env-0.9.0\n#!/bin/sh\n\nenv -i /usr/local/bin/singularity exec -B /inbox -B /afs -H $HOME/singularity_home $(dirname $0)/reproin_0.9.0.simg /bin/bash \"$@\"\n```\n\nwhich produces emails with content like\n\n```\nWager/Wager/1102_MedMap: new=92 todo=5 done=102 /inbox/BIDS/Wager/Wager/1102_MedMap/.git/study-show.sh 2023-03-30\nPI/Researcher/ID_name: new=32 no studydir yet\nHaxby/Jane/1073_MonkeyKingdom: new=4 todo=39 done=8  fixups=6 /inbox/BIDS/Haxby/Jane/1073_MonkeyKingdom/.git/study-show.sh 2023-03-30\n```\n\nwhere, as you can see, it reports the status of each study scanned since the\nbeginning of the current month. 
Each line for an existing study ends with a pointer to the `study-show.sh` script, which\nwould provide details on what is already converted and the heudiconv invocations for what is yet to be done.\n\n### reproin study-create\n\nFor the \"no studydir yet\" studies we first need to generate the study dataset (and\npossibly all leading `PI/Researcher` super-datasets) via\n\n```shell\nreproin study-create PI/Researcher/ID_name\n```\n\n### reproin study-convert\n\nUnless some warnings/conflicts (subject/session already\nconverted, etc.) are found,\n\n```shell\nreproin study-convert PI/Researcher/ID_name\n```\n\ncould be used to convert all new subject/sessions for that study.\n\n### XNAT\n\nAnonymization or other scripts might obfuscate \"Study Description\" thus ruining\n\"locator\" assignment.  See \n[issue #57](https://github.com/ReproNim/reproin/issues/57) for more information.\n\n## TODOs/WiP/Related\n\n- [ ] add a pre-configured DICOM receiver for fully turnkey deployments\n- [ ] [heudiconv-monitor] to fully automate conversion of the incoming\n      data\n- [ ] [BIDS dataset manipulation helper](https://github.com/INCF/bidsutils/issues/6)\n\n[HeuDiConv]: https://github.com/nipy/heudiconv\n[DataLad]: http://datalad.org\n[heuristic]: https://github.com/nipy/heudiconv/blob/master/heudiconv/heuristics/reproin.py\n[specification]: https://github.com/nipy/heudiconv/blob/master/heudiconv/heuristics/reproin.py\n[heudiconv-monitor]: https://github.com/nipy/heudiconv/blob/master/heudiconv/cli/monitor.py\n[DBIC]: http://dbic.dartmouth.edu\n[///dbic/QA]: http://datasets.datalad.org/?dir=/dbic/QA\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frepronim%2Freproin","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frepronim%2Freproin","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frepronim%2Freproin/lists"}