{"id":34701828,"url":"https://github.com/hubmapconsortium/ingest-pipeline","last_synced_at":"2026-04-28T20:00:58.162Z","repository":{"id":37030676,"uuid":"212677995","full_name":"hubmapconsortium/ingest-pipeline","owner":"hubmapconsortium","description":"Data ingest pipeline(s) for QA/metadata etl/post-processing","archived":false,"fork":false,"pushed_at":"2026-04-22T19:57:37.000Z","size":4094,"stargazers_count":4,"open_issues_count":84,"forks_count":6,"subscribers_count":19,"default_branch":"devel","last_synced_at":"2026-04-22T21:32:27.932Z","etag":null,"topics":["ot2od030545"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hubmapconsortium.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2019-10-03T20:52:37.000Z","updated_at":"2026-04-22T03:02:51.000Z","dependencies_parsed_at":"2023-02-18T11:46:41.643Z","dependency_job_id":"acfd0c09-ccae-4663-be84-1f937e5882b4","html_url":"https://github.com/hubmapconsortium/ingest-pipeline","commit_stats":null,"previous_names":[],"tags_count":264,"template":false,"template_full_name":null,"purl":"pkg:github/hubmapconsortium/ingest-pipeline","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hubmapconsortium%2Fingest-pipeline","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hubmapconsortium%2Fingest-pipeline/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositor
ies/hubmapconsortium%2Fingest-pipeline/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hubmapconsortium%2Fingest-pipeline/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hubmapconsortium","download_url":"https://codeload.github.com/hubmapconsortium/ingest-pipeline/tar.gz/refs/heads/devel","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hubmapconsortium%2Fingest-pipeline/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32396781,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-28T19:38:08.556Z","status":"ssl_error","status_checked_at":"2026-04-28T19:37:55.688Z","response_time":56,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ot2od030545"],"created_at":"2025-12-24T22:53:06.505Z","updated_at":"2026-04-28T20:00:58.116Z","avatar_url":"https://github.com/hubmapconsortium.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# HuBMAP Ingest Pipeline\n\n## About\n\nThis repository implements the internals of the HuBMAP data repository\nprocessing pipeline. This code is independent of the UI but works in\nresponse to requests from the data-ingest UI backend.\n\n## Using the devtest assay type\n\n*devtest* is a mock assay for use by developers.  
It provides a testing tool controlled by a simple YAML file, allowing a developer to simulate execution of a full ingest pipeline without the need for real data.  To do a devtest run, follow this procedure.\n\n1) Create an input dataset, for example using the ingest UI.\n  - It must have a valid Source ID.\n  - Its datatype must be Other -\u003e devtest\n2) Insert a control file named *test.yml* into the top-level directory of the dataset.  The file format is described below.  You may include any other files in the directory, as long as test.yml exists.\n3) Submit the dataset.\n\nIngest operations will proceed normally from that point:\n1) The state of the original dataset will change from New through Processing to QA.\n2) A secondary dataset will be created, and will move through Processing to QA with an adjustable delay (see below).\n3) Files specified in *test.yml* may be copied into the dataset directory of the secondary dataset.\n4) All normal metadata will be returned, including extra metadata specified in *test.yml* (see below).\n\nThe format for *test.yml* is:\n```\n{\n  # the following line is required for the submission to be properly identified as assay 'devtest'\n  collectiontype: devtest,\n  \n  # The pipeline_exec stage will delay for this many seconds before returning (default 30 seconds)\n  delay_sec: 120,\n  \n  # If this list is present, the listed files will be copied from the submission directory to the derived dataset.\n  files_to_copy: [\"file_068.bov\", \"file_068.doubles\"],\n  \n  # If present, the given metadata will be returned as dataset metadata for the derived dataset.\n  metadata_to_return: {\n    mymessage: 'hello world',\n    othermessage: 'and also this'\n  }\n}\n\n```\n\n## API\n\n| \u003cstrong\u003eAPI Test\u003c/strong\u003e         |                                          |\n|------------------|------------------------------------------|\n| Description      | Test that the API is available           |\n| HTTP Method      | 
GET                                      |\n| Example URL      | /api/hubmap/test                         |\n| URL Parameters   | None                                     |\n| Data Parameters  | None                                     |\n| Success Response | Code: 200\u003cbr\u003e Content: {\"api_is_alive\":true} |\n| Error Responses  | None                                     |\n\n| \u003cstrong\u003eGet Process Strings\u003c/strong\u003e         |                                          |\n|------------------|------------------------------------------|\n| Description      | Get a list of valid process identifier keys            |\n| HTTP Method      | GET                                      |\n| Example URL      | /api/hubmap/get_process_strings                         |\n| URL Parameters   | None                                     |\n| Data Parameters  | None                                     |\n| Success Response | Code: 200\u003cbr\u003e Content: {\"process_strings\":[...list of keys...]} |\n| Error Responses  | None                                     |\n\n| \u003cstrong\u003eGet Version Information\u003c/strong\u003e         |                                          |\n|------------------|------------------------------------------|\n| Description      | Get API version information           |\n| HTTP Method      | GET                                      |\n| Example URL      | /api/hubmap/version                       |\n| URL Parameters   | None                                     |\n| Data Parameters  | None                                     |\n| Success Response | Code: 200\u003cbr\u003e Content: {\"api\":API version, \"build\":build version} |\n| Error Responses  | None                                     |\n\n| \u003cstrong\u003eRequest Ingest\u003c/strong\u003e   |                                                                                                                                                                                     
                                                                                                                                                                                                                                                                                                                                                                                                                                                            |\n|------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|\n| Description      | Cause a workflow to be applied to a dataset in the LZ. The full dataset path is computed from the data parameters.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
|\n| HTTP Method      | POST                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |\n| Example URL      | /api/hubmap/request_ingest                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |\n| URL Parameters   | None                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |\n| Data Parameters  | provider : one of a known set 
of providers, e.g. 'Vanderbilt'\u003cbr\u003e submission_id : unique identifier string for the data submission\u003cbr\u003e process : one of a known set of process names, e.g. 'MICROSCOPY.IMS.ALL'                                                                                                                                                                                                                                                                                                                                                                                                                                                      |\n| Success Response | Code: 200\u003cbr\u003e Content:{\u003cbr\u003e\"ingest_id\":\"some_unique_string\",\u003cbr\u003e \"run_id\":\"some_other_unique_string\"\u003cbr\u003e}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |\n| Error Responses  | Bad Request:\u003cbr\u003e\u0026nbsp;    Code: 400\u003cbr\u003e   \u0026nbsp; Content strings:\u003cbr\u003e\u0026nbsp; \u0026nbsp;  \"Must specify provider to request data be ingested\"\u003cbr\u003e\u0026nbsp; \u0026nbsp;      \"Must specify sample_id to request data be ingested\"\u003cbr\u003e\u0026nbsp; \u0026nbsp;      \"Must specify process to request data be ingested\"\u003cbr\u003e\u0026nbsp; \u0026nbsp;      \"_NAME_ is not a known ingestion process\" \u003cbr\u003eUnauthorized:\u003cbr\u003e\u0026nbsp;    Code: 401\u003cbr\u003e\u0026nbsp;    Content strings:\u003cbr\u003e\u0026nbsp; \u0026nbsp;      \"You are not 
authorized to use this resource\"\u003cbr\u003e Not Found:\u003cbr\u003e\u0026nbsp;    Code: 404\u003cbr\u003e\u0026nbsp;    Content strings:\u003cbr\u003e\u0026nbsp; \u0026nbsp;      \"Resource not found\"\u003cbr\u003e\u0026nbsp; \u0026nbsp;      \"Dag id _DAG_ID_ not found\" \u003cbr\u003eServer Error:\u003cbr\u003e\u0026nbsp;    Code: 500\u003cbr\u003e\u0026nbsp;    Content strings:\u003cbr\u003e\u0026nbsp; \u0026nbsp;      \"An unexpected problem occurred\"\u003cbr\u003e\u0026nbsp; \u0026nbsp;      \"The request happened twice?\"\u003cbr\u003e\u0026nbsp; \u0026nbsp;      \"Attempt to trigger run produced an error: _ERROR_\" |\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhubmapconsortium%2Fingest-pipeline","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhubmapconsortium%2Fingest-pipeline","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhubmapconsortium%2Fingest-pipeline/lists"}