{"id":34701800,"url":"https://github.com/hubmapconsortium/soft-assay-rules","last_synced_at":"2026-05-26T11:35:20.890Z","repository":{"id":213613206,"uuid":"734511684","full_name":"hubmapconsortium/soft-assay-rules","owner":"hubmapconsortium","description":"Rules for \"soft\" assay classification, and tools to generate and test them.","archived":false,"fork":false,"pushed_at":"2025-08-21T20:20:29.000Z","size":327,"stargazers_count":0,"open_issues_count":2,"forks_count":0,"subscribers_count":13,"default_branch":"main","last_synced_at":"2025-08-21T22:46:25.253Z","etag":null,"topics":["ot2od030545"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hubmapconsortium.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2023-12-21T21:54:11.000Z","updated_at":"2025-07-25T16:56:59.000Z","dependencies_parsed_at":"2024-01-22T16:12:16.949Z","dependency_job_id":"646c8f9f-da23-4f37-8328-4692c6140e0b","html_url":"https://github.com/hubmapconsortium/soft-assay-rules","commit_stats":null,"previous_names":["hubmapconsortium/soft-assay-rules"],"tags_count":0,"template":false,"template_full_name":"hubmapconsortium/hubmap-template","purl":"pkg:github/hubmapconsortium/soft-assay-rules","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hubmapconsortium%2Fsoft-assay-rules","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hubmapconsortium%2Fsoft-assay-rules/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hubma
pconsortium%2Fsoft-assay-rules/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hubmapconsortium%2Fsoft-assay-rules/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hubmapconsortium","download_url":"https://codeload.github.com/hubmapconsortium/soft-assay-rules/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hubmapconsortium%2Fsoft-assay-rules/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33519190,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T03:12:49.672Z","status":"ssl_error","status_checked_at":"2026-05-26T03:12:47.976Z","response_time":63,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ot2od030545"],"created_at":"2025-12-24T22:53:02.545Z","updated_at":"2026-05-26T11:35:20.884Z","avatar_url":"https://github.com/hubmapconsortium.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# soft-assay-rules\n\nRules for \"soft\" assay classification, and tools to generate and test them.\n\n## About\n\nBetween the time a dataset is submitted by a data provider and the time it is accessed\nby a potential user, many steps must occur.\n* The provided dataset or upload must be validated as syntactically correct.\n* The data must be \"ingested\", so that its type, location, and properties are 
known to the\nlarger system.\n* The dataset must be processed to make its content useful. For example, image stitching or\nRNA analysis may be required.  The steps required depend on the detailed structure of the\ndata.\n* The data and the results of the analysis must be displayed to the user.  This again\ndepends on the detailed structure of the data and that of the derived data produced by any\nanalysis.\n\nThe *Soft Assay Classifier Rule Engine* is one mechanism by which these relationships are\nmanaged.  A set of rules is applied to a detailed description of the original data format. Rules\nthat match the data are activated, yielding a summary of the properties of the data which can\nbe used by various downstream components to decide how to describe, display, or process the\ndata.  This repo contains the development history of the rule chain, plus tools to generate\nand test the rule chain.  When a new version of the rule chain is ready it is exported to\nanother repo to actually be installed in the rule engine.\n\nOnce installed, the rule chain can be triggered in response to a POST request containing\na metadata.tsv record in JSON form, or in response to a GET request including a uuid or\nHuBMAP/SenNet ID.  In the former case the rule chain is passed only the given JSON\nwith an added pair with key \"sample_is_human\" and a boolean value.  
This POST\nmechanism is used when validating and ingesting new external data.\n\nWhen called with a GET request and uuid or ID, the entity JSON block for the given\nentity is fetched and several values are produced from that metadata if possible,\nincluding:\n* the ingest metadata, if present\n* the entity type, typically 'Dataset' or 'Publication'\n* information from the dag provenance list, or an empty list if it is unavailable\n* data_types information\n* the entity creation action\n* sample_is_human, as inferred from the entity provenance\n\nThese values are used to construct a JSON block which is passed to the rule chain.\n\n\n## Unit Tests\n\nAssuming the python environment specified in `requirements.txt` is in place, unit tests can be\nrun with the `test.sh` script in the top-level directory:\n```\nbash ./test.sh\n```\n\nThe rule chain is tested using examples stored in src/soft_assay_rules/test_examples and making\nuse of cached entity-api output where necessary (see below).  The function source_is_human() is also\ntested against cached entity-api output.\n\n## Running Other Test Routines\n\nThe `src/soft_assay_rules` directory contains two test routines, `rule_tester.py` and `local_rule_tester.py`.\nBoth use the samples in the `test_examples` subdirectory.  local_rule_tester.py uses cached values previously\nfetched from the appropriate services (see the section on cached REST endpoint responses below).\n`rule_tester.py` accesses an ingest-api URL to run tests against a remotely running rule engine,\nand thus requires a live token.  The token is provided through the environment variable `AUTH_TOK`.  Since\noperations in the context of SENNET differ slightly from those in the HUBMAP context, that context must\nalso be provided through `APP_CTX`.  For example,\n```\nenv AUTH_TOK=\u003csome token\u003e APP_CTX=\u003cHUBMAP or SENNET\u003e python rule_tester.py test_examples/*\n```\ntests the remote rule engine against all the samples in the `test_examples` directory.  
If the SENNET\ncontext is specified, examples taken from the HuBMAP side will fail, and vice versa.\n\n`local_rule_tester.py` instantiates a local rule engine and installs the rules found in the\ncurrent `testing_rule_chain.json` file.  It can be used to test new rules.  Because it cannot query\nentity-api when a uuid is specified, it must use cached results from the necessary queries.  (See\nthe section on cached REST endpoint responses below.)  This test routine is invoked\nas follows:\n```\n$ python ./local_rule_tester.py test_examples/*\n```\n\n## Cached REST Endpoint Responses\n\nThe utility routines `cache_responses.py` and `cache_ubkg_responses.py`\ncan be used to prefetch and save the entity-api, ingest-api, and UBKG metadata JSON\nblocks associated with a given uuid, HuBMAP/SenNet ID, or UBKG code.  They are called as follows:\n```\nenv AUTH_TOK=\u003csome token\u003e APP_CTX=\u003cHUBMAP or SENNET\u003e python cache_responses.py uuid1 [uuid2 [uuid3...]]\nenv AUTH_TOK=\u003csome token\u003e APP_CTX=\u003cHUBMAP or SENNET\u003e python cache_ubkg_responses.py ubkg_code\n```\nThe first causes the entity-api JSON content for the uuid and the ingest-api/assayclassifier/metadata JSON\ncontent to be fetched and stored locally. The JSON returned by the deployed version of the rule chain\nis printed, for convenience in setting up new unit tests.  
The second does the same for the UBKG response\nassociated with the given code.\n\nThus a new unit test corresponding to a\nspecific uuid in a specific APP_CTX can be set up by:\n* prefetching and saving the appropriate JSON using `cache_responses.py`\n* prefetching the UBKG response for the ubkg_code used by that output using `cache_ubkg_responses.py`\n* creating a new test case using that uuid, or the ingest metadata for that uuid\n* saving the expected JSON output of the rule chain as the desired test output\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhubmapconsortium%2Fsoft-assay-rules","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhubmapconsortium%2Fsoft-assay-rules","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhubmapconsortium%2Fsoft-assay-rules/lists"}