{"id":27944712,"url":"https://github.com/udst/orca_test","last_synced_at":"2026-03-12T00:33:39.694Z","repository":{"id":57449522,"uuid":"63721527","full_name":"UDST/orca_test","owner":"UDST","description":"Data assertions for the Orca task orchestrator","archived":false,"fork":false,"pushed_at":"2020-05-11T17:56:58.000Z","size":52,"stargazers_count":0,"open_issues_count":15,"forks_count":3,"subscribers_count":10,"default_branch":"master","last_synced_at":"2025-10-26T20:57:20.690Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/UDST.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2016-07-19T19:19:25.000Z","updated_at":"2019-02-03T03:19:30.000Z","dependencies_parsed_at":"2022-09-26T17:31:18.825Z","dependency_job_id":null,"html_url":"https://github.com/UDST/orca_test","commit_stats":null,"previous_names":["urbansim/orca_test"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/UDST/orca_test","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UDST%2Forca_test","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UDST%2Forca_test/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UDST%2Forca_test/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UDST%2Forca_test/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/UDST","download_url":"https://codeload.github.com/UDST/orca_test/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UDST%
2Forca_test/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":27717939,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-12-14T02:00:11.348Z","response_time":56,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-05-07T12:54:08.278Z","updated_at":"2025-12-14T05:07:10.549Z","avatar_url":"https://github.com/UDST.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"Orca_test\n=========\n\n[![Build Status](https://travis-ci.org/UDST/orca_test.svg?branch=master)](https://travis-ci.org/UDST/orca_test)\n\nThis is a library of assertions about the characteristics of tables, columns, and injectables that are registered in [Orca](https://github.com/udst/orca). \n\nThe motivation is that [UrbanSim](https://github.com/udst/urbansim) model code expects particular tables and columns to be in place, and can fail unpredictably when data is not as expected (missing columns, NaNs, negative prices, log-of-zero). These failures are rare, but hard to debug, and can happen at any time because data is modified as models run. \n\nOrca_test assertions can be included in model steps or used as part of the data preparation pipeline. 
The goal for this library is for it to be useful (1) as a model development aid, (2) for exception handling as simulations run, and (3) for documenting the data specs required by different UrbanSim templates. \n\n\n## Installation\n\nClone this repo and run `python setup.py develop`. The library won't be of much use without [Orca](https://github.com/udst/orca) and some project that's using it for simulation orchestration. \n\n\n## Usage\n\nYou can either make assertions directly by calling individual orca_test functions, or assert a full set of characteristics at once. These characteristics are expressed as nested Python classes (similar to SQLAlchemy), and in the future will have an equivalent YAML syntax.\n\nIf an assertion passes, nothing happens. If it fails, an `OrcaAssertionError` is raised with a detailed message. Orca_test is written to be as computationally efficient as possible, and the main cost will be the generation of tables or columns that have not yet been cached. \n\nAssertions are chained as necessary: for example, asserting a column's minimum value will automatically assert that it is numeric, that missing values are coded in a particular way (`np.nan` by default), that the column can be generated without errors, and that it is registered with Orca.\n\n### Example\n\n```python\nimport orca_test as ot\nfrom orca_test import OrcaSpec, TableSpec, ColumnSpec, InjectableSpec\n\n# Define a specification\no_spec = OrcaSpec('my_spec',\n\n\tTableSpec('buildings', \n\t\tColumnSpec('building_id', primary_key=True),\n\t\tColumnSpec('residential_price', min=0, missing=False)),\n\n\tTableSpec('households',\n\t\tColumnSpec('building_id', foreign_key='buildings.building_id', missing_val_coding=-1)),\n\t\n\tTableSpec('residential_units', registered=False),\n\t\n\tInjectableSpec('rate', greater_than=0, less_than=1))\n\n# Assert the specification\not.assert_orca_spec(o_spec)\n```\n\n### Working demos\n- 
[development_tests.py](https://github.com/urbansim/orca_test/blob/master/development_tests.py) in this repo\n- In the `ual-development` branch of `UAL/bayarea_urbansim`, the model steps include `orca_test` assertions to validate expected data characteristics ([ual.py](https://github.com/ual/bayarea_urbansim/blob/ual-development/baus/ual.py))\n\n\n## API Reference\n\nThere's fairly detailed documentation of individual functions in the [source code](https://github.com/urbansim/orca_test/blob/master/orca_test/orca_test.py).\n\n### Classes\n- `OrcaSpec( spec_name, optional TableSpecs, optional InjectableSpecs )`\n- `TableSpec( table_name, optional characteristics, optional ColumnSpecs )`\n- `ColumnSpec( column_name, optional characteristics )`\n- `InjectableSpec( injectable_name, optional characteristics )`\n- `OrcaAssertionError`\n\n### Asserting sets of characteristics\n- `assert_orca_spec( OrcaSpec )` -- asserts the entire nested spec\n- `assert_table_spec( TableSpec )`\n- `assert_column_spec( table_name, ColumnSpec )`\n- `assert_injectable_spec( InjectableSpec )`\n\n### Table assertions\n\n| Argument in TableSpec() | Equivalent low-level function |\n| ------------------ | -------------------------------- |\n| `registered = True` | `assert_table_is_registered( table_name )` |\n| `registered = False` | `assert_table_not_registered( table_name )` |\n| `can_be_generated = True` | `assert_table_can_be_generated( table_name )` |\n\n### Column assertions\n\n| Argument in ColumnSpec() | Equivalent low-level function |\n| ------------------ | --------------------------------- |\n| `registered = True` | `assert_column_is_registered( table_name, column_name )` |\n| `registered = False` | `assert_column_not_registered( table_name, column_name )`  |\n| `can_be_generated = True` | `assert_column_can_be_generated( table_name, column_name )` |\n| `numeric = True` | `assert_column_is_numeric( table_name, column_name )` |\n| `missing_val_coding = np.nan, 0, -1` | 
`assert_column_missing_value_coding( table_name, column_name, missing_val_coding )` |\n| `missing = False`| \u003ccode\u003eassert_column_no_missing_values( table_name, column_name, optional\u0026nbsp;missing_val_coding )\u003c/code\u003e |\n| \u003ccode\u003emax_portion_missing\u0026nbsp;=\u0026nbsp;portion\u003c/code\u003e | `assert_column_max_portion_missing( table_name, column_name, portion, optional missing_val_coding )` |\n| `primary_key = True` | `assert_column_is_primary_key( table_name, column_name )` |\n| `foreign_key = 'parent_table_name.parent_column_name'` | \u003ccode\u003eassert_column_is_foreign_key( table_name, column_name, parent_table_name, parent_column_name, optional\u0026nbsp;missing_val_coding )\u003c/code\u003e |\n| `max = value` | \u003ccode\u003eassert_column_max( table_name, column_name, maximum, optional\u0026nbsp;missing_val_coding)\u003c/code\u003e |\n| `min = value` | \u003ccode\u003eassert_column_min( table_name, column_name, minimum, optional\u0026nbsp;missing_val_coding )\u003c/code\u003e |\n| `is_unique = True` | \u003ccode\u003eassert_column_is_unique( table_name, column_name )\u003c/code\u003e |\n\n#### Notes\n\nProviding a `missing_val_coding` in a `ColumnSpec()` indicates that there should be no `np.nan` values in the column. 
Assertions involving a `min`, `max`, or `max_portion_missing` will take into account the `missing_val_coding` that's been provided.\n\nFor example, asserting that a column with values `[2, 3, 3, -1]` has `min = 0` will fail, but asserting that it has  \n`min = 0, missing_val_coding = -1` will pass.\n\n### Injectable assertions\n\n| Argument in InjectableSpec() | Equivalent low-level function |\n| ---------------------------- | ----------------------------- |\n| `registered = True` | `assert_injectable_is_registered( injectable_name )` |\n| `registered = False` | `assert_injectable_not_registered( injectable_name )`  |\n| `can_be_generated = True` | `assert_injectable_can_be_generated( injectable_name )` |\n| `numeric = True` | `assert_injectable_is_numeric( injectable_name )` |\n| `greater_than = value` | `assert_injectable_greater_than( injectable_name, value )` |\n| `less_than = value` | `assert_injectable_less_than( injectable_name, value )` |\n| `has_key = str` | `assert_injectable_has_key( injectable_name, str )` |\n\n\n## Development wish list\n- Add support for specs expressed in YAML\n- Write unit tests and set up in Travis\n\n\n## Sample YAML syntax (not yet implemented)\n\n```yaml\n- orca_spec:\n  - name: my_spec\n  \n  - table_spec:\n    - name: buildings\n    - column_spec:\n      - name: building_id\n      - primary_key: True\n    - column_spec:\n      - name: residential_price\n      - min: 0\n      - missing: False\n  \n  - table_spec:\n    - name: households\n    - column_spec:\n      - name: building_id\n      - foreign_key: buildings.building_id\n      - missing_val_coding: -1\n  \n  - table_spec:\n    - name: residential_units\n    - registered: False\n    \n  - injectable_spec:\n    - name: rate\n    - greater_than: 0\n    - less_than: 
1\n```\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fudst%2Forca_test","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fudst%2Forca_test","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fudst%2Forca_test/lists"}