{"id":44079440,"url":"https://github.com/pacificclimate/modelmeta","last_synced_at":"2026-02-08T08:34:37.612Z","repository":{"id":16862397,"uuid":"19622573","full_name":"pacificclimate/modelmeta","owner":"pacificclimate","description":"An ORM representation of the model metadata database","archived":false,"fork":false,"pushed_at":"2025-06-11T22:41:39.000Z","size":4056,"stargazers_count":1,"open_issues_count":41,"forks_count":0,"subscribers_count":6,"default_branch":"master","last_synced_at":"2025-06-11T22:46:13.025Z","etag":null,"topics":["actions","make","pipenv","pypi"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pacificclimate.png","metadata":{"files":{"readme":"README.md","changelog":"NEWS.md","contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2014-05-09T19:18:14.000Z","updated_at":"2025-06-11T22:41:41.000Z","dependencies_parsed_at":"2025-04-30T23:22:47.596Z","dependency_job_id":"db1bd2d2-20bd-4525-a28a-bc5becf32d54","html_url":"https://github.com/pacificclimate/modelmeta","commit_stats":null,"previous_names":[],"tags_count":14,"template":false,"template_full_name":null,"purl":"pkg:github/pacificclimate/modelmeta","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pacificclimate%2Fmodelmeta","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pacificclimate%2Fmodelmeta/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pacificclimate%2Fmodelmeta/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pacifi
cclimate%2Fmodelmeta/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pacificclimate","download_url":"https://codeload.github.com/pacificclimate/modelmeta/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pacificclimate%2Fmodelmeta/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29225478,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-08T06:05:31.539Z","status":"ssl_error","status_checked_at":"2026-02-08T05:58:33.853Z","response_time":57,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["actions","make","pipenv","pypi"],"created_at":"2026-02-08T08:34:36.300Z","updated_at":"2026-02-08T08:34:37.606Z","avatar_url":"https://github.com/pacificclimate.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# modelmeta\n\n[![image](https://github.com/pacificclimate/modelmeta/workflows/Python%20CI/badge.svg)](https://github.com/pacificclimate/modelmeta)\n\n[![image](https://github.com/pacificclimate/modelmeta/workflows/Pypi%20Publishing/badge.svg)](https://github.com/pacificclimate/modelmeta)\n\n[![Code Climate](https://codeclimate.com/github/pacificclimate/modelmeta/badges/gpa.svg)](https://codeclimate.com/github/pacificclimate/modelmeta)\n\n## Overview\n\n`modelmeta` is a Python 
package that provides an [Object Relational\nMapping (ORM)](http://en.wikipedia.org/wiki/Object-relational_mapping)\nlayer for accessing the [Pacific Climate Impacts Consortium\n(PCIC)](http://www.pacificclimate.org/)\\'s database of [coverage\ndata](http://en.wikipedia.org/wiki/Coverage_data) metadata. The package\nprovides model classes for each of the tables in the database.\n\nWith this package, one can recreate the database schema in\n[PostgreSQL](http://www.postgresql.org) or\n[SQLite](http://www.sqlite.org) and/or use the package as an object\nmapper for programmatic database access.\n\n`modelmeta` uses [SQLAlchemy](http://www.sqlalchemy.org) to provide the\nORM layer, and [Alembic](http://alembic.zzzcomputing.com/en/latest/) to\nmanage database creation and migration (see the section on managing\ndatabase migrations below).\n\nThe intent of the database itself is to separate the small, inexpensive,\nstructured metadata and attribute information (stored in the database)\nfrom the expensive-to-access bulk spatiotemporal data (stored on disk in\nmultidimensional files). It provides an efficiently searchable index of\nthe bulk data files, and separates storage from indexing.\n\n## Installation\n\nInstallation of this package, including system dependencies, is automated through either `make` or `poethepoet` for Ubuntu 24.04:\n\n    $ make #option 1\n    $ poe setup #option 2\n\nIf you wish to install `modelmeta` manually, follow the steps below.\n\n1.  Clone the repository:\n\n        $ git clone https://github.com/pacificclimate/modelmeta\n\n2.  
Create and activate a virtual environment:\n\n        $ cd modelmeta\n        $ python3 -m venv venv\n        $ source venv/bin/activate\n        $ pip install poetry\n        $ poetry install # --with=test for development and testing\n\n## Scripts to populate a PCIC modelmeta database\n\nThis repository contains two convenience scripts that add data files to\nan existing modelmeta database so that our websites can access data from\nthem. They are installed when the package is installed.\n\n### Indexing new files with index_netcdf\n\n`index_netcdf` adds one or more netCDF climate data files to a PCIC\nmodelmeta-format database:\n\n    index_netcdf -d postgresql://username:password@monsoon.pcic.uvic.ca/database /path/to/files/*.nc\n\nUsernames and passwords can be found in Team Password Manager. To add\nfiles to the data portal, use database `pcic_meta`; to add files to PCEX\nor Plan2adapt, use database `ce_meta_12f290b63791`.\n\nTo determine the metadata of a file, the `index_netcdf`\nscript scans its netCDF attributes. If the file does not have all the\n[required\nattributes](https://pcic.uvic.ca/confluence/display/CSG/PCIC+metadata+standard+for+downscaled+data+and+hydrology+modelling+data)\nspecified, the `index_netcdf` script will be unable to proceed. You\ncan look at a file\\'s attributes with the command:\n\n    ncdump -h file.nc\n\nand update attributes using the `update_metadata` script in the\n[climate-explorer-data-prep](https://github.com/pacificclimate/climate-explorer-data-prep)\nrepository. 
If you update file attributes, log your update YAML and a\nlist of the files you used it with in the\n[data-prep-actions](https://github.com/pacificclimate/data-prep-actions)\nrepository, in case you need to reprocess or check the files later.\n\n### Making files accessible to PCIC projects with associate_ensemble\n\nOnce files have been indexed into the database, they need to be added to\nindividual ensembles; each ensemble is associated with a particular\nproject or website and contains all data files needed to support the\nfunctioning of that component. In most cases, a file will be added to\nmore than one ensemble:\n\n    associate_ensemble -n ensemble_name -v 1 -d postgresql://username:password@db.pcic.uvic.ca/database /path/to/files/*.nc\n\n**Available ensembles, or where should I put this data anyway?**\n\nMost ensembles represent groupings of related files that users can\ninteract with (view maps, download data, create graphs, etc.) using a\nspecific PCIC tool. Plan2adapt, the data portal, SCIP, and PCEX all use\nensembles in this way.\n\nPlan2adapt uses a single ensemble which represents a list of all the\nfiles a user can view. The name of this ensemble is set when plan2adapt\nis deployed, as the environment variable `REACT_APP_ENSEMBLE_NAME`. You\ncan see the environment variables for a docker container running\nplan2adapt with `docker exec container_name env`.\n\nThe data portal uses a separate ensemble for each portal, which\nrepresents a list of the data files a user can download from that\nportal. Each portal\\'s ensemble is hard-coded in that portal\\'s\ndefinition file, in\n[pdp/portals](https://github.com/pacificclimate/pdp/tree/master/pdp/portals).\n\nPCEX is flexible about which ensembles it uses. A PCEX URL encodes both\na UI and an ensemble which specifies which data is to be viewed with\nthat UI. 
In theory you can look at any ensemble with any UI, but in\npractice, UIs make assumptions about the type of data available and most\ncombinations won\\'t work. In most cases, users access the various PCEX\nUI pages via the [link bar at the top of the\npage](https://github.com/pacificclimate/climate-explorer-frontend/blob/master/src/components/DataTool.js).\nThe `navSubpath` variable has the UI before the slash and the ensemble\nafter it. PCEX UIs that display hydrological data for a watershed also\nhave an additional ensemble that contains files that describe the\ngeography of the watershed; this data cannot be directly viewed by the\nuser but is required for some calculations. A list of these geographic\nensembles can be\n[found](https://github.com/pacificclimate/climate-explorer-frontend/blob/master/src/data-services/ce-backend.js)\nin `getWatershedGeographyName()`.\n\nSCIP is currently hardcoded to use an ensemble named\n\\\"scip_fraser_bccoast\\\".\n\nThere are also two special ensembles used by PCIC internal tools, not\nweb portals.\n\n-   The `all_files` ensemble in the `ce_meta_12f290b63791` database\n    contains every file in the database, with the exception of\n    time-invariant files. It is used with various scripts that\n    [test](https://github.com/pacificclimate/data-prep-actions/blob/master/actions/test-ncwms-instance/DESCRIPTION.md)\n    new functionality across all files.\n-   The `p2a_rules` ensemble in the `ce_meta_12f290b63791` database\n    contains all the information needed by plan2adapt\\'s rules engine;\n    it is used by the\n    [scripts](https://github.com/pacificclimate/data-prep-actions/blob/master/actions/precalculate-p2a-regions/DESCRIPTION.md)\n    which pre-generate rules engine results for plan2adapt, which are\n    too slow to process in real time.\n\n### Deleting files from the databases\n\nUnfortunately, we don\\'t currently have a script that can delete files\nfrom the databases. 
If you accidentally index a file with bad metadata\nand need to get rid of it, at present the only way is to log on to the\ndatabase directly with `psql` or `pgadmin`.\n\n## What is climate coverage data?\n\nClimate coverage data (or \\\"raster data\\\" or \\\"spatiotemporal data\\\")\nconsist of large data fields, typically over two or three dimensions in\nspace plus a time dimension. Depending on the resolution along each axis,\nthe data can be quite large. There are usually\nseveral-to-many output quantities (e.g. temperature, precipitation, wind\nspeed/direction), and often there are multiple scenarios, multiple\nmodel implementations, and multiple runs of each model, further\nincreasing the size of the data.\n\n## Managing database migrations\n\n### Introduction\n\nModifications to `modelmeta`\\'s schema definition are now managed using\n[Alembic](https://alembic.sqlalchemy.org/en/latest/), a database migration tool based on SQLAlchemy.\n\nIn short, Alembic supports and disciplines two processes of database\nschema change:\n\n-   Creation of database migration scripts (Python programs) that modify\n    the schema of a database.\n-   Application of migrations to specific database instances.\n    -   In particular, Alembic can be used to *create* a new instance of\n        a `modelmeta` database by migrating an empty database to the\n        current state. 
This is described in detail below.\n\nFor more information, see the [Alembic\ntutorial](http://alembic.zzzcomputing.com/en/latest/tutorial.html).\n\n### History\n\nThe existing instance of a `modelmeta` database (`monsoon/pcic_meta`)\nwas created prior to the adoption of Alembic, and therefore the timeline\nfor Alembic database migrations is slightly confusing.\n\nTimeline:\n\n-   *the distant past*: `pcic_meta` is created by mysterious primeval\n    processes.\n-   *somewhat later*: `modelmeta` is defined using SQLAlchemy, mapping\n    most (but not all) features of the existing `pcic_meta` database\n    into an ORM.\n-   2017-07-18:\n    -   Alembic is introduced.\n    -   Alembic is used to create migration `614911daf883` that adds\n        item `seasonal` to `timescale` Enum.\n-   2017-08-01:\n    -   The SQLAlchemy ORM is updated to reflect all features of the\n        `pcic_meta` database. This mainly involves adding some missing\n        indexes and constraints.\n    -   Alembic is used to create a logically-previous migration\n        `7847aa3c1b39` that creates the initial database schema from an\n        empty database.\n    -   The add-seasonal migration is modified to logically follow the\n        initial-create migration.\n\n#### Creating a new database\n\n##### For a Postgres database\n\nA Postgres database is somewhat more elaborate to set up, but it is also\nthe foundation of a production database, not least because we use\nPostGIS.\n\nInstructions:\n\n1.  Choose a name for your new database/schema, e.g., `ce_meta`.\n\n2.  On the server of your choice (e.g., `monsoon`):\n\n    **Note**: These operations must be performed with high-level\n    permissions. See the System Administrator to have these done or\n    obtain permissions.\n\n    For a record of such a creation, see [Redmine Issue\n    696](https://redmine.pacificclimate.org/issues/696). Permission\n    setup was more complicated than anticipated.\n\n    a.  
Create a new database with the chosen name, e.g., `ce_meta`.\n\n    b.  Within that database, create a new schema with the chosen name,\n        e.g., `ce_meta`.\n\n    c.  Create new users, with the following permissions:\n\n        -   `ce_meta` (database owner): full permissions for table\n            creation and read-write permissions in schemas `ce_meta` and\n            `public`\n        -   `ce_meta_rw` (database writer): read-write permissions in\n            schemas `ce_meta` and `public`\n        -   `ce_meta_ro` (database reader): read-only permissions in\n            schemas `ce_meta` and `public`\n\n        and for each of them\n\n        -   `search_path = ce_meta,public`\n\n    d.  [Enable PostGIS in the new\n        database](http://postgis.net/install/).\n\n        -   `CREATE EXTENSION postgis;`\n        -   This creates the table `spatial_ref_sys` in schema `public`.\n            Check that it exists.\n\n3.  Add a DSN for your new database, including the appropriate user\n    name, to `alembic.ini`. For example:\n\n        [prod_ce_meta]\n        sqlalchemy.url = postgresql://ce_meta@monsoon.pcic.uvic.ca/ce_meta\n\n4.  Create your new database with Alembic by upgrading the empty database\n    to `head`:\n\n        alembic -x db=prod_ce_meta upgrade head\n\n5.  Have a beer.\n\n##### For a SQLite database\n\nA SQLite database is very simple to set up, but is normally used only\nfor testing.\n\n1.  Add a DSN for your new database to `alembic.ini`. This database need\n    not exist yet (although the directory containing it must). For example:\n\n        [my_test_database]\n        sqlalchemy.url = sqlite:///path/to/test.sqlite\n\n2.  Create your new database with Alembic by upgrading the non-existent\n    database to `head`:\n\n        alembic -x db=my_test_database upgrade head\n\n3.  Have a beer. 
Or at least a soda.\n\n### Updating the existing `pcic_meta` database\n\n**DEPRECATED**: [Decision taken not to modify\npcic_meta](https://pcic.uvic.ca/confluence/display/CSG/pcic_meta%3A+Current+contents+and+update+plan+2017-Jul).\nThis content is retained in case that decision is revised in future.\n\nThis section is only of interest to PCIC.\n\n#### Initialization\n\nStatus: NOT DONE\n\nThe following things need to be done ONCE in order to bring `pcic_meta`\nunder management by Alembic.\n\n1.  The table `pcic_meta.alembic_version` has already been created in\n    `pcic_meta` by earlier operations. Its content is currently `null`.\n2.  Place the value `7847aa3c1b39` in the single row and column of table\n    `pcic_meta.alembic_version` in `pcic_meta`.\n    -   This fakes the migration from an empty database to its nominal\n        initial state (before add-seasonal migration).\n\n#### Ongoing migrations\n\nOnce the initialization steps have been completed, ongoing migrations\nare simple and standard:\n\n1.  Apply later migrations: `alembic -x db=prod_pcic_meta upgrade head`\n    -   At the time of this writing (2017-08-01), that would be\n        migration `614911daf883`.\n\n### Creating a new database migration\n\nFirst create a new revision with this command:\n\n```\nalembic revision -m \"\u003cbrief description of changes to the database\u003e\"\n```\n\nAlembic will create a new file in `alembic/versions` with stub `upgrade()` and `downgrade()`\nfunctions. Fill in the upgrade function to make the change you want the migration to create\n(e.g., drop a table and recreate it with different column names) and the downgrade function to return\nto the current database configuration from the new one (e.g., drop the table and recreate it with\nthe old column names). 
The `alembic.op` object has various useful database manipulation functions\nfor this.\n\nYou can migrate a database to the new revision with the `alembic upgrade head` command discussed\nearlier.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpacificclimate%2Fmodelmeta","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpacificclimate%2Fmodelmeta","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpacificclimate%2Fmodelmeta/lists"}