{"id":24957246,"url":"https://github.com/geovation/catalyst-ons-geographies","last_synced_at":"2025-03-28T20:42:59.481Z","repository":{"id":271567070,"uuid":"897962247","full_name":"Geovation/catalyst-ons-geographies","owner":"Geovation","description":"Storing and querying ONS geographies within an easy to use library","archived":false,"fork":false,"pushed_at":"2025-01-27T20:57:17.000Z","size":198340,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-01-27T21:27:04.030Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Shell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Geovation.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-03T14:46:18.000Z","updated_at":"2025-01-27T20:57:21.000Z","dependencies_parsed_at":"2025-01-08T15:49:17.498Z","dependency_job_id":null,"html_url":"https://github.com/Geovation/catalyst-ons-geographies","commit_stats":null,"previous_names":["geovation/catalyst-ons-geographies"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Geovation%2Fcatalyst-ons-geographies","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Geovation%2Fcatalyst-ons-geographies/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Geovation%2Fcatalyst-ons-geographies/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Geovation%2Fcatalyst-ons-geographies/manife
sts","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Geovation","download_url":"https://codeload.github.com/Geovation/catalyst-ons-geographies/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246100437,"owners_count":20723467,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-02-03T06:55:09.953Z","updated_at":"2025-03-28T20:42:59.442Z","avatar_url":"https://github.com/Geovation.png","language":"Shell","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ONS Geography database\n\nThis is a repository for storing and querying Office for National Statistics geographies within the geoparquet file format and importing into a DuckDB database.\n\n## Introduction\n\n[DuckDB](https://duckdb.org/) is a fast in-process database system. It is designed for analytical workloads and can be used as a library in other applications. 
We are compiling a number of scripts to import ONS geographies into DuckDB, where they will then be available to embed in applications.\n\nThe goal is a DuckDB database that can be built and updated quickly and used to query ONS geographies.\n\n## Setup\n\n- [DuckDB](https://duckdb.org/) CLI - installed via brew\n- [Planet Labs' GPQ](https://github.com/planetlabs/gpq) - installed via brew\n- [GDAL](https://gdal.org/) - installed via brew\n\n```bash\nbrew install duckdb\nbrew install planetlabs/gpq\nbrew install gdal\n```\n\n### Downloading and processing the data\n\nCensus boundaries and the ONS postcode directory are downloaded from the ONS Geoportal.\n\n```bash\n./download.sh\n```\n\nOnce the downloads have finished, the data is processed to create a number of geoparquet files.\n\n```bash\n./process.sh\n```\n\nThese files are pregenerated and included in this repository, in the `data` directory.\n\n- `lsoas.parquet` - Lower Super Output Areas\n- `msoas.parquet` - Middle Super Output Areas\n- `ons_postcode_directory.parquet` - A selection of columns from the ONS Postcode Directory\n\n### Importing the data into DuckDB\n\nThe geoparquet files can be imported into DuckDB using a shell script.\n\n```bash\n./createpostcodesdb.sh\n```\n\nThis creates a file named `ons_postcodes.duckdb` which can be used to query the data.\n\n## Release\n\nWhen a new release is generated in GitHub for this repository, the DuckDB database file is added to the release page as a build artifact, so there is no need to run the above commands if you simply want the database file.\n\nSee the [releases page](https://github.com/Geovation/catalyst-ons-geographies/releases) for the latest release.\n\n## Usage\n\nWith DuckDB installed, the database can be opened:\n\n```\nduckdb ons_postcodes.duckdb\n```\n\nEach time the database is loaded, run the following command to enable the geospatial functions:\n\n```\nLOAD spatial;\n```\n\n### Querying the database\n\nThe database can be 
queried using SQL.\n\n#### Find a postcode\n\n```sql\nSELECT * FROM vw_postcodes where replace(postcode, ' ', '') = 'BA151DS';\n```\n\nThe results of the above query would be:\n\n| Column Name | Value |\n| --- | --- |\n| postcode | BA15 1DS |\n| date_of_termination | |\n| county_code | E99999999 |\n| county_name | (pseudo) England (UA/MD/LB) |\n| county_electoral_division_code | E99999999 |\n| county_electoral_division_name | |\n| local_authority_district_code | E06000054 |\n| local_authority_district_name | Wiltshire |\n| ward_code | E05013407 |\n| ward_name | Bradford-on-Avon South |\n| easting | 382678 |\n| northing | 160818 |\n| country_code | E92000001 |\n| country_name | England |\n| region_code | E12000009 |\n| region_name | South West |\n| westminster_parliamentary_constituency_code | E14001356 |\n| westminster_parliamentary_constituency_name | Melksham and Devizes |\n| output_area_11_code | E00163467 |\n| lower_super_output_area_11_code | E01032050 |\n| middle_super_output_area_11_code | E02006682 |\n| built_up_area_24_code | E63012462 |\n| built_up_area_name | Bradford-on-Avon |\n| rural_urban_11_code | D1 |\n| rural_urban_11_name | (England/Wales) Rural town and fringe |\n| index_multiple_deprivation_rank | 27325 |\n| output_area_21_code | E00163467 |\n| lower_super_output_area_21_code | E01034532 |\n| middle_super_output_area_21_code | E02006682 |\n| longitude | -2.250094 |\n| latitude | 51.346176 |\n| geometry | POINT (-2.250094 51.346176) |\n\n#### Find a postcode by point\n\nIt is also possible to reverse geocode and find the postcode and associated ONS data for a given point. It's important to note that as the ONS postcode lookup is best fit, the results may not be 100% accurate for the given point. 
The following query uses the date_of_termination field to filter out postcodes that are no longer in use.\n\n```sql\nSELECT\n  st_distance(ST_Point(-2.250, 51.346), geometry) as distance,\n  *\nFROM vw_postcodes\nWHERE ST_Within(geometry, ST_Buffer(ST_Point(-2.250, 51.346), 0.01))\nAND date_of_termination IS NULL\nORDER BY distance ASC LIMIT 1;\n```\n\n#### Find multiple postcodes\n\nYou can also find multiple postcodes by using the `IN` clause.\n\n```sql\nSELECT * FROM vw_postcodes where replace(postcode, ' ', '') IN ('BA151DS', 'BA151DT');\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgeovation%2Fcatalyst-ons-geographies","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgeovation%2Fcatalyst-ons-geographies","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgeovation%2Fcatalyst-ons-geographies/lists"}