https://github.com/agrc/dhhs-cooling-towers
Tools to extract cooling tower locations from aerial imagery
- Host: GitHub
- URL: https://github.com/agrc/dhhs-cooling-towers
- Owner: agrc
- License: mit
- Created: 2023-03-24T00:59:12.000Z (over 2 years ago)
- Default Branch: dev
- Last Pushed: 2025-01-07T01:27:16.000Z (11 months ago)
- Last Synced: 2025-01-18T06:26:30.237Z (10 months ago)
- Topics: aerial-imagery, artificial-intelligence, computer-vision, cooling-towers, gis, government, government-app, image-processing, machine-learning, utah
- Language: Python
- Homepage:
- Size: 339 KB
- Stars: 0
- Watchers: 5
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
# dhhs-cooling-towers
Tools to extract cooling tower locations from aerial imagery
## Prerequisites
1. Create WMTS index
- Run `prerequisites\polars_build_offet_index.py`
- This creates a parquet file (`imagery_index.parquet`) that will be loaded into BigQuery by terraform
1. Create the processing footprint
- This was done manually in ArcGIS Pro with the following steps (a scripted GeoPandas sketch of the same sequence appears after this list):
1. Buffer [Utah Census Places 2020](https://opendata.gis.utah.gov/datasets/utah-census-places-2020/explore) by 800m
1. Query [Utah Buildings](https://opendata.gis.utah.gov/datasets/utah-buildings/explore) down to those larger than 464.5 sq m (5000 sq ft)
1. Select by Location the queried buildings that are more than 800m from the census places buffer
1. Export selected buildings to a new layer
1. Buffer the new buildings layer by 800m
1. Combine the buffered census places and buffered buildings into a single polygon layer
1. Simplify the combined polygon layer to remove vertices
1. Project the simplified polygon layer to WGS84 (EPSG: 4326)
1. Export the projected polygon layer to shapefile (processing_footprint.shp)
1. Convert the processing footprint from shapefile to CSV with geometries represented as GeoJSON using GDAL
- Use the process outlined in this [Blog Post](https://medium.com/google-cloud/how-to-load-geographic-data-like-zipcode-boundaries-into-bigquery-25e4be4391c8) about loading geographic data into BigQuery
- `ogr2ogr -f csv -dialect sqlite -sql "select AsGeoJSON(geometry) AS geom, * from processing_footprint" footprints_in_4326.csv processing_footprint.shp`
1. The `footprints_in_4326.csv` file will be loaded into BigQuery by terraform
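The footprint steps above were done by hand in ArcGIS Pro, but the same sequence can be scripted. Below is a minimal, illustrative GeoPandas sketch of the equivalent workflow; the layer filenames, the UTM Zone 12N working CRS, and the 10 m simplify tolerance are assumptions rather than values from this repository.

```python
import geopandas as gpd

BUFFER_M = 800
MIN_AREA_SQ_M = 464.5  # 5,000 sq ft

# Hypothetical local copies of the source layers, reprojected to a meter-based CRS (UTM 12N).
places = gpd.read_file("utah_census_places_2020.shp").to_crs(epsg=26912)
buildings = gpd.read_file("utah_buildings.shp").to_crs(epsg=26912)

# Buffer the census places by 800 m (union_all requires GeoPandas >= 1.0;
# older versions can use .unary_union instead).
places_buffer = places.buffer(BUFFER_M).union_all()

# Keep buildings larger than ~5,000 sq ft that sit more than 800 m from the places buffer.
big = buildings[buildings.geometry.area > MIN_AREA_SQ_M]
remote = big[big.distance(places_buffer) > BUFFER_M]

# Buffer those buildings, merge everything into one footprint, and simplify the vertices.
footprint = places_buffer.union(remote.buffer(BUFFER_M).union_all()).simplify(10)

# Project to WGS84 and export to shapefile for the ogr2ogr step above.
gpd.GeoDataFrame(geometry=[footprint], crs="EPSG:26912").to_crs(epsg=4326).to_file(
    "processing_footprint.shp"
)
```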
## Data preparation
1. Run the tower scout terraform
- This is a private GitHub repository
1. Execute the two data transfers in order
1. Execute the two scheduled queries in order
1. Export `{PROJECT_ID}.indices.images_within_habitat` to GCS
_there is a terraform item for this but I don't know how it will work since the data transfers are manual and the table may not exist_
- GCS Location: `{PROJECT_ID}.images_within_habitat.csv`
- Export format: `CSV`
- Compression: `None`
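For reference, a hedged `bq extract` equivalent of that export; the destination bucket is an assumption based on the GCS location above, so replace `{PROJECT_ID}` and the bucket with your own values:

```sh
# Export the BigQuery table to a CSV object in GCS (bucket name is hypothetical).
bq extract \
  --destination_format=CSV \
  --compression=NONE \
  "{PROJECT_ID}:indices.images_within_habitat" \
  "gs://{PROJECT_ID}/images_within_habitat.csv"
```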
1. Connect to the database using the Cloud SQL Auth Proxy and run the statements below
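A minimal sketch of opening that connection with the v2 `cloud-sql-proxy` binary; the instance connection name, database, and user are placeholders:

```sh
# Start the proxy against the Cloud SQL instance (connection name is hypothetical).
cloud-sql-proxy --port 5432 ut-dts-agrc-dhhs-towers-dev:us-central1:towers &

# Connect through the local tunnel with psql.
psql "host=127.0.0.1 port=5432 user=postgres dbname=postgres"
```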
1. Create a cloud sql table for the task tracking
```sql
CREATE TABLE public.images_within_habitat (
row_num int NULL,
col_num int NULL,
processed bool NULL DEFAULT false
);
CREATE UNIQUE INDEX idx_images_within_habitat_all ON public.images_within_habitat USING btree (row_num, col_num, processed);
```
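Once the table is loaded, a quick sanity check on remaining work can be run against it; a hypothetical example:

```sql
-- How many rows are still waiting to be processed?
SELECT count(*) FROM public.images_within_habitat WHERE processed = false;
```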
1. Create a cloud sql table for the results
```sql
CREATE TABLE public.cooling_tower_results (
envelope_x_min decimal NULL,
envelope_y_min decimal NULL,
envelope_x_max decimal NULL,
envelope_y_max decimal NULL,
confidence decimal NULL,
object_class int NULL,
object_name varchar NULL,
centroid_x_px decimal NULL,
centroid_y_px decimal NULL,
centroid_x_3857 decimal NULL,
centroid_y_3857 decimal NULL
);
```
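After a run, results can be filtered directly in SQL; an illustrative query (the confidence threshold is an assumption, not a project setting):

```sql
-- Pull detections above an arbitrary confidence threshold.
SELECT centroid_x_3857, centroid_y_3857, confidence, object_name
FROM public.cooling_tower_results
WHERE confidence >= 0.6
ORDER BY confidence DESC;
```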
1. Grant access to users
```sql
GRANT pg_read_all_data TO "cloud-run-sa@ut-dts-agrc-dhhs-towers-dev.iam";
GRANT pg_write_all_data TO "cloud-run-sa@ut-dts-agrc-dhhs-towers-dev.iam";
```
1. Import the CSV into the `images_within_habitat` table
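One way to do that import is with `gcloud sql import csv`; the instance, bucket, and database names below are placeholders, and the same load can be done with a psql `\copy` through the proxy:

```sh
# Load the exported CSV into the tracking table (names are hypothetical).
gcloud sql import csv towers \
  gs://ut-dts-agrc-dhhs-towers-dev/images_within_habitat.csv \
  --database=postgres \
  --table=images_within_habitat \
  --columns=row_num,col_num
```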
## To work with the CLI locally
1. Download the PyTorch model weights file and place it in the `tower_scout` directory
- Add URL
1. Clone the YOLOv5 repository from the parent directory
```sh
git clone https://github.com/ultralytics/yolov5
```
1. Create a virtual environment from the parent directory with Python 3.10
```sh
python -m venv .env
.env\Scripts\activate.bat
pip install -r requirements.dev.txt
```
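The activation line above is Windows-specific; on macOS or Linux the equivalent would be:

```sh
# Assumes a python3.10 interpreter is on PATH.
python3.10 -m venv .env
source .env/bin/activate
pip install -r requirements.dev.txt
```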
## CLI
To work with the CLI,
1. Create a Python environment and install `requirements.dev.txt` into it
1. Execute the CLI to see the commands and options available
- `python cool_cli.py`
## Testing
## Cloud Run Job
To test a small amount of data
1. Set the number of tasks to 1
1. Set the environment variables
- `SKIP`: int e.g. 1106600
- `TAKE`: int e.g. 50
- `JOB_NAME`: string e.g. alligator
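The task count and environment variables can be set in the console or with `gcloud`; a hedged sketch, where the job name and region are placeholders:

```sh
# Configure a single-task run over a small slice of the index (job name is hypothetical).
gcloud run jobs update cooling-towers \
  --region=us-central1 \
  --tasks=1 \
  --update-env-vars=SKIP=1106600,TAKE=50,JOB_NAME=alligator

gcloud run jobs execute cooling-towers --region=us-central1
```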
To run a batch job
1. Set the number of tasks to your desired value e.g. 10000
1. Set the concurrency to your desired value e.g. 35
1. Set the environment variables
- `JOB_NAME`: string e.g. alligator
- `JOB_SIZE`: int e.g. 50 (this value needs to be processable within the timeout)
Our metrics show that we can process 10 jobs a minute. The default Cloud Run timeout is 10 minutes.
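At that rate a `JOB_SIZE` of 50 should finish in roughly 5 minutes, comfortably inside the default timeout. A similar hedged `gcloud` sketch for the batch configuration (job name and region are again placeholders):

```sh
# Configure and launch the full batch run (job name is hypothetical).
gcloud run jobs update cooling-towers \
  --region=us-central1 \
  --tasks=10000 \
  --parallelism=35 \
  --task-timeout=10m \
  --update-env-vars=JOB_NAME=alligator,JOB_SIZE=50

gcloud run jobs execute cooling-towers --region=us-central1
```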
## References for Identifying Cooling Towers in Aerial Imagery
- [CDC Procedures for Identifying Cooling Towers](https://www.cdc.gov/legionella/health-depts/environmental-inv-resources/id-cooling-towers.html)
- [CDC Photos of Cooling Towers](https://www.cdc.gov/legionella/health-depts/environmental-inv-resources/cooling-tower-images.html)
- [California Department of Public Health: Cooling Tower Identification 101](https://www.cdph.ca.gov/Programs/CID/DCDC/CDPH%20Document%20Library/CoolingTowerIDGuideProtocol.pdf)
- [California Department of Public Health: Cooling Tower Identification Guide Supplement](https://www.cdph.ca.gov/Programs/CID/DCDC/CDPH%20Document%20Library/CTIDGuideSupplement.pdf)