Tools to extract cooling tower locations from aerial imagery
- Create WMTS index
  - Run `prerequisites\polars_build_offet_index.py`
    - This creates a parquet file (`imagery_index.parquet`) that will be loaded into BigQuery by terraform (see the inspection sketch below)
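If you want to sanity-check the index before terraform loads it, a quick look with polars (the same library the prerequisite script uses) is enough. This is only an inspection sketch; it assumes nothing about the columns the script produces and simply prints whatever is there:

```python
# inspect the WMTS index parquet before it is loaded into BigQuery
import polars as pl

df = pl.read_parquet("imagery_index.parquet")

print(df.shape)   # (rows, columns) - how many tiles were indexed
print(df.schema)  # column names and dtypes produced by the prerequisite script
print(df.head())  # a few sample rows
```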
- Create the processing footprint
  - This was done manually in ArcGIS Pro with the following steps:
    - Buffer Utah Census Places 2020 by 800m
    - Query Utah Buildings down to those larger than 464.5 sq m (5000 sq ft)
    - Select by Location the queried buildings that are more than 800m from the census places buffer
    - Export selected buildings to a new layer
    - Buffer the new buildings layer by 800m
    - Combine the buffered census places and buffered buildings into a single polygon layer
    - Simplify the combined polygon layer to remove vertices
    - Project the simplified polygon layer to WGS84 (EPSG:4326)
    - Export the projected polygon layer to shapefile (`processing_footprint.shp`)
- Convert the processing footprint from shapefile to CSV with geometries represented as GeoJSON using GDAL
  - Use the process outlined in this blog post about loading geographic data into BigQuery

    ```sh
    ogr2ogr -f csv -dialect sqlite -sql "select AsGeoJSON(geometry) AS geom, * from processing_footprint" footprints_in_4326.csv processing_footprint.shp
    ```

  - The `footprints_in_4326.csv` file will be loaded into BigQuery by terraform (see the validation sketch below)
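Before handing `footprints_in_4326.csv` to terraform, it can be worth confirming that the `geom` column (the `AsGeoJSON(geometry) AS geom` alias from the ogr2ogr command above) parses as GeoJSON. A minimal sketch using only the standard library:

```python
# spot-check the GeoJSON geometries written by ogr2ogr
import csv
import json

csv.field_size_limit(10_000_000)  # polygon geometries can exceed the default field limit

with open("footprints_in_4326.csv", newline="") as handle:
    rows = list(csv.DictReader(handle))

print(f"{len(rows)} footprint polygons")

for row in rows[:5]:
    geometry = json.loads(row["geom"])
    # EPSG:4326 coordinates, so values should look like longitude/latitude pairs
    print(geometry["type"], str(geometry["coordinates"])[:80])
```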
- Run the tower scout terraform
  - This is a private GitHub repository
- Execute the two data transfers in order
- Execute the two scheduled queries in order (both steps can also be triggered programmatically; see the sketch below)
- Export `{PROJECT_ID}.indices.images_within_habitat` to GCS (see the sketch after this list)
  - There is a terraform item for this, but I don't know how it will work since the data transfers are manual and the table may not exist
  - GCS Location: `{PROJECT_ID}.images_within_habitat.csv`
  - Export format: `CSV`
  - Compression: `None`
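If the export has to be done by hand, a BigQuery extract job does the same thing as the console's export-to-GCS option. A minimal sketch, assuming a bucket named after the project (the project id and bucket/object names here are placeholders):

```python
# export the images_within_habitat table to a CSV file in GCS (sketch)
from google.cloud import bigquery

PROJECT_ID = "your-project-id"  # assumption: substitute the real project id

client = bigquery.Client(project=PROJECT_ID)

job_config = bigquery.ExtractJobConfig()
job_config.destination_format = bigquery.DestinationFormat.CSV  # CSV, no compression

extract_job = client.extract_table(
    f"{PROJECT_ID}.indices.images_within_habitat",
    f"gs://{PROJECT_ID}/images_within_habitat.csv",  # assumption: bucket named after the project
    job_config=job_config,
)
extract_job.result()  # wait for the export to finish
```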
- Using the cloud sql proxy
  - Create a cloud sql table for the task tracking

    ```sql
    CREATE TABLE public.images_within_habitat (
      row_num int NULL,
      col_num int NULL,
      processed bool NULL DEFAULT false
    );

    CREATE UNIQUE INDEX idx_images_within_habitat_all ON public.images_within_habitat USING btree (row_num, col_num, processed);
    ```

  - Create a cloud sql table for the results

    ```sql
    CREATE TABLE public.cooling_tower_results (
      envelope_x_min decimal NULL,
      envelope_y_min decimal NULL,
      envelope_x_max decimal NULL,
      envelope_y_max decimal NULL,
      confidence decimal NULL,
      object_class int NULL,
      object_name varchar NULL,
      centroid_x_px decimal NULL,
      centroid_y_px decimal NULL,
      centroid_x_3857 decimal NULL,
      centroid_y_3857 decimal NULL
    );
    ```

  - Grant access to users

    ```sql
    GRANT pg_read_all_data TO "cloud-run-sa@ut-dts-agrc-dhhs-towers-dev.iam";
    GRANT pg_write_all_data TO "cloud-run-sa@ut-dts-agrc-dhhs-towers-dev.iam";
    ```

  - Import the CSV into the `images_within_habitat` table (see the sketch after this list)
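With the Cloud SQL Auth Proxy running locally (it listens on `127.0.0.1:5432` by default for a Postgres instance), the exported CSV can be loaded with a plain `COPY ... FROM STDIN`. This is a sketch only: the database name and credentials are placeholders, and it assumes the exported CSV has a header row with `row_num` and `col_num` columns.

```python
# load the exported CSV into the images_within_habitat task-tracking table (sketch)
import psycopg2

connection = psycopg2.connect(
    host="127.0.0.1",  # the cloud sql proxy listening locally
    port=5432,
    dbname="postgres",      # assumption
    user="postgres",        # assumption
    password="<password>",  # assumption
)

with connection, connection.cursor() as cursor, open("images_within_habitat.csv") as csv_file:
    cursor.copy_expert(
        # assumption: the export has a header row and row_num/col_num columns
        "COPY public.images_within_habitat (row_num, col_num) FROM STDIN WITH (FORMAT csv, HEADER true)",
        csv_file,
    )

connection.close()
```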
- Download the PyTorch model weights file and place it in the `tower_scout` directory (a load check is sketched after the environment setup steps below)
  - Add URL
- Clone the YOLOv5 repository from the parent directory

  ```sh
  git clone https://github.com/ultralytics/yolov5
  ```
- Create a virtual environment from the parent directory with Python 3.10

  ```sh
  python -m venv .env
  .env\Scripts\activate.bat
  pip install -r requirements.dev.txt
  ```
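Once the virtual environment is active and the YOLOv5 repository is cloned, it may be worth confirming that the downloaded weights load. The weights filename below is a placeholder (the real URL/filename is not documented here), and this assumes PyTorch is installed via `requirements.dev.txt`:

```python
# sanity-check that the YOLOv5 weights load from the local clone (sketch)
import torch

model = torch.hub.load(
    "yolov5",                       # path to the local yolov5 clone (adjust relative to where you run this)
    "custom",
    path="tower_scout/towers.pt",   # assumption: placeholder name for the downloaded weights file
    source="local",
)

print(model.names)  # class names baked into the weights
```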
To work with the CLI

- Create a python environment and install `requirements.dev.txt` into that environment
- Execute the CLI to see the commands and options available

  ```sh
  python cool_cli.py
  ```
To test a small amount of data

- Set the number of tasks to 1
- Set the environment variables (see the sketch below for how these are typically consumed)
  - `SKIP`: int, e.g. `1106600`
  - `TAKE`: int, e.g. `50`
  - `JOB_NAME`: string, e.g. `alligator`
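How the job consumes these variables is defined by the job code itself; the sketch below is illustrative only, treating `SKIP`/`TAKE` as an offset/limit into the task list so a small window of rows can be processed for a test run. Every name in it is an assumption.

```python
# illustrative only: how SKIP/TAKE/JOB_NAME style env vars slice a small test run
import os

skip = int(os.environ.get("SKIP", 0))          # e.g. 1106600 - rows to jump over
take = int(os.environ.get("TAKE", 50))         # e.g. 50 - rows to process in this run
job_name = os.environ.get("JOB_NAME", "test")  # e.g. alligator - labels the run

all_tasks = list(range(2_000_000))  # stand-in for the images_within_habitat rows
test_batch = all_tasks[skip : skip + take]

print(f"job {job_name}: processing {len(test_batch)} tasks starting at index {skip}")
```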
To run a batch job

- Set the number of tasks to your desired value, e.g. `10000`
- Set the concurrency to your desired value, e.g. `35`
- Set the environment variables
  - `JOB_NAME`: string, e.g. `alligator`
  - `JOB_SIZE`: int, e.g. `50` (this value needs to be processable within the timeout)

Our metrics show that we can process 10 jobs a minute. The default Cloud Run timeout is 10 minutes.