Repository for creating satellite image embeddings that can be used for a range of tasks.

barbarametzler/sat_embeddings

sat_emb

This repository creates image embeddings from a large satellite composite. The embeddings can be used for a variety of tasks, such as improving existing classification models, as well as for unsupervised tasks.

Installation

Please create the conda environment from environment.yml (for example, with conda env create -f environment.yml). This installs all the packages needed to run the code.

Dependencies

The embeddings code can be run with PyTorch + CUDA or on a regular CPU, depending on the computing power available.
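
As a minimal sketch of the CUDA-or-CPU choice described above, the helper below picks a device name at runtime. The function name pick_device is hypothetical and not part of this repository; it only assumes that PyTorch, if installed, exposes torch.cuda.is_available().

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu".

    Hypothetical helper for illustration; the repository scripts may
    select the device differently.
    """
    try:
        import torch  # imported lazily so the sketch also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Embeddings would run on: {pick_device()}")
```

The returned string can be passed to torch.device(...) when moving a model or tensors.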

Data download

Please download the data from the following link: GoogleDrive (email me for access).

Usage

1. Data preprocessing: create tiles

Create .tif tiles from the input .vrt file, which should contain the satellite composite. The tiles are saved in the output folder (passed as --folder in the command below). The h3_shapes_path should point to the H3 grid file, and the grid420_path should point to the grid file. The size is the dataset size, i.e. the number of tiles to create, which makes it easy to build a smaller subset. Set the mask parameter to True if the tiles should be masked with the H3 grid.

python create_tiles.py --size 9000000 --mask False --h3_shapes_path /data/vector/420_grid.parquet --grid420_path /data/vector/grid_complete.parquet --vrt_file /data/raster/all_GHS-composite-S2.vrt --folder /data/raster/england/unmasked/
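
For scripted runs, for instance to build a smaller subset, the same command can be assembled in Python. This is only a sketch: the function build_create_tiles_command is hypothetical, the flags and paths are copied from the command above, and the size of 10000 is an arbitrary illustrative value.

```python
import shlex


def build_create_tiles_command(size, mask, h3_shapes_path, grid420_path,
                               vrt_file, folder):
    """Assemble the create_tiles.py command line as a list of arguments."""
    return [
        "python", "create_tiles.py",
        "--size", str(size),
        "--mask", str(mask),
        "--h3_shapes_path", h3_shapes_path,
        "--grid420_path", grid420_path,
        "--vrt_file", vrt_file,
        "--folder", folder,
    ]


# Paths taken from the README command; size is an illustrative subset size.
cmd = build_create_tiles_command(
    size=10000,
    mask=False,
    h3_shapes_path="/data/vector/420_grid.parquet",
    grid420_path="/data/vector/grid_complete.parquet",
    vrt_file="/data/raster/all_GHS-composite-S2.vrt",
    folder="/data/raster/england/unmasked/",
)
print(shlex.join(cmd))  # shows the command without running it
```

To actually run it, pass cmd to subprocess.run(cmd, check=True) from the repository root.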

Note: The full England dataset contains 1,382,771 tiles. With clipped tiles, the full dataset is very large (about 700 GB).
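
The figures in the note give a rough way to budget disk space for a subset. The sketch below assumes that disk usage scales linearly with the number of tiles and that the roughly 700 GB refers to all 1,382,771 tiles; both are assumptions, and estimate_subset_size_gb is a hypothetical helper, not part of this repository.

```python
TOTAL_TILES = 1_382_771   # full England dataset, from the note above
TOTAL_SIZE_GB = 700       # approximate size of the full dataset


def estimate_subset_size_gb(n_tiles):
    """Rough disk estimate for a subset, assuming size scales linearly."""
    if n_tiles < 0:
        raise ValueError("n_tiles must be non-negative")
    # A size larger than the dataset cannot produce more than all tiles.
    n_tiles = min(n_tiles, TOTAL_TILES)
    return TOTAL_SIZE_GB * n_tiles / TOTAL_TILES


if __name__ == "__main__":
    for n in (10_000, 100_000, 9_000_000):
        print(f"{n:>9} tiles: about {estimate_subset_size_gb(n):.1f} GB")
```

Under these assumptions one tile takes roughly half a megabyte, so a --size value above 1,382,771 is estimated at the full dataset size.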

2. Create embeddings

Create embeddings by running the following command. The input_folder should contain the tiles created in the previous step. The output_file is the Parquet file that the embeddings are written to. The model_weights_path should point to the model weights file.

python create_embeddings_fpn_pool.py --input_folder /data/raster/england/unmasked/ --output_file /data/raster/england/unmasked/embeddings_england_pool_nomask.parquet --model_weights_path /data/model/satlas-model-v1-lowres.pth
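
Before starting a long embedding run, it can help to confirm that the input folder actually contains tiles. The sketch below counts .tif files with the standard library only. The helper count_tiles is hypothetical, and it assumes the tiles sit directly in the input folder rather than in subfolders.

```python
from pathlib import Path


def count_tiles(input_folder):
    """Count .tif/.tiff files directly inside input_folder (not recursive)."""
    folder = Path(input_folder)
    if not folder.is_dir():
        raise FileNotFoundError(f"Input folder not found: {folder}")
    return sum(
        1
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in {".tif", ".tiff"}
    )


if __name__ == "__main__":
    # Path taken from the README command above.
    try:
        print(count_tiles("/data/raster/england/unmasked/"), "tiles found")
    except FileNotFoundError as err:
        print(err)
```

A count of zero suggests that step 1 has not been run yet or that input_folder points to the wrong location.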
