57 add documentation #71

Merged
merged 15 commits into from
May 6, 2024
8 changes: 7 additions & 1 deletion .gitignore
@@ -156,4 +156,10 @@ dmypy.json
cython_debug/

# VSCode
.vscode
.vscode


# Demo data
docs/notebooks/bag_light_AMS_WGS84.gpkg
docs/notebooks/stm.zarr
docs/notebooks/stm_ordered.zarr
4 changes: 3 additions & 1 deletion docs/CONTRIBUTING.md
@@ -42,4 +42,6 @@ to see if someone already filed the same issue;
- [push](http://rogerdudler.github.io/git-guide/) your feature branch to (your fork of) the stmtools repository on GitHub;
- create the pull request, e.g. following [the instructions: creating a pull request](https://help.github.com/articles/creating-a-pull-request/).

In case you feel like you've made a valuable contribution, but you don't know how to write or run tests for it, or how to generate the documentation: don't let this discourage you from making the pull request; we can help you! Just go ahead and submit the pull request, but keep in mind that you might be asked to append additional commits to your pull request.
In case you feel like you've made a valuable contribution, but you don't know how to write or run tests for it, or how to generate the documentation: don't let this discourage you from making the pull request; we can help you! Just go ahead and submit the pull request, but keep in mind that you might be asked to append additional commits to your pull request.

In case you want to add documentation and you don't have mkdocs installed in your root environment, you can install it by calling ```pip install -e .[docs]```. You can then test your documentation by calling ```mkdocs serve```.
Binary file added docs/img/Four-level_Z.png
@@ -5,7 +5,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Example Notebook\n",
"# Example Operations\n",
"\n",
"In this example notebook, we will demonstrate how to:\n",
"\n",
@@ -102,7 +102,7 @@
"source": [
"path_stm = Path('./stm.zarr')\n",
"stmat = xr.open_zarr(path_stm)\n",
"stmat = stmat.chunk({\"space\": 10000, \"time\":-1}) # Chunk 10000 space, no chunk in time\n",
"stmat = stmat.chunk({\"space\": 10000, \"time\": -1}) # Chunk 10000 space, no chunk in time\n",
"\n",
"print(stmat)"
]
570 changes: 570 additions & 0 deletions docs/notebooks/demo_order_stm.ipynb

Large diffs are not rendered by default.

610 changes: 610 additions & 0 deletions docs/notebooks/demo_order_stm_tmp.ipynb

Large diffs are not rendered by default.

8 changes: 4 additions & 4 deletions docs/operations.md
@@ -33,13 +33,13 @@ A subset of an STM can be obtained based on 1) thresholding on an attribute, or
For example, select entries with `pnt_enscoh` higher than 0.7:

```python
stmat_subset = stmat.stm.subset(method="threshold", var="pnt_enscoh", threshold='>0.7')
stmat_subset = stmat.stm.subset(method='threshold', var='pnt_enscoh', threshold='>0.7')
```

This is equivalent to Xarray filtering:

```python
mask = stmat["pnt_enscoh"] > 0.7
mask = stmat['pnt_enscoh'] > 0.7
mask = mask.compute()
stmat_subset = stmat.where(mask, drop=True)
```
@@ -68,7 +68,7 @@ Use `regulate_dims` to add a missing `space` or `time` dimension.
```python
# An STM without time dimension
nspace = 10
stm_only_space = xr.Dataset(data_vars=dict(data=(["space"], np.arange(nspace))))
stm_only_space = xr.Dataset(data_vars=dict(data=(['space'], np.arange(nspace))))

stm_only_space
```
@@ -98,6 +98,6 @@ Data variables:
Use `register_metadata` to assign metadata to an STM by a Python dictionary.

```python
metadata_normal = dict(techniqueId="ID0001", datasetId="ID_datasetID", crs=4326)
metadata_normal = dict(techniqueId='ID0001', datasetId='ID_datasetID', crs=4326)
stmat_with_metadata = stmat.stm.register_metadata(metadata_normal)
```
64 changes: 64 additions & 0 deletions docs/order.md
@@ -0,0 +1,64 @@
# Ordering an STM

STMTools supports (re)ordering the elements in an STM.


## Why element order is important

The data elements in an STM can be ordered according to a wide variety of aspects, such as time, horizontal before vertical, or classification. The choice of order can have a significant impact on the performance of operations applied to the data.

An important consideration is that operations often don't have to be applied to the complete dataset. An STM is always loaded per chunk, so it can be beneficial to collect elements in a chunk that should be processed together.

Our operations often prefer elements to be ordered by spatial coherency. For example, to enrich or subset an STM, the element positions will have to be compared to polygons. Ideally, we only want to process elements that are near the query polygon.
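
To make the chunking argument concrete, here is a minimal pure-Python sketch (not STMTools code). It assumes a 32x32 pixel grid, chunks of 64 elements, and a hypothetical helper `chunks_hit` that counts the chunks whose bounding box overlaps a query box. The `morton_code` helper is an illustrative Z-order key of the kind described in the next section; its bit layout is an assumption and may differ from what STMTools uses internally.

```python
import random


def morton_code(x, y, bits=16):
    """Illustrative Z-order key: interleave the bits of x and y."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # x bits on even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # y bits on odd positions
    return code


def chunks_hit(points, chunk_size, box):
    """Count the chunks whose bounding box overlaps the query box."""
    xmin, ymin, xmax, ymax = box
    hits = 0
    for start in range(0, len(points), chunk_size):
        chunk = points[start:start + chunk_size]
        xs = [p[0] for p in chunk]
        ys = [p[1] for p in chunk]
        overlaps_x = min(xs) <= xmax and max(xs) >= xmin
        overlaps_y = min(ys) <= ymax and max(ys) >= ymin
        if overlaps_x and overlaps_y:
            hits += 1
    return hits


# All pixels of a 32x32 grid, stored in an arbitrary (shuffled) order.
points = [(x, y) for x in range(32) for y in range(32)]
random.Random(0).shuffle(points)

query_box = (0, 0, 7, 7)  # xmin, ymin, xmax, ymax (inclusive)
shuffled_hits = chunks_hit(points, 64, query_box)

# The same pixels, sorted by the illustrative Z-order key.
ordered = sorted(points, key=lambda p: morton_code(*p))
ordered_hits = chunks_hit(ordered, 64, query_box)

print(shuffled_hits, ordered_hits)
```

In the spatially ordered layout each chunk of 64 elements covers exactly one 8x8 block of this grid, so only a single chunk overlaps the query box. In the shuffled layout the chunks are spread over the whole grid, so typically most of them have to be inspected.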


## How are elements reordered

When applying spatial ordering, we order the elements according to their Morton code. A Morton code is a single integer representation of a higher dimensional coordinate. The following image shows the sequence of Morton codes, drawn as a polyline, for a few small sets of 2D points.

![Morton orders](img/Four-level_Z.png)

The translation to Morton code can be direct when the point coordinates are integers, such as pixel coordinates. Floating point coordinates must be scaled and truncated to integer values first. The choice of scale factor determines the resolution of the Morton code.

Note that for a detailed dataset, a close group of points could be assigned the same Morton code, depending on the choice of scale factor. These points will be grouped together after ordering, but their internal order will not be strictly determined. In other words, we cannot determine beforehand what their order will be, but they will not be separated by points with a different Morton code.
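
As an illustration of the idea, the sketch below computes a 2D Morton code by interleaving the bits of two non-negative integer coordinates. It is a minimal example, not the STMTools implementation: the function name `morton_code`, the `bits` parameter, and the choice to put the x bits on the even positions are all assumptions made for this example.

```python
def morton_code(x: int, y: int, bits: int = 16) -> int:
    """Interleave the lowest `bits` bits of x and y into one integer."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # x bits go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # y bits go to odd positions
    return code


# Visit the pixels of a 4x4 grid in Morton order.
grid = [(x, y) for y in range(4) for x in range(4)]
z_order = sorted(grid, key=lambda p: morton_code(*p))

print(z_order[:4])  # [(0, 0), (1, 0), (0, 1), (1, 1)]
```

The first four points visited form one 2x2 block, which is the "Z" shape visible in the image above. Points that are close in 2D therefore tend to be close in the sorted sequence.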


## Ordering existing stmat

Reordering an existing STM is very straightforward.
If the coordinates are integer values, such as the pixel coordinates `X` and `Y`, the STM can be reordered as follows:

```python
stmat_ordered = stmat_xy.stm.reorder(xlabel='X', ylabel='Y')
```

If the coordinates are floating point coordinates, such as longitude and latitude, you must choose a scale factor for each coordinate such that points that are at least a unit distance apart in either direction can be differentiated by their Morton code. For example, one degree of latitude spans roughly 110 km, so a scale factor of `1.1*10^5` on the latitude coordinate means that points that are at least 1 meter apart will be assigned a different Morton code.

```python
stmat_ordered = stmat_lonlat.stm.reorder(xlabel='lon', ylabel='lat', xscale=1.5*(10**5), yscale=1.7*(10**5))
```
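
The effect of the scale factor can be checked with a few lines of arithmetic. The sketch below is a plain-Python illustration, not STMTools code: `to_grid` is a hypothetical helper that scales and truncates a coordinate as described above, and the two latitudes are made-up values about 2 m apart.

```python
def to_grid(value: float, scale: float) -> int:
    """Scale a floating point coordinate and truncate it to an integer."""
    return int(value * scale)


lat_a = 52.370004
lat_b = 52.370024  # roughly 2 m further north than lat_a

# About 1 m resolution: the two points land on different integers.
fine = (to_grid(lat_a, 1.1e5), to_grid(lat_b, 1.1e5))

# About 100 m resolution: both points collapse onto the same integer.
coarse = (to_grid(lat_a, 1.1e3), to_grid(lat_b, 1.1e3))

print(fine, coarse)
```

With the coarser scale factor both points would receive the same Morton code, so their relative order after sorting is not determined.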

Reordering the STM is actually a two-step process: computing the Morton codes and sorting the STM. You can also apply these steps separately:

```python
stmat_ordered = stmat_ar.stm.get_order(xlabel='azimuth', ylabel='range', xscale=15, yscale=17)
stmat_ordered = stmat_ordered.sortby(stmat_ordered.order)
```
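
The two steps can be mimicked with plain Python lists to see what happens to the data. This is only a sketch under assumptions: `morton_code` is an illustrative helper (its bit layout may differ from STMTools), and the three lists stand in for per-element variables along the `space` dimension.

```python
def morton_code(x, y, bits=16):
    """Illustrative Z-order key: interleave the bits of x and y."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


# Made-up per-element variables of a tiny dataset.
azimuth = [3, 0, 2, 1]
range_idx = [1, 0, 3, 2]
amplitude = [0.9, 0.1, 0.5, 0.7]

# Step 1: compute one order key per element.
order = [morton_code(a, r) for a, r in zip(azimuth, range_idx)]

# Step 2: derive the sorting permutation and apply it to every variable.
perm = sorted(range(len(order)), key=order.__getitem__)
azimuth_sorted = [azimuth[i] for i in perm]
range_sorted = [range_idx[i] for i in perm]
amplitude_sorted = [amplitude[i] for i in perm]

print(order)             # [7, 0, 14, 9]
print(amplitude_sorted)  # [0.1, 0.9, 0.7, 0.5]
```

Keeping the steps separate can be useful when you want to inspect or store the order key before committing to the sort.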


## Ordering new stmat

Reading and writing data to disk can cost a significant amount of time. It is usually beneficial not to overwrite existing data unless necessary. If you intend to apply spatial ordering to your STM, we advise doing so before writing your data to disk.

The following example selects some points of a sarxarray, reorders them, and only then writes them to disk:

```python
stmat = stack.slcstack.point_selection(threshold=0.25, method='amplitude_dispersion')
stmat = stmat.stm.reorder(xlabel='azimuth', ylabel='range', xscale=15, yscale=17)
stmat.to_zarr('stm.zarr')
```


## Effect on processing time

The example notebooks contain an example of the effect of ordering the STM on processing time: [Example Ordering notebook](./notebooks/demo_order_stm.ipynb)
5 changes: 4 additions & 1 deletion mkdocs.yml
@@ -9,7 +9,10 @@ nav:
- Usage:
- Initiate an STM: stm_init.md
- Operations on STM: operations.md
- Notebook page: notebooks/demo_stm.ipynb
- Ordering an STM: order.md
- Example Notebooks:
- Example Operations: notebooks/demo_operations_stm.ipynb
- Example Ordering: notebooks/demo_order_stm.ipynb
- Contributing guide: CONTRIBUTING.md
- Change Log: CHANGELOG.md

8 changes: 8 additions & 0 deletions stmtools/stm.py
@@ -397,7 +397,15 @@ def reorder(self, xlabel="azimuth", ylabel="range", xscale=1.0, yscale=1.0):
Scaling multiplier to the y coordinates before truncating them to integer values.
"""
self._obj = self.get_order(xlabel, ylabel, xscale, yscale)

# Sorting may split up the chunks, which may interfere with later operations, so we immediately restore the chunk sizes.
chunks = {
"space": self._obj.chunksizes["space"][0],
"time": self._obj.chunksizes["time"][0],
}
self._obj = self._obj.sortby(self._obj.order)
self._obj = self._obj.chunk(chunks)

return self._obj

@property