Merge pull request #520 from mjohns-databricks/mjohns-0.4.0-docs-4
Mjohns 0.4.0 docs 4
Milos Colic authored Jan 25, 2024
2 parents 2a96291 + f308a49 commit 9e01d87
Showing 12 changed files with 185 additions and 150 deletions.
16 changes: 8 additions & 8 deletions README.md
@@ -147,14 +147,14 @@ __Note: Mosaic 0.4.x SQL bindings for DBR 13 not yet available in Unity Catalog

Here are some example notebooks, check the language links for latest [[Python](/notebooks/examples/python/) | [Scala](/notebooks/examples/scala/) | [SQL](/notebooks/examples/sql/) | [R](/notebooks/examples/R/)]:

-| Example | Description | Links |
-| --- | --- | --- |
-| __Quick Start__ | Example of performing spatial point-in-polygon joins on the NYC Taxi dataset | [python](/notebooks/examples/python/QuickstartNotebook.ipynb), [scala](notebooks/examples/scala/QuickstartNotebook.ipynb), [R](notebooks/examples/R/QuickstartNotebook.r), [SQL](notebooks/examples/sql/QuickstartNotebook.ipynb) |
-| Shapefiles | Examples of reading multiple shapefiles | [python](notebooks/examples/python/Shapefiles/) |
-| Spatial KNN | Runnable notebook-based example using Mosaic [SpatialKNN](https://databrickslabs.github.io/mosaic/models/spatial-knn.html) model | [python](notebooks/examples/python/SpatialKNN) |
-| NetCDF | Read multiple NetCDFs, process through various data engineering steps before analyzing and rendering | [python](notebooks/examples/python/NetCDF/) |
-| STS Transfers | Detecting Ship-to-Ship transfers at scale by leveraging Mosaic to process AIS data. | [python](notebooks/examples/python/Ship2ShipTransfers), [blog](https://medium.com/@timo.roest/ship-to-ship-transfer-detection-b370dd9d43e8) |
-| EO Gridded STAC | End-to-end Earth Observation series showing downloading Sentinel-2 STAC assets for Alaska from [MSFT Planetary Computer](https://planetarycomputer.microsoft.com/), tiling them to H3 global grid, band stacking, NDVI, merging (mosaicing), clipping, and applying a [Segment Anything Model](https://huggingface.co/facebook/sam-vit-huge) | [python](notebooks/examples/python/EarthObservation/EOGriddedSTAC) |
+| Example | Description | Links |
+| --- | --- |----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| __Quick Start__ | Example of performing spatial point-in-polygon joins on the NYC Taxi dataset | [python](/notebooks/examples/python/Quickstart/QuickstartNotebook.ipynb), [scala](notebooks/examples/scala/QuickstartNotebook.ipynb), [R](notebooks/examples/R/QuickstartNotebook.r), [SQL](notebooks/examples/sql/QuickstartNotebook.ipynb) |
+| Shapefiles | Examples of reading multiple shapefiles | [python](notebooks/examples/python/Shapefiles/) |
+| Spatial KNN | Runnable notebook-based example using Mosaic [SpatialKNN](https://databrickslabs.github.io/mosaic/models/spatial-knn.html) model | [python](notebooks/examples/python/SpatialKNN) |
+| NetCDF | Read multiple NetCDFs, process through various data engineering steps before analyzing and rendering | [python](notebooks/examples/python/NetCDF/) |
+| STS Transfers | Detecting Ship-to-Ship transfers at scale by leveraging Mosaic to process AIS data. | [python](notebooks/examples/python/Ship2ShipTransfers), [blog](https://medium.com/@timo.roest/ship-to-ship-transfer-detection-b370dd9d43e8) |
+| EO Gridded STAC | End-to-end Earth Observation series showing downloading Sentinel-2 STAC assets for Alaska from [MSFT Planetary Computer](https://planetarycomputer.microsoft.com/), tiling them to H3 global grid, band stacking, NDVI, merging (mosaicing), clipping, and applying a [Segment Anything Model](https://huggingface.co/facebook/sam-vit-huge) | [python](notebooks/examples/python/EarthObservation/EOGriddedSTAC) |

You can import those examples in Databricks workspace using [these instructions](https://docs.databricks.com/en/notebooks/index.html).

7 changes: 4 additions & 3 deletions docs/source/api/raster-format-readers.rst
@@ -25,8 +25,9 @@ Mosaic provides spark readers for the following raster formats:
Other formats are supported if supported by GDAL available drivers.

Mosaic provides two flavors of the readers:
-* spark.read.format("gdal") for reading 1 file per spark task
-* mos.read().format("raster_to_grid") reader that automatically converts raster to grid.
+
+* :code:`spark.read.format("gdal")` for reading 1 file per spark task
+* :code: `mos.read().format("raster_to_grid")` reader that automatically converts raster to grid.


spark.read.format("gdal")
@@ -91,7 +92,7 @@ mos.read().format("raster_to_grid")
***********************************
Reads a GDAL raster file and converts it to a grid.
It uses a pattern similar to standard spark.read.format(*).option(*).load(*) pattern.
-The only difference is that it uses mos.read() instead of spark.read().
+The only difference is that it uses :code:`mos.read()` instead of :code:`spark.read()`.
The raster pixels are converted to grid cells using specified combiner operation (default is mean).
If the raster pixels are larger than the grid cells, the cell values can be calculated using interpolation.
The interpolation method used is Inverse Distance Weighting (IDW) where the distance function is a k_ring
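Reviewer note on the hunk above: the combiner and interpolation steps it describes can be illustrated with a small, Mosaic-independent sketch. This is not the library's implementation. The function names, the `(col, row)` cell representation, and the `ring_distance` helper are all hypothetical, and because the diff context is cut off after "k_ring", the Chebyshev distance used here is only a stand-in for whatever ring-based distance Mosaic applies.

```python
from collections import defaultdict
from statistics import mean


def combine_mean(pixels):
    """Group (cell_id, value) pairs by cell and average the values per cell.

    Mirrors the idea of a 'mean' combiner; the data layout is an assumption.
    """
    by_cell = defaultdict(list)
    for cell_id, value in pixels:
        by_cell[cell_id].append(value)
    return {cell_id: mean(values) for cell_id, values in by_cell.items()}


def ring_distance(a, b):
    """Hypothetical stand-in for a grid ring distance.

    Uses Chebyshev distance between (col, row) cells on a square grid.
    """
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def idw(known, target, power=2.0):
    """Inverse Distance Weighting over known cell values.

    Each known value is weighted by 1 / distance**power.
    """
    if not known:
        raise ValueError("at least one known cell value is required")
    numerator = 0.0
    denominator = 0.0
    for cell, value in known.items():
        d = ring_distance(cell, target)
        if d == 0:
            return value  # the target coincides with a known cell
        weight = 1.0 / d ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator
```

For example, `combine_mean` would cover the case where several pixels fall into one cell, while `idw` would cover the case where pixels are larger than cells and some cells receive no pixel value directly.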
33 changes: 18 additions & 15 deletions docs/source/api/raster-functions.rst
@@ -6,20 +6,22 @@ Intro
################
Raster functions are available in mosaic if you have installed the optional dependency `GDAL`.
Please see :doc:`Install and Enable GDAL with Mosaic </usage/install-gdal>` for installation instructions.
-Mosaic provides several unique raster functions that are not available in other Spark packages.
-Mainly raster to grid functions, which are useful for reprojecting the raster data into a standard grid index system.
-This is useful for performing spatial joins between raster data and vector data.
-Mosaic also provides a scalable retiling function that can be used to retile raster data in case of bottlenecking due to large files.
-All raster functions respect the \"rst\_\" prefix naming convention.
-Mosaic is operating using raster tile objects only since 0.3.11. Tile objects are created using functions such as rst_fromfile(path_to_raster)
-or rst_fromcontent(raster_bin, driver). These functions are used as places to start when working with initial data.
-If you use spark.read.format("gdal") tiles are automatically generated for you.
-Also, scala does not have a df.display method while python does. In practice you would most often call display(df) in
-scala for a prettier output, but for brevity, we write df.show in scala.
+
+* Mosaic provides several unique raster functions that are not available in other Spark packages.
+Mainly raster to grid functions, which are useful for reprojecting the raster data into a standard grid index
+system. This is useful for performing spatial joins between raster data and vector data.
+* Mosaic also provides a scalable retiling function that can be used to retile raster data in case of bottlenecking
+due to large files.
+* All raster functions respect the :code:`rst_` prefix naming convention.
+* Mosaic is operating using raster tile objects only since 0.3.11. Tile objects are created using functions such as
+:code:`rst_fromfile` or :code:`rst_fromcontent`. These functions are used as places to start when working with
+initial data. If you use :code:`spark.read.format("gdal")` tiles are automatically generated for you.
+* Also, scala does not have a :code:`df.display()` method while python does. In practice you would most often call
+:code:`display(df)` in scala for a prettier output, but for brevity, we write :code:`df.show` in scala.

.. note:: For mosaic versions > 0.4.0 you can use the revamped setup_gdal function or new setup_fuse_install.
-These functions will configure an init script in your preferred Workspace, Volume, or DBFS location to install GDAL on your cluster.
-See :doc:`Install and Enable GDAL with Mosaic </usage/install-gdal>` for more details.
+These functions will configure an init script in your preferred Workspace, Volume, or DBFS location to install GDAL
+on your cluster. See :doc:`Install and Enable GDAL with Mosaic </usage/install-gdal>` for more details.

rst_bandmetadata
****************
@@ -190,7 +192,7 @@ rst_combineavg
The output raster will have the same pixel type as the input rasters.
The output raster will have the same pixel size as the input rasters.
The output raster will have the same coordinate reference system as the input rasters.
-Also, see :doc:`rst_combineavg_agg </api/spatial-aggregations>` function.
+Also, see :doc:`rst_combineavg_agg </api/spatial-aggregations#rst-combineavg-agg>` function.

:param tiles: A column containing an array of raster tiles.
:type tiles: Column (ArrayType(RasterTileType))
@@ -244,7 +246,7 @@ rst_derivedband
The output raster will have the same pixel type as the input rasters.
The output raster will have the same pixel size as the input rasters.
The output raster will have the same coordinate reference system as the input rasters.
-Also, see :doc:`rst_derivedband_agg </api/spatial-aggregations>` function.
+Also, see :doc:`rst_derivedband_agg </api/spatial-aggregations#rst-derivedband-agg>` function.

:param tiles: A column containing an array of raster tiles.
:type tiles: Column (ArrayType(RasterTileType))
@@ -298,6 +300,7 @@ rst_derivedband
+----------------------------------------------------------------------------------------------------------------+

.. code-tab:: sql
+
SELECT
rst_derivedband(array(tile1,tile2,tile3)) as tiles,
"""
@@ -876,7 +879,7 @@ rst_merge
The output raster will have the same pixel type as the input rasters.
The output raster will have the same pixel size as the highest resolution input rasters.
The output raster will have the same coordinate reference system as the input rasters.
-Also, see :doc:`rst_merge_agg </api/spatial-aggregations>` function.
+Also, see :doc:`rst_merge_agg </api/spatial-aggregations#rst-merge-agg>` function.

:param tiles: A column containing an array of raster tiles.
:type tiles: Column (ArrayType(RasterTileType))
4 changes: 2 additions & 2 deletions docs/source/api/spatial-aggregations.rst
@@ -211,7 +211,7 @@ st_intersects_aggregate

.. function:: st_intersects_agg(leftIndex, rightIndex)

-Returns `true` if any of the `leftIndex` and `rightIndex` pairs intersect.
+Returns :code:`true` if any of the :code:`leftIndex` and :code:`rightIndex` pairs intersect.

:param leftIndex: Geometry
:type leftIndex: Column
@@ -301,7 +301,7 @@ st_intersection_agg

.. function:: st_intersection_agg(leftIndex, rightIndex)

-Computes the intersections of `leftIndex` and `rightIndex` and returns the union of these intersections.
+Computes the intersections of :code:`leftIndex` and :code:`rightIndex` and returns the union of these intersections.

:param leftIndex: Geometry
:type leftIndex: Column
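Reviewer note on the two aggregation descriptions above: their semantics ("true if any pair intersects" and "union of the per-pair intersections") can be sketched with plain Python sets. This is a toy model only. Each geometry is represented as a set of hypothetical unit-cell ids rather than a real Mosaic chip or WKB geometry, and the function names merely echo the documented ones.

```python
def intersects_agg(pairs):
    """Return True if any (left, right) pair shares at least one cell."""
    return any(not left.isdisjoint(right) for left, right in pairs)


def intersection_agg(pairs):
    """Return the union of the intersections of every (left, right) pair."""
    result = set()
    for left, right in pairs:
        result |= left & right
    return result


# Toy data: each set stands in for one geometry chip (an assumption).
pairs = [({1, 2}, {3}), ({4, 5}, {5, 6})]
overlap_found = intersects_agg(pairs)
shared_cells = intersection_agg(pairs)
```

The first pair does not overlap and the second shares one cell, so the aggregate reports an intersection even though only one of the pairs contributes to the result.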