Merge pull request #521 from mjohns-databricks/mjohns-0.4.0-docs-5

Mjohns 0.4.0 docs 5

Milos Colic authored Jan 26, 2024
2 parents 9e01d87 + 827f3a8 commit 6b9a8f9
Showing 11 changed files with 101 additions and 60 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -57,9 +57,9 @@ __Language Bindings__

As of Mosaic 0.4.0 (subject to change in follow-on releases)...

* _Mosaic SQL expressions cannot yet be registered with [Unity Catalog](https://www.databricks.com/product/unity-catalog) due to API changes affecting DBRs >= 13._
* [Assigned Clusters](https://docs.databricks.com/en/compute/configure.html#access-modes): Mosaic Python, R, and Scala APIs.
* [Assigned Clusters](https://docs.databricks.com/en/compute/configure.html#access-modes): Mosaic Python, SQL, R, and Scala APIs.
* [Shared Access Clusters](https://docs.databricks.com/en/compute/configure.html#access-modes): Mosaic Scala API (JVM) with Admin [allowlisting](https://docs.databricks.com/en/data-governance/unity-catalog/manage-privileges/allowlist.html); _Python bindings to Mosaic Scala APIs are blocked by Py4J Security on Shared Access Clusters._
* Mosaic SQL expressions cannot yet be registered with [Unity Catalog](https://www.databricks.com/product/unity-catalog) due to API changes affecting DBRs >= 13, more [here](https://docs.databricks.com/en/udf/index.html).

__Additional Notes:__

@@ -141,7 +141,7 @@ import com.databricks.labs.mosaic.JTS
val mosaicContext = MosaicContext.build(H3, JTS)
mosaicContext.register(spark)
```
__Note: Mosaic 0.4.x SQL bindings for DBR 13 not yet available in Unity Catalog due to API changes.__
__Note: Mosaic 0.4.x SQL bindings for DBR 13 can register with Assigned clusters, but not Shared Access due to API changes, more [here](https://docs.databricks.com/en/udf/index.html).__
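
For the Python API, a minimal setup sketch, assuming `databricks-mosaic` is already installed on the cluster (verify against the installation docs for your Mosaic/DBR combination):

```python
# Python-side setup sketch; enable_mosaic registers the Mosaic functions for this session.
from mosaic import enable_mosaic

enable_mosaic(spark, dbutils)
```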

## Examples

22 changes: 16 additions & 6 deletions docs/source/api/raster-format-readers.rst
@@ -27,7 +27,7 @@ Other formats are supported if supported by GDAL available drivers.
Mosaic provides two flavors of the readers:

* :code:`spark.read.format("gdal")` for reading 1 file per spark task
* :code: `mos.read().format("raster_to_grid")` reader that automatically converts raster to grid.
* :code:`mos.read().format("raster_to_grid")` reader that automatically converts raster to grid.


spark.read.format("gdal")
@@ -48,10 +48,10 @@ The output of the reader is a DataFrame with the following columns:
.. function:: spark.read.format("gdal").load(path)

Loads a GDAL raster file and returns the result as a DataFrame.
It uses standard spark.read.format(*).option(*).load(*) pattern.
It uses standard :code:`spark.read.format(*).option(*).load(*)` pattern.

:param path: path to the raster file on dbfs
:type path: *StringType
:type path: Column(StringType)
:rtype: DataFrame

:example:
@@ -81,6 +81,11 @@ The output of the reader is a DataFrame with the following columns:
| {index_id: 593308294097928191, raster: [00 01 10 ... 00], parentPath: "dbfs:/path_to_file", driver: "GTiff" } | 100 | 100 | 1 | {AREA_OR_POINT=Po...| null| 4326| +proj=longlat +da...|
+---------------------------------------------------------------------------------------------------------------+------+------+----------+---------------------+--------------------+-----+----------------------+

.. note::
Keyword options not identified in the function signature are converted to a :code:`Map<String,String>`.
These must be supplied as :code:`String` values.
Function signature values may also be supplied as :code:`String`.
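
A minimal Python sketch of the pattern above; the path and driver value are placeholders rather than part of the documented example:

.. code-block:: python

    # Hypothetical path and driver; options outside the function signature are
    # passed as Strings and collected into the Map<String,String> described above.
    df = (
        spark.read.format("gdal")
        .option("driverName", "GTiff")
        .load("dbfs:/path/to/rasters/")
    )
    df.show()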

.. warning::
Issue 350: https://github.com/databrickslabs/mosaic/issues/350
The raster reader 'driverName' option has to match the names provided in the above list.
@@ -113,10 +118,10 @@ The reader supports the following options:
.. function:: mos.read().format("raster_to_grid").load(path)

Loads a GDAL raster file and returns the result as a DataFrame.
It uses standard mos.read().format(*).option(*).load(*) pattern.
It uses standard :code:`mos.read().format(*).option(*).load(*)` pattern.

:param path: path to the raster file on dbfs
:type path: *StringType
:type path: Column(StringType)
:rtype: DataFrame

:example:
@@ -162,7 +167,12 @@ The reader supports the following options:
| 1| 4|0.2464000000000000|
+--------+--------+------------------+
.. note::
Keyword options not identified in the function signature are converted to a :code:`Map<String,String>`.
These must be supplied as :code:`String` values.
Function signature values may also be supplied as :code:`String`.
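
A short Python sketch of the same pattern; the path and option values are illustrative, so check the options table above for the exact names supported in your version:

.. code-block:: python

    import mosaic as mos

    # All option values are Strings, per the note above; path and values are placeholders.
    df = (
        mos.read().format("raster_to_grid")
        .option("fileExtension", "*.tif")
        .option("resolution", "4")
        .load("dbfs:/path/to/rasters/")
    )
    df.show()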

.. warning::
Issue 350: https://github.com/databrickslabs/mosaic/issues/350
The option 'fileExtension' expects a wildcard mask. Please use the following format: '*.tif' or equivalent for other formats.
If you use 'tif' without the wildcard, the reader won't pick up any files and you will get an empty table as a result.
6 changes: 3 additions & 3 deletions docs/source/api/raster-functions.rst
@@ -192,7 +192,7 @@ rst_combineavg
The output raster will have the same pixel type as the input rasters.
The output raster will have the same pixel size as the input rasters.
The output raster will have the same coordinate reference system as the input rasters.
Also, see :doc:`rst_combineavg_agg </api/spatial-aggregations#rst-combineavg-agg>` function.
Also, see :ref:`rst_combineavg_agg` function.

:param tiles: A column containing an array of raster tiles.
:type tiles: Column (ArrayType(RasterTileType))
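
A short Python usage sketch; the tile column names are hypothetical:

.. code-block:: python

    from pyspark.sql import functions as F
    import mosaic as mos

    # "tile1" and "tile2" are placeholder raster tile columns; the array of tiles
    # is combined into a single tile by averaging pixel values.
    df.select(mos.rst_combineavg(F.array("tile1", "tile2")).alias("tile"))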
@@ -246,7 +246,7 @@ rst_derivedband
The output raster will have the same pixel type as the input rasters.
The output raster will have the same pixel size as the input rasters.
The output raster will have the same coordinate reference system as the input rasters.
Also, see :doc:`rst_derivedband_agg </api/spatial-aggregations#rst-derivedband-agg>` function.
Also, see :ref:`rst_derivedband_agg` function.

:param tiles: A column containing an array of raster tiles.
:type tiles: Column (ArrayType(RasterTileType))
@@ -879,7 +879,7 @@ rst_merge
The output raster will have the same pixel type as the input rasters.
The output raster will have the same pixel size as the highest resolution input rasters.
The output raster will have the same coordinate reference system as the input rasters.
Also, see :doc:`rst_merge_agg </api/spatial-aggregations#rst-merge-agg>` function.
Also, see :ref:`rst_merge_agg` function.

:param tiles: A column containing an array of raster tiles.
:type tiles: Column (ArrayType(RasterTileType))
18 changes: 8 additions & 10 deletions docs/source/api/spatial-functions.rst
@@ -179,7 +179,7 @@ st_bufferloop

.. function:: st_bufferloop(col, innerRadius, outerRadius)

Returns a difference between st_buffer(col, outerRadius) and st_buffer(col, innerRadius).
Returns a difference between :code:`st_buffer(col, outerRadius)` and :code:`st_buffer(col, innerRadius)`.
The resulting geometry is a loop with a width of outerRadius - innerRadius.

:param col: Geometry
@@ -930,7 +930,7 @@ st_intersection
.. function:: st_intersection(geom1, geom2)

Returns a geometry representing the intersection of :code:`geom1` and :code:`geom2`.
Also, see :doc:`st_intersection_agg </api/spatial-aggregations#st-intersection-agg>` function.
Also, see :ref:`st_intersection_agg` function.

:param geom1: Geometry
:type geom1: Column
@@ -1411,7 +1411,7 @@ st_setsrid
+---------------------------------+

.. note::
ST_SetSRID does not transform the coordinates of :code:`geom`,
:ref:`st_setsrid` does not transform the coordinates of :code:`geom`,
rather it tells Mosaic the SRID in which the current coordinates are expressed.
:ref:`st_setsrid` can only operate on geometries encoded in GeoJSON.

@@ -1470,8 +1470,6 @@ st_simplify
| LINESTRING (0 1, 1 2, 3 0) |
+----------------------------+

.. note::
The specified tolerance will be ignored by the ESRI geometry API.

st_srid
*******
@@ -1531,7 +1529,7 @@ st_srid
+------------+

.. note::
ST_SRID can only operate on geometries encoded in GeoJSON.
:ref:`st_srid` can only operate on geometries encoded in GeoJSON.


st_transform
@@ -1578,7 +1576,7 @@ st_transform

.. code-tab:: sql

select st_astext(st_transform(st_setsrid(st_geomfromwkt("MULTIPOINT ((10 40), (40 30), (20 20), (30 10))"), 4326) as geom, 3857))
select st_astext(st_transform(st_setsrid(st_asgeojson("MULTIPOINT ((10 40), (40 30), (20 20), (30 10))"), 4326) as geom, 3857))
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|convert_to(st_transform(geom, 3857)) |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
@@ -1600,8 +1598,8 @@ st_transform
.. note::
If :code:`geom` does not have an associated SRID, use :ref:`st_setsrid` to set this before calling :ref:`st_transform`.
**Changed in 0.4 series** :ref:`st_srid`, :ref:`st_setsrid`, and :ref:`st_transform` only operate on
GeoJSON (columnar) data, so be sure to call :ref:`/api/geometry-accessors#st_asgeojson` to convert from WKT and WKB. You can convert
back after the transform, e.g. using :ref:`/api/geometry-accessors#st_astext` or :ref:`/api/geometry-accessors#st_asbinary`.
GeoJSON (columnar) data, so be sure to call :ref:`st_asgeojson` to convert from WKT and WKB. You can convert
back after the transform, e.g. using :ref:`st_astext` or :ref:`st_asbinary`.
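
A Python sketch of that round trip; the column names are hypothetical:

.. code-block:: python

    from pyspark.sql.functions import lit
    import mosaic as mos

    # WKT -> GeoJSON -> set SRID -> transform -> back to WKT; "geom_wkt" is a placeholder column.
    df = (
        df.withColumn("geom", mos.st_asgeojson("geom_wkt"))
          .withColumn("geom", mos.st_setsrid("geom", lit(4326)))
          .withColumn("geom", mos.st_transform("geom", lit(3857)))
          .withColumn("geom_wkt_3857", mos.st_astext("geom"))
    )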


st_translate
@@ -1667,7 +1665,7 @@ st_union
.. function:: st_union(left_geom, right_geom)

Returns the point set union of the input geometries.
Also, see :doc:`st_union_agg </api/spatial-aggregations#st-union-agg>` function.
Also, see :ref:`st_union_agg` function.

:param left_geom: Geometry
:type left_geom: Column
4 changes: 2 additions & 2 deletions docs/source/api/spatial-indexing.rst
@@ -855,7 +855,7 @@ grid_cell_intersection
.. function:: grid_cell_intersection(left_chip, right_chip)

Returns the chip representing the intersection of two chips based on the same grid cell.
Also, see :doc:`grid_cell_intersection_agg </api/spatial-aggregations#grid-cell-intersection-agg>` function.
Also, see :ref:`grid_cell_intersection_agg` function.

:param left_chip: Chip
:type left_chip: Column: ChipType(LongType)
@@ -911,7 +911,7 @@ grid_cell_union
.. function:: grid_cell_union(left_chip, right_chip)

Returns the chip representing the union of two chips based on the same grid cell.
Also, see :doc:`grid_cell_union_agg </api/spatial-aggregations#grid-cell-union-agg>` function.
Also, see :ref:`grid_cell_union_agg` function.

:param left_chip: Chip
:type left_chip: Column: ChipType(LongType)
2 changes: 1 addition & 1 deletion docs/source/api/spatial-predicates.rst
@@ -67,7 +67,7 @@ st_intersects
.. function:: st_intersects(geom1, geom2)

Returns true if the geometry :code:`geom1` intersects :code:`geom2`.
Also, see :doc:`st_intersects_agg </api/spatial-aggregations#st-intersects-agg>` function.
Also, see :ref:`st_intersects_agg` function.

:param geom1: Geometry
:type geom1: Column
24 changes: 23 additions & 1 deletion docs/source/api/vector-format-readers.rst
@@ -61,6 +61,7 @@ The reader supports the following options:
Loads a vector file and returns the result as a :class:`DataFrame`.

:param path: the path of the vector file
:type path: Column(StringType)
:return: :class:`DataFrame`

:example:
@@ -97,6 +98,11 @@ The reader supports the following options:
| "description"| 3| ... | POINT (3.0 3.0) | 4326|
+--------------------+-------+-----+-----------------+-----------+

.. note::
Keyword options not identified in the function signature are converted to a :code:`Map<String,String>`.
These must be supplied as :code:`String` values.
Function signature values may also be supplied as :code:`String`.
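
A minimal Python sketch for this reader; the path and option values are placeholders (see the options table above for supported names):

.. code-block:: python

    # Options outside the function signature are passed as Strings, per the note above.
    df = (
        spark.read.format("ogr")
        .option("driverName", "GPKG")
        .option("layerName", "points")
        .load("dbfs:/path/to/vectors/")
    )
    df.show()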


mos.read().format("multi_read_ogr")
***********************************
@@ -128,6 +134,7 @@ and parsed into expected types on execution. The reader supports the following o
Loads a vector file and returns the result as a :class:`DataFrame`.

:param path: the path of the vector file
:type path: Column(StringType)
:return: :class:`DataFrame`

:example:
@@ -164,6 +171,9 @@ and parsed into expected types on execution. The reader supports the following o
| "description"| 3| ... | POINT (3.0 3.0) | 4326|
+--------------------+-------+-----+-----------------+-----------+

.. note::
All options are converted to a :code:`Map<String,String>` and must be supplied as a :code:`String`.
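
A minimal Python sketch; the path and values are placeholders, and note that numeric options are still passed as Strings:

.. code-block:: python

    import mosaic as mos

    # Every option, including numeric ones such as chunkSize, is supplied as a String.
    df = (
        mos.read().format("multi_read_ogr")
        .option("driverName", "GPKG")
        .option("chunkSize", "500")
        .load("dbfs:/path/to/vectors/")
    )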


spark.read.format("geo_db")
*****************************
@@ -182,6 +192,7 @@ The reader supports the following options:
Loads a GeoDB file and returns the result as a :class:`DataFrame`.

:param path: the path of the GeoDB file
:type path: Column(StringType)
:return: :class:`DataFrame`

:example:
@@ -217,6 +228,11 @@ The reader supports the following options:
| "description"| 3| ... | POINT (3.0 3.0) | 4326|
+--------------------+-------+-----+-----------------+-----------+

.. note::
Keyword options not identified in the function signature are converted to a :code:`Map<String,String>`.
These must be supplied as :code:`String` values.
Function signature values may also be supplied as :code:`String`.
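
A minimal Python sketch; the path and layer name are placeholders:

.. code-block:: python

    # vsizip (as a String) lets the reader open a zipped geodatabase; see the options table above.
    df = (
        spark.read.format("geo_db")
        .option("vsizip", "true")
        .option("layerName", "my_layer")
        .load("dbfs:/path/to/geodatabase/")
    )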


spark.read.format("shapefile")
********************************
@@ -235,6 +251,7 @@ The reader supports the following options:
Loads a Shapefile and returns the result as a :class:`DataFrame`.

:param path: the path of the Shapefile
:type path: Column(StringType)
:return: :class:`DataFrame`

:example:
@@ -268,4 +285,9 @@ The reader supports the following options:
| "description"| 1| ... | POINT (1.0 1.0) | 4326|
| "description"| 2| ... | POINT (2.0 2.0) | 4326|
| "description"| 3| ... | POINT (3.0 3.0) | 4326|
+--------------------+-------+-----+-----------------+-----------+

.. note::
Keyword options not identified in the function signature are converted to a :code:`Map<String,String>`.
These must be supplied as :code:`String` values.
Function signature values may also be supplied as :code:`String`.
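
A minimal Python sketch; the path is a placeholder and options follow the table above:

.. code-block:: python

    # Options are passed as Strings; layerNumber selects the layer to read.
    df = (
        spark.read.format("shapefile")
        .option("layerNumber", "0")
        .load("dbfs:/path/to/shapefiles/")
    )
    df.show()
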
23 changes: 14 additions & 9 deletions docs/source/index.rst
@@ -45,6 +45,7 @@ We currently recommend using Databricks Runtime with Photon enabled;
this will leverage the Databricks H3 expressions when using the H3 grid system.

Mosaic provides:

* easy conversion between common spatial data encodings (WKT, WKB and GeoJSON);
* constructors to easily generate new geometries from Spark native data types;
* many of the OGC SQL standard :code:`ST_` functions implemented as Spark Expressions for transforming, aggregating and joining spatial datasets;
@@ -55,27 +56,30 @@ Mosaic provides:
.. note::
For Mosaic versions < 0.4 please use the `0.3 docs <https://databrickslabs.github.io/mosaic/v0.3.x/index.html>`_.

.. warning::
At times, it is useful to "hard refresh" pages to ensure your cached local copy matches the latest live version;
more `here <https://www.howtogeek.com/672607/how-to-hard-refresh-your-web-browser-to-bypass-your-cache/>`_.

Version 0.4.x Series
====================

We recommend using Databricks Runtime version 13.3 LTS with Photon enabled.

.. warning::
Mosaic 0.4.x series only supports DBR 13.x DBRs.
If running on a different DBR it will throw an exception:

**DEPRECATION ERROR: Mosaic v0.4.x series only supports Databricks Runtime 13. You can specify `%pip install 'databricks-mosaic<0.4,>=0.3'` for DBR < 13.**
Mosaic 0.4.x series only supports DBR 13.x DBRs. If running on a different DBR it will throw an exception:

.. warning::
Mosaic 0.4.x series issues the following ERROR on a standard, non-Photon cluster `ADB <https://learn.microsoft.com/en-us/azure/databricks/runtime/>`_ | `AWS <https://docs.databricks.com/runtime/index.html/>`_ | `GCP <https://docs.gcp.databricks.com/runtime/index.html/>`_ :
**DEPRECATION ERROR: Mosaic v0.4.x series only supports Databricks Runtime 13. You can specify `%pip install 'databricks-mosaic<0.4,>=0.3'` for DBR < 13.**

**DEPRECATION ERROR: Please use a Databricks Photon-enabled Runtime for performance benefits or Runtime ML for spatial AI benefits; Mosaic 0.4.x series restricts executing this cluster.**
Mosaic 0.4.x series issues the following ERROR on a standard, non-Photon cluster `ADB <https://learn.microsoft.com/en-us/azure/databricks/runtime/>`_ | `AWS <https://docs.databricks.com/runtime/index.html/>`_ | `GCP <https://docs.gcp.databricks.com/runtime/index.html/>`_ :

**DEPRECATION ERROR: Please use a Databricks Photon-enabled Runtime for performance benefits or Runtime ML for spatial AI benefits; Mosaic 0.4.x series restricts executing this cluster.**

As of Mosaic 0.4.0 (subject to change in follow-on releases)
* No Mosaic SQL expressions cannot yet be registered with `Unity Catalog <https://www.databricks.com/product/unity-catalog>`_ due to API changes affecting DBRs >= 13.
* `Assigned Clusters <https://docs.databricks.com/en/compute/configure.html#access-modes>`_ : Mosaic Python, R, and Scala APIs.

* `Assigned Clusters <https://docs.databricks.com/en/compute/configure.html#access-modes>`_ : Mosaic Python, SQL, R, and Scala APIs.
* `Shared Access Clusters <https://docs.databricks.com/en/compute/configure.html#access-modes>`_ : Mosaic Scala API (JVM) with Admin `allowlisting <https://docs.databricks.com/en/data-governance/unity-catalog/manage-privileges/allowlist.html>`_ ; Python bindings to Mosaic Scala APIs are blocked by Py4J Security on Shared Access Clusters.
- Mosaic SQL expressions cannot yet be registered with `Unity Catalog <https://www.databricks.com/product/unity-catalog>`_
due to API changes affecting DBRs >= 13, more `here <https://docs.databricks.com/en/udf/index.html>`_.

.. note::
As of Mosaic 0.4.0 (subject to change in follow-on releases)
@@ -96,6 +100,7 @@ For Mosaic versions < 0.4.0 please use the `0.3.x docs <https://databrickslabs.g
As of the 0.3.11 release, Mosaic issues the following WARNING when initialized on a cluster that is neither Photon Runtime nor Databricks Runtime ML `ADB <https://learn.microsoft.com/en-us/azure/databricks/runtime/>`_ | `AWS <https://docs.databricks.com/runtime/index.html/>`_ | `GCP <https://docs.gcp.databricks.com/runtime/index.html/>`_ :

**DEPRECATION WARNING: Please use a Databricks Photon-enabled Runtime for performance benefits or Runtime ML for spatial AI benefits; Mosaic will stop working on this cluster after v0.3.x.**

If you are receiving this warning in v0.3.11+, you will want to begin to plan for a supported runtime. The reason we are making this change is that we are streamlining Mosaic internals to be more aligned with future product APIs which are powered by Photon. Along this direction of change, Mosaic has standardized to JTS as its default and supported Vector Geometry Provider.

