I'm trying to track down the cause of some strange behavior when clipping. First, the data with no masking:

>>> rx.open_rasterio('test.tif')
<xarray.DataArray (band: 1, y: 7527, x: 5186)>
[39035022 values with dtype=int32]
Coordinates:
* band (band) int64 1
* x (x) float64 -1.419e+06 -1.419e+06 ... -1.264e+06 -1.264e+06
* y (y) float64 3.025e+06 3.025e+06 ... 2.799e+06 2.799e+06
spatial_ref int64 0
Attributes:
scale_factor: 1.0
add_offset: 0.0
long_name: elevation
>>> rx.open_rasterio('test.tif').dtype
dtype('int32')

Now masked:

>>> rx.open_rasterio('test.tif', masked=True)
<xarray.DataArray (band: 1, y: 7527, x: 5186)>
[39035022 values with dtype=float64]
Coordinates:
* band (band) int64 1
* x (x) float64 -1.419e+06 -1.419e+06 ... -1.264e+06 -1.264e+06
* y (y) float64 3.025e+06 3.025e+06 ... 2.799e+06 2.799e+06
spatial_ref int64 0
Attributes:
scale_factor: 1.0
add_offset: 0.0
long_name: elevation
>>> rx.open_rasterio('test.tif', masked=True).dtype
dtype('float64')

So far so good. I should note that there isn't really anything masked in this image, but I'm not sure that's relevant, i.e.:

>>> rx.open_rasterio('test.tif').values
array([[[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
...,
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0]]], dtype=int32)
>>> rx.open_rasterio('test.tif', masked=True).values
array([[[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
...,
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.],
[0., 0., 0., ..., 0., 0., 0.]]])

Now the clipping. If I load a shape and clip the unmasked raster to it, this is the result:

>>> rx.open_rasterio('test.tif').rio.clip(aoi.geometry)
<xarray.DataArray (band: 1, y: 7525, x: 5185)>
array([[[-2147483648, -2147483648, -2147483648, ..., -2147483648,
-2147483648, -2147483648],
[-2147483648, -2147483648, -2147483648, ..., -2147483648,
-2147483648, -2147483648],
[-2147483648, -2147483648, -2147483648, ..., -2147483648,
-2147483648, -2147483648],
...,
[-2147483648, -2147483648, -2147483648, ..., -2147483648,
-2147483648, -2147483648],
[-2147483648, -2147483648, -2147483648, ..., -2147483648,
-2147483648, -2147483648],
[-2147483648, -2147483648, -2147483648, ..., -2147483648,
-2147483648, -2147483648]]], dtype=int32)
Coordinates:
* band (band) int64 1
* x (x) float64 -1.419e+06 -1.419e+06 ... -1.264e+06 -1.264e+06
* y (y) float64 3.025e+06 3.025e+06 ... 2.799e+06 2.799e+06
spatial_ref int64 0
Attributes:
scale_factor: 1.0
add_offset: 0.0
long_name: elevation

I take the masked values as an artifact of the input raster being integer type. Not my preferred behavior, but OK. Now I clip the masked version:

>>> rx.open_rasterio('test.tif', masked=True).rio.clip(aoi.geometry)
<xarray.DataArray (band: 1, y: 7525, x: 5185)>
array([[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]]])
Coordinates:
* band (band) int64 1
* x (x) float64 -1.419e+06 -1.419e+06 ... -1.264e+06 -1.264e+06
* y (y) float64 3.025e+06 3.025e+06 ... 2.799e+06 2.799e+06
spatial_ref int64 0
Attributes:
scale_factor: 1.0
add_offset: 0.0
long_name: elevation

This is the correct output, or at least what I would expect to be the correct output. Masked areas are assigned NaN. Now the same clip with from_disk=True:

>>> rx.open_rasterio('test.tif', masked=True).rio.clip(aoi.geometry, from_disk=True)
<xarray.DataArray (band: 1, y: 7527, x: 5186)>
array([[[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
...,
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0],
[0, 0, 0, ..., 0, 0, 0]]], dtype=int32)
Coordinates:
* x (x) float64 -1.419e+06 -1.419e+06 ... -1.264e+06 -1.264e+06
* y (y) float64 3.025e+06 3.025e+06 ... 2.799e+06 2.799e+06
* band (band) int64 1
spatial_ref int64 0
Attributes:
scale_factor: 1.0
add_offset: 0.0
long_name: elevation

My interpretation of this is that

>>> rx.open_rasterio('test.tif', masked=True).astype('float64').rio.clip(aoi.geometry, from_disk=True)
<xarray.DataArray (band: 1, y: 7525, x: 5185)>
array([[[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]]])
Coordinates:
* band (band) int64 1
* x (x) float64 -1.419e+06 -1.419e+06 ... -1.264e+06 -1.264e+06
* y (y) float64 3.025e+06 3.025e+06 ... 2.799e+06 2.799e+06
spatial_ref int64 0
Attributes:
scale_factor: 1.0
add_offset: 0.0
long_name: elevation

If I cast the DataArray before the clip, the result matches the NaN-masked, cropped output from the clip without from_disk. What am I missing?
This is a really great analysis into the current behavior. Here are some answers that will hopefully provide some insight. Please follow up if you have further questions or I missed something.

If you use from_disk=True, you need to do it immediately after opening the raster. rio.clip falls back to from_disk=False if it cannot find the handle to the original raster. The handle is easily lost if you perform operations on the DataArray. The handle is likely being lost when you call astype after opening the raster and so it is falling back…

https://rasterio.readthedocs.io/en/latest/api/rasterio.mask.html "If there is no set nodata value for the raster, it defaults to 0." It is likely your raster does not have a nodata value set.