Hyo-Kyung Lee edited this page Jan 11, 2023 · 20 revisions

DMR++ Generation

We used Hyrax Docker to generate DMR++.

build_dmrpp_metadata Attribute

DMR++ adds an extra attribute, build_dmrpp_metadata, that the original HDF5 file doesn't have. Its type is Container, a DMR++ attribute type that Kerchunk doesn't support:

    <Attribute name="build_dmrpp_metadata" type="Container">
        <Attribute name="build_dmrpp" type="String">
            <Value>3.20.13-240</Value>
        </Attribute>
        <Attribute name="bes" type="String">
            <Value>3.20.13-240</Value>
        </Attribute>
        <Attribute name="libdap" type="String">
            <Value>libdap-3.20.11-74</Value>
        </Attribute>
    </Attribute>
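
Since Kerchunk rejects Container attributes, one possible workaround is to strip them from the DMR++ document before handing it to other tools. A minimal sketch using only the standard library (the element layout follows the snippet above; real DMR++ documents namespace their tags, which the `endswith` check tolerates):

```python
import xml.etree.ElementTree as ET

def drop_container_attributes(dmrpp_xml: str) -> str:
    """Remove every <Attribute type="Container"> element from a DMR++ document."""
    root = ET.fromstring(dmrpp_xml)
    # Walk every parent node and drop Container-typed Attribute children.
    for parent in root.iter():
        for child in list(parent):
            if child.tag.endswith("Attribute") and child.get("type") == "Container":
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")
```

This drops the whole container, including its nested String attributes, so the result contains no attribute types that Kerchunk can't represent.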

pydap and DAP4

pydap supports a new dap4://opendap_url scheme. However, pydap appears completely broken for DAP4.

Bogus DAP2 Request

A DAP4 request triggers a bogus DAP2 request internally. Both SMAP and SWOT cause a 500 error because they contain Int64 datasets, which have no DAP2 representation.

webob.exc.HTTPError: 500 Internal Server Error
Error { 
    code = 500;
    message = "An internal error was encountered in D4Attributes.cc at line 297:
Unable to convert DAP4 attribute to DAP2. There is no accepted DAP2 representation of Int64.
Please report this to support@opendap.org";
}
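
Because DAP2 has no 64-bit integer types, any variable typed Int64 or UInt64 (and DAP4's signed Int8) will trigger this error. A small stdlib sketch that scans a DMR document for such variables before any DAP2-style request is made (tag names follow the DAP4 DMR format):

```python
import xml.etree.ElementTree as ET

# DAP4 integer types with no DAP2 equivalent (DAP2 has no 64-bit integers,
# and its only 8-bit type is the unsigned Byte).
DAP4_ONLY_TYPES = {"Int64", "UInt64", "Int8"}

def find_dap2_incompatible(dmr_xml: str):
    """Return names of variables whose DAP4 type cannot be expressed in DAP2."""
    root = ET.fromstring(dmr_xml)
    hits = []
    for elem in root.iter():
        local = elem.tag.rsplit("}", 1)[-1]  # strip any XML namespace
        if local in DAP4_ONLY_TYPES and elem.get("name"):
            hits.append(elem.get("name"))
    return hits
```

Running this against the SMAP or SWOT DMR would flag the Int64 variables responsible for the 500 error above.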

Large Dataset with Unlimited Dimension

Reading the variable analysed_sst through dap4:// hangs for the 20020602090000-JPL-L4_GHRSST-SSTfnd-MUR-GLOB-v02.0-fv04.1.h5.dmrpp dataset. The variable has shape 1x17999x36000. Hyrax itself produces a 4.5 GB CSV without any issue, although it takes 10+ minutes.
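
A possible workaround, instead of reading the full 1x17999x36000 array in one request, is to ask the server for a hyperslab using a DAP4 constraint expression (`dap4.ce`). The sketch below only builds the URL; the host and path are placeholders, and the inclusive `[start:stop]` index syntax follows the DAP4 specification:

```python
def dap4_subset_url(base_url: str, var: str, slices) -> str:
    """Build a DAP4 URL requesting only a hyperslab of one variable.

    `slices` is a list of (start, stop) index pairs, one per dimension;
    DAP4 ranges are inclusive on both ends.
    """
    ce = var + "".join(f"[{a}:{b}]" for a, b in slices)
    return f"{base_url}?dap4.ce=/{ce}"

# Hypothetical example: first time step, a 1000x1000 spatial tile.
url = dap4_subset_url(
    "https://example.org/opendap/mur.h5.dmrpp",
    "analysed_sst",
    [(0, 0), (0, 999), (0, 999)],
)
```

Fetching the data tile by tile this way sidesteps the single huge transfer that appears to hang pydap.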

No String-type datasets are available. The affected files and their String datasets:

3A-MO.GPM.GMI.GRID2014R1.20140601-S000000-E235959.06.V03A.h5
/InputAlgorithmVersions
/InputFileNames
/InputGenerationDateTimes
AMSR_2_L3_DailySnow_P00_20160831.he5
/HDFEOS_INFORMATION/CoreMetadata_0
/HDFEOS_INFORMATION/StructMetadata_0
ATL08_20181014084920_02400109_003_01.h5
/ancillary_data/control
/ancillary_data/data_end_utc
/ancillary_data/data_start_utc
/ancillary_data/granule_end_utc
/ancillary_data/granule_start_utc
/ancillary_data/release
/ancillary_data/version
GLDAS_NOAH025_3H.A20210101.0000.021.nc4.h5
OMI-Aura_L2-OMNO2_2016m0215t0210-o61626_v003-2016m0215t200753.he5
/HDFEOS_INFORMATION/ArchivedMetadata_0
/HDFEOS_INFORMATION/CoreMetadata_0
/HDFEOS_INFORMATION/StructMetadata_0
SMAP_L3_SM_P_20150406_R14010_001.h5
SWOT_L2_HR_PIXC_007_483_235R_20220821T102608_20220821T102618_Dx0000_01.nc.h5
VNP09A1.A2015257.h29v11.001.2016221164845.h5
/HDFEOS_INFORMATION/StructMetadata_0
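
Lists like the one above can be produced ahead of time by scanning a DMR++ document for `<String>` variable elements. A stdlib sketch (it matches variable tags only, not `type="String"` attributes, and tolerates namespaced tags):

```python
import xml.etree.ElementTree as ET

def list_string_variables(dmrpp_xml: str):
    """Collect names of String-typed variables in a DMR/DMR++ document."""
    root = ET.fromstring(dmrpp_xml)
    names = []
    for elem in root.iter():
        if elem.tag.rsplit("}", 1)[-1] == "String" and elem.get("name"):
            names.append(elem.get("name"))
    return names
```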

Dimension as Dataset

pydap generates an extra dataset for each DMR++ <Dimension> element, even though the DDS has no such dataset.

AMSR_2_L3_DailySnow_P00_20160831.he5.h5.dmrpp

/HDFEOS/GRIDS/Northern_Hemisphere/XDim
Traceback (most recent call last):
  File "/Users/hyoklee/miniconda3/lib/python3.9/site-packages/pydap/model.py", line 239, in __getattr__
    return self.attributes[attr]
KeyError: '_is_string_dtype'

GLDAS_NOAH025_3H.A20210101.0000.021.nc4.h5.dmrpp

bnds
Traceback (most recent call last):
  File "/Users/hyoklee/miniconda3/lib/python3.9/site-packages/pydap/model.py", line 239, in __getattr__
    return self.attributes[attr]
KeyError: '_is_string_dtype'

OMI-Aura_L2-OMNO2_2016m0215t0210-o61626_v003-2016m0215t200753.he5.h5.dmrpp

/HDFEOS/SWATHS/ColumnAmountNO2/nXtrack
Traceback (most recent call last):
  File "/Users/hyoklee/miniconda3/lib/python3.9/site-packages/pydap/model.py", line 239, in __getattr__
    return self.attributes[attr]
KeyError: '_is_string_dtype'
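
A client could skip these spurious datasets by finding names that appear only as `<Dimension>` declarations and never as variables of the same name (dimensions that do have a same-named variable are legitimate coordinate variables). A stdlib sketch over the DMR, with a deliberately partial list of DAP4 variable tags:

```python
import xml.etree.ElementTree as ET

# Common DAP4 atomic variable tags (not exhaustive).
VAR_TAGS = {"Byte", "Int8", "UInt8", "Int16", "UInt16", "Int32", "UInt32",
            "Int64", "UInt64", "Float32", "Float64", "String"}

def dimension_only_names(dmr_xml: str):
    """Names declared as <Dimension> with no variable of the same name.

    pydap turns these into extra datasets; a client can skip them.
    """
    root = ET.fromstring(dmr_xml)
    dims, variables = set(), set()
    for elem in root.iter():
        local = elem.tag.rsplit("}", 1)[-1]
        if local == "Dimension":
            dims.add(elem.get("name"))
        elif local in VAR_TAGS:
            variables.add(elem.get("name"))
    return sorted(dims - variables)
```

For the GLDAS file above, this would report `bnds` as dimension-only, matching the KeyError pydap raises for it.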

pydap and DAP2

pydap can't handle the 3D dataset analysed_sst, which has an unlimited dimension:

ValueError: buffer size must be a multiple of element size

Future Work

Implement a pure, robust Python DMR++ reader. Users could then read HDF5 data directly from AWS S3 through the DMR++ reader, like Kerchunk, without relying on h5py.

Then, implement a DMR++ handler for the pydap server in Python using that DMR++ reader.
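
A starting point for such a reader is extracting the chunk byte ranges that DMR++ records for each variable. The sketch below assumes the `<dmrpp:chunk offset="..." nBytes="...">` layout produced by Hyrax's build_dmrpp (treat the attribute names as assumptions); with these ranges, each chunk could be fetched from S3 with an HTTP Range request, no h5py required:

```python
import xml.etree.ElementTree as ET

def chunk_byte_ranges(dmrpp_xml: str):
    """Extract (offset, nbytes) byte ranges from <dmrpp:chunk> elements.

    Each range maps to an HTTP Range request of the form
    bytes=offset..offset+nbytes-1 against the HDF5 object in S3.
    """
    root = ET.fromstring(dmrpp_xml)
    ranges = []
    for elem in root.iter():
        if elem.tag.rsplit("}", 1)[-1] == "chunk":
            ranges.append((int(elem.get("offset")), int(elem.get("nBytes"))))
    return ranges
```

Decompressing and assembling the fetched chunks into arrays is the remaining (and larger) part of the reader.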