
Chunkinfo #234

Merged
merged 32 commits into master on Jul 12, 2023

Conversation

@jreadey (Member) commented Jun 6, 2023

Added arange chunk initializer.
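
For context, a minimal sketch of what an arange-style chunk initializer can look like (the function name and the 1-D layout are illustrative assumptions, not the PR's actual code):

import numpy as np

def arange_chunk_init(chunk_index, chunk_size, start=0, step=1):
    # each chunk continues the global arange sequence from where
    # the previous chunk left off
    offset = chunk_index * chunk_size
    return np.arange(start + offset * step,
                     start + (offset + chunk_size) * step,
                     step)

# e.g. chunk 2 with chunk_size 4 -> array([8, 9, 10, 11])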

@jreadey requested a review from mattjala June 6, 2023 19:19
mattjala previously approved these changes Jun 7, 2023

@mattjala (Contributor) left a comment:

LGTM

from .util.hdf5dtype import createDataType, getItemSize

from . import config
from . import hsds_logger as log

# supported initializer commands (just one at the moment)
mattjala (Contributor):

More than one initializer command is now supported

jreadey (Member Author):

updated

chunk_index = getChunkIndex(chunk_id)
chunk_index = chunk_index[0]
log.debug(f"chunk_index: {chunk_index}")
for j in range(table_factor):
mattjala (Contributor):

I think this is redundant, since table_factor is always a scalar.

jreadey (Member Author):

I don't follow - we want to iterate through each hyperchunk for each hsds chunk...
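
For illustration, the chunk-to-hyperchunk mapping being described might look like this (the helper name and the contiguous layout are assumptions, not the PR's code):

def hyperchunk_indices(chunk_index, table_factor):
    # one HSDS chunk aggregates table_factor hyperchunks; assuming a
    # contiguous layout, their indices follow from the HSDS chunk index
    return [chunk_index * table_factor + j for j in range(table_factor)]

# e.g. HSDS chunk 2 with table_factor 4 -> hyperchunks [8, 9, 10, 11]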

@@ -829,6 +956,7 @@ async def get_chunk(

# validate arguments
if s3path:
"""
mattjala (Contributor):

Why remove this validation?

jreadey (Member Author):

Looks like this is not relevant after latest code changes.

chunk_extent = chunk_dims[dim]

if dset_extent > 0 and chunk_extent > 0:
    table_extent = -(dset_extent // -chunk_extent)
mattjala (Contributor):

Why the double negative here - wouldn't the result be the same if both negatives were removed?

jreadey (Member Author):

No, this is a little trick to compute the integer ceiling.
E.g.: -(8 // -3) -> 3, whereas 8 // 3 -> 2

mattjala (Contributor):

Ah, I see. Is there a reason not to use int(math.ceil(dset_extent // chunk_extent))? Other parts of HSDS already import math.
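
A quick check of both forms (note the math.ceil variant needs true division /; with // the quotient is already floored before ceil ever sees it):

import math

a, b = 8, 3

assert a // b == 2             # floor division rounds down...
assert -(a // -b) == 3         # ...negating twice gives the ceiling
assert math.ceil(a / b) == 3   # equivalent, but goes through a float
assert math.ceil(a // b) == 2  # // inside ceil floors first - a bug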


def _find_min_pair(h5chunks, max_gap=None):
""" given a dict of chunk_map entries, return the two
chunks nearest to each other in the file.
mattjala (Contributor):

return the two chunks -> return the indices of the two chunks
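
For illustration, a sketch of the behavior under discussion (not the PR's implementation; the s3offset/s3size fields on each entry are assumptions based on how chunk locations are described elsewhere in this conversation):

def find_min_pair_sketch(h5chunks, max_gap=None):
    # sort entries by their offset in the file
    items = sorted(h5chunks.items(), key=lambda kv: kv[1]["s3offset"])
    min_pair = None
    min_gap = None
    # only entries adjacent in file order can be the nearest pair
    for (i1, e1), (i2, e2) in zip(items, items[1:]):
        gap = e2["s3offset"] - (e1["s3offset"] + e1["s3size"])
        if max_gap is not None and gap > max_gap:
            continue  # ignore pairs separated by more than max_gap
        if min_gap is None or gap < min_gap:
            min_gap, min_pair = gap, (i1, i2)
    return min_pair  # indices of the two nearest chunks, or None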

@@ -366,8 +366,8 @@ async def GET_Chunk(request):
         num_bytes = s3size
     else:
         # list
-        num_bytes = np.prod(s3size)
+        num_bytes = np.sum(s3size)
     log.debug(f"reading {num_bytes} from {s3path}")
mattjala (Contributor):

Why the sum instead of product? If s3size is a list, isn't each entry in the list a dimension describing the size of the data in s3?

jreadey (Member Author):

No, it should be sum. In this case s3size is a colon-separated list of range lengths, so the sum is the total number of bytes to read.
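
With illustrative values, the difference between the two reductions:

import numpy as np

s3size = [512, 256, 1024]   # range lengths in bytes, one per range

num_bytes = np.sum(s3size)  # 1792: total bytes to read
# np.prod(s3size) would treat the entries as array dimensions
# (512 * 256 * 1024 = 134217728), hence the change to sum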

mattjala previously approved these changes Jul 10, 2023
jreadey merged commit 6197314 into master Jul 12, 2023
10 checks passed