
Add RFC101 text: Raster dataset read-only thread-safety #10676

Merged · 9 commits · Sep 19, 2024

Conversation

@rouault (Member) commented Aug 29, 2024

@rouault rouault marked this pull request as draft August 29, 2024 11:29
@rouault rouault force-pushed the rfc101_text branch 2 times, most recently from 74d3915 to f7a0fba Compare August 29, 2024 12:00
@elpaso (Collaborator) left a comment

LGTM, I am just wondering whether this approach would play well with a "native" thread-safe driver (assuming that in the future we will have one). Would it be possible for GDALCreateThreadSafeDataset to return the original poDS in that case, without cloning?

.. code-block:: c

    bool GDALDatasetIsThreadSafe(GDALDatasetH hDS, int nScopeFlags,
                                 CSLConstList papszOptions);
Collaborator:

Why are the options required here and not by the C++ version?

Member Author:

The C++ version can easily be extended in a backwards-compatible way by adding an optional parameter if we need it, but adding a parameter to the C function would break the C API, so this provision avoids having to introduce a GDALDatasetIsThreadSafeEx() in the future.
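As a usage-level sketch (not part of the RFC text itself), a caller would simply pass nullptr for the reserved options argument today; GDAL_OF_RASTER as the scope flag is an assumption of this sketch:

```cpp
#include "gdal.h"

// Minimal sketch: query whether an already-opened dataset advertises
// read-only thread-safety for its raster scope. papszOptions is the
// reserved slot discussed above; it is simply passed as nullptr for now.
bool IsRasterThreadSafe(GDALDatasetH hDS)
{
    return GDALDatasetIsThreadSafe(hDS, GDAL_OF_RASTER,
                                   /* papszOptions = */ nullptr);
}
```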

source dataset can be re-opened by its name (GetDescription()), which is the
case for datasets opened by GDALOpenEx(). A special implementation is also
made for dataset instances of the MEM driver. But, there is currently no
support for creating a thread-safe dataset wrapper on on-the-fly datasets
Collaborator:

Can you please elaborate on the difference between MEM and on-the-fly (or give a pointer to the docs)?

Member Author:

I've added "(e.g. GDALTranslate() or GDALWarp() with VRT as the output driver and with an empty filename, or a custom GDALDataset implementation by external code)."
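To make the MEM vs on-the-fly distinction concrete, a rough sketch of the second case (assuming a valid, already-opened source dataset): GDALTranslate() with the VRT output driver and an empty target filename produces a dataset that exists only as in-memory objects and has an empty description, so it cannot be re-opened by name for per-thread clones:

```cpp
#include "gdal.h"
#include "gdal_utils.h"

// Sketch of an "on-the-fly" dataset: no file behind it, empty description.
GDALDatasetH MakeOnTheFlyDataset(GDALDatasetH hSrc)
{
    char *apszArgv[] = {const_cast<char *>("-of"),
                        const_cast<char *>("VRT"), nullptr};
    GDALTranslateOptions *psOptions = GDALTranslateOptionsNew(apszArgv, nullptr);
    // Empty destination name: the result cannot be re-opened via
    // GetDescription(), hence no thread-safe wrapper can be created for it.
    GDALDatasetH hOut = GDALTranslate("", hSrc, psOptions, nullptr);
    GDALTranslateOptionsFree(psOptions);
    return hOut;
}
```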

``std::unique_ptr<GDALRasterBand> GDALCreateThreadSafeRasterBand(GDALRasterBand* poBand, int nOpenFlags, bool bForce)``
function that would try to use GDALCreateThreadSafeDataset() internally if it
manages to identify the dataset to which the band belongs, and otherwise would
fall back to protecting calls to the wrapped band with a mutex.
Collaborator:

This seems like a good idea to me, provided that there is no significant performance degradation.

Member Author:

Due to the mutex, there would definitely be serious performance degradation if most of the time is spent reading the dataset. The only point would be ease of implementation for users who don't want to implement both a multi-thread-efficient code path and a serialized code path. For some algorithms, input pixel acquisition might not be that costly compared to the other processing done, so it could still be worthwhile. This really depends on the use case.
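For illustration, a rough sketch of what such a mutex fallback could look like (the class name is hypothetical and only IReadBlock is forwarded); every access is serialized behind the lock, which is why it cannot scale with the number of reading threads:

```cpp
#include <mutex>
#include "gdal_priv.h"

// Hypothetical fallback wrapper: all accesses to the underlying band are
// serialized behind a single mutex, so concurrent readers queue up.
class MutexProtectedRasterBand final : public GDALRasterBand
{
    GDALRasterBand *m_poUnderlying;
    std::mutex m_oMutex;

  public:
    explicit MutexProtectedRasterBand(GDALRasterBand *poBand)
        : m_poUnderlying(poBand)
    {
        nRasterXSize = poBand->GetXSize();
        nRasterYSize = poBand->GetYSize();
        eDataType = poBand->GetRasterDataType();
        poBand->GetBlockSize(&nBlockXSize, &nBlockYSize);
    }

  protected:
    CPLErr IReadBlock(int nXBlock, int nYBlock, void *pData) override
    {
        std::lock_guard<std::mutex> oLock(m_oMutex);
        return m_poUnderlying->ReadBlock(nXBlock, nYBlock, pData);
    }
};
```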

@rouault (Member Author) commented Sep 2, 2024

Would it be possible for GDALCreateThreadSafeDataset to return the original poDS in that case without cloning?

For std::unique_ptr<GDALDataset> GDALCreateThreadSafeDataset(std::unique_ptr<GDALDataset> poDS, int nScopeFlags);, yes, this could be trivially done by just returning poDS.

For std::unique_ptr<GDALDataset> GDALCreateThreadSafeDataset(GDALDataset* poDS, int nScopeFlags);, there would be an issue regarding reference counting, since this signature assumes the returned dataset is not the input one. It would need to be changed to GDALDataset* GDALCreateThreadSafeDataset(GDALDataset* poDS, int nScopeFlags);, still increase the reference count when returning the input itself, and the user would have to call ReleaseRef() on both the input and output datasets. But that makes sense, and it is more symmetric.
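With that change, the calling pattern would roughly be as follows (a sketch only; GDAL_OF_RASTER as the scope flag is an assumption):

```cpp
#include "gdal_priv.h"

// Sketch of the symmetric pattern described above: the returned dataset may
// be the input itself (natively thread-safe driver) or a wrapper; in both
// cases the caller releases both references.
void UseThreadSafeView(GDALDataset *poDS)
{
    GDALDataset *poTS = GDALCreateThreadSafeDataset(poDS, GDAL_OF_RASTER);
    if (poTS)
    {
        // ... hand poTS to worker threads for read-only raster access ...
        poTS->ReleaseRef();
    }
    poDS->ReleaseRef();
}
```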

@rouault (Member Author) commented Sep 2, 2024

It would need to be changed to GDALDataset* GDALCreateThreadSafeDataset(GDALDataset* poDS, int nScopeFlags);

changed in RFC text and candidate implementation

…set *poDS, int nScopeFlags) to return a GDALDataset*
@lnicola (Contributor) commented Sep 6, 2024

This looks pretty good! I don't really like the implicit opening of the inner datasets on access, but it does make the API much easier to use compared to explicit opening.

It might be worth adding a line or two with the rationale for this (unless I've missed it), for anyone reading this in the future.

* Even updating a reduced number of drivers would be extremely difficult, in
particular the GeoTIFF one, due to the underlying library not being reentrant,
and deferred loading strategies and many state variables being modified even
by read-only APIs. And this applies to most typical drivers.
Contributor:

Would it be feasible for Zarr?

Member Author:

Would it be feasible for Zarr?

There are quite a number of state variables. For example, opening a group or array adds entries to maps. Even reading pixels in an array involves some caches. So, likely much less difficult than GeoTIFF, but still not a trivial effort (and it's really hard to convince oneself that mutexes have been acquired in all the places where they should be).

@rouault (Member Author) commented Sep 6, 2024

It might be worth adding a line or two with the rationale for this (unless I've missed it), for anyone reading this in the future.

I believe the rationale is given in the "design discussion" section, where the "natural" alternative of making the drivers themselves thread-safe is mentioned and considered unfeasible for pretty much all drivers. Hence, unless you protect all calls to the "master" dataset with a mutex (which doesn't scale), there isn't much choice other than re-opening them.

@lnicola (Contributor) commented Sep 6, 2024

where the "natural" alternative of making the drivers them thread-selves

Sorry, I meant the alternate design where you know you have N threads and call a hypothetical GDALGetThreadSafeDataset N times, instead of doing it lazily, on the first access. This would make it unnecessary to use thread-local storage and maintain the LRU mentioned in the RFC.

Of course, that would make it much harder to use with automatic parallelization (parallel_for) libraries.


vs

$ time multireadtest -thread_safe -t 4 -i 100 4096x4096.tif
Contributor:

I'm curious why this is faster, since it's doing pretty much the same thing.

Perhaps it's the effect of removing the CPLSleep call?

Member Author:

Perhaps it's the effect of removing the CPLSleep call?

The removal of the CPLSleep call applies to the "multireadtest -t 4" case too. But just retrying the time measurement, I can see there is significant variation between runs, and sometimes classic is faster, sometimes the two are pretty similar, so the above conclusion is not super relevant.

@rouault (Member Author) Sep 6, 2024

changed to "But on a 4096x4096 raster with a number of iterations reduced to 100, the timings between the default and thread_safe modes are very similar."

@rouault (Member Author) commented Sep 6, 2024

Sorry, I meant the alternate design where you know you have N threads and call a hypothetical GDALGetThreadSafeDataset N times, instead of doing it lazily, on the first access. This would make it unnecessary to use thread-local storage and maintain the LRU mentioned in the RFC.

isn't that what is mentioned in the motivation section: "Currently, given a GDALDataset instance is not thread-safe, this requires either to deal with I/O in a single thread, or through a mutex to protect against concurrent use, or one needs to open a separate GDALDataset for each worker thread." ?
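That per-worker-thread option quoted above looks roughly like this in practice (a sketch only; it assumes drivers are already registered and uses placeholder names):

```cpp
#include <thread>
#include <vector>
#include "gdal_priv.h"

// Sketch of the "open a separate GDALDataset for each worker thread" option:
// each thread opens and closes its own read-only dataset instance.
void ProcessWithPerThreadDatasets(const char *pszFilename, int nThreads)
{
    std::vector<std::thread> aoThreads;
    for (int i = 0; i < nThreads; ++i)
    {
        aoThreads.emplace_back(
            [pszFilename]()
            {
                GDALDataset *poDS = GDALDataset::Open(
                    pszFilename, GDAL_OF_RASTER | GDAL_OF_READONLY);
                if (!poDS)
                    return;
                // ... read-only work on poDS for this thread ...
                GDALClose(poDS);
            });
    }
    for (auto &oThread : aoThreads)
        oThread.join();
}
```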

@rouault (Member Author) commented Sep 6, 2024

I'm wondering if GDALGetThreadSafeDataset wouldn't be a better name than GDALCreateThreadSafeDataset? For two reasons:

  • "create" could somehow imply we create/write a file
  • and when it is called on an already thread-safe dataset, it returns itself, so it doesn't create anything at all, even from a C++ object instantiation point of view

@rouault (Member Author) commented Sep 7, 2024

I'm wondering if GDALGetThreadSafeDataset wouldn't be a better name than GDALCreateThreadSafeDataset

I've opted for GDALGetThreadSafeDataset
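At the usage level, the renamed entry point with the unique_ptr signature quoted earlier in the thread would look roughly like this (a sketch; the GDAL_OF_RASTER scope flag and the open flags are assumptions):

```cpp
#include <memory>
#include "gdal_priv.h"

// Sketch: open a raster read-only and wrap it into a thread-safe dataset
// using the final GDALGetThreadSafeDataset name discussed above.
std::unique_ptr<GDALDataset> OpenThreadSafe(const char *pszFilename)
{
    auto poDS = std::unique_ptr<GDALDataset>(
        GDALDataset::Open(pszFilename, GDAL_OF_RASTER | GDAL_OF_READONLY));
    if (!poDS)
        return nullptr;
    return GDALGetThreadSafeDataset(std::move(poDS), GDAL_OF_RASTER);
}
```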

@rouault rouault marked this pull request as ready for review September 19, 2024 20:54
@rouault rouault merged commit 817899c into OSGeo:master Sep 19, 2024
1 of 2 checks passed
3 participants