Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Protect sequential read #892

Closed
wants to merge 2 commits into from

Conversation

rbuffat
Copy link
Contributor

@rbuffat rbuffat commented Apr 24, 2020

I had a go to implement a check when a sequential read of an iterator is interrupted. (See #891) I'm not really satisfied with the result, but I think it could be used as a basis for discussion.

First some background from gdals documentation:

OGR_L_GetFeature:

Sequential reads (with OGR_L_GetNextFeature()) are generally considered interrupted by a OGR_L_GetFeature() call.

OGR_L_GetFeatureCount:

Note that some implementations of this method may alter the read cursor of the layer.

OGR_L_GetNextFeature is used in ogrext/Iterator. Iterator knows about the collection it was created from. Thus, the idea is to add a lock/flag to collection that is set if an iterator is started and released when he is finished. This flag is checked before OGR_L_GetFeature or OGR_L_GetFeatureCount is called. If a lock is set, a SequentialReadInterrupted exception is raised.

So far so good. But there are some issues with this approach. The first one is that an Iterator does not know when he is not used anymore. This is a problem for a membership test:

def test_membership_releases_lock(path_coutwildrnp_json):
    with fiona.open(path_coutwildrnp_json) as c:
        assert (0 in c)
        list(c)

Here 0 in c would not release the lock, as never a StopIteration condition is reached. This can be fixed by implementing contains in Iterator to ensure that the lock is released. But there might be a better way to do this.

I found another issue with OGR_L_GetFeatureCount:

path_gpx = "/home/rene/dev/Fiona/tests/data/test_gpx.gpx"
with fiona.open(path_gpx, "r", layer="track_points") as c:
    print(list(c))

Debug Output:

DEBUG:fiona.collection:Collection.__len__
DEBUG:fiona.ogrext:Session.get_length()
DEBUG:fiona.ogrext:length: -1
DEBUG:fiona.collection:Collection.__iter__
DEBUG:fiona.collection:lock sequential read
DEBUG:fiona.ogrext:Index: 0
DEBUG:fiona.collection:Collection.__len__
DEBUG:fiona.ogrext:Index: 1
DEBUG:fiona.ogrext:Index: 2
path_gpx = "/home/rene/dev/Fiona/tests/data/test_gpx.gpx"
with fiona.open(path_gpx, "r", layer="track_points") as c:
    list(c.__iter__())

Debug Output:

DEBUG:fiona.collection:Collection.__iter__
DEBUG:fiona.collection:lock sequential read
DEBUG:fiona.ogrext:Index: 0
DEBUG:fiona.ogrext:Index: 1
DEBUG:fiona.ogrext:Index: 2

I'm not familiar with the implementation of list() in Python. I assume in the first case collection has a len method and list() tries to use it to create a list more efficiently, while Iterator has no len. I'm not sure why list calls len the second time, though. The second len is a problem as the sequential read lock is already set. This can be circumvented by checking for the lock in len.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.1%) to 83.217% when pulling 645a6ac on rbuffat:try_lock_iteration into 929172f on Toblerity:maint-1.8.

@rbuffat
Copy link
Contributor Author

rbuffat commented May 7, 2020

#896 includes an outline of an alternative approach that does not require knowledge about the lifecycle of iterators.

E.g. for cases such as

def test_membership_releases_lock(path_coutwildrnp_json):
    with fiona.open(path_coutwildrnp_json) as c:
        assert (0 in c)
        list(c)

Where assert (0 in c) would create, but not release a lock, as it is not defined when an iterator is garbage collected (or not used anymore).

@rbuffat
Copy link
Contributor Author

rbuffat commented May 8, 2020

Closed in favor of #896

@rbuffat rbuffat closed this May 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants