BUG: reading with arrow returns empty GeoDataFrame when column referenced in where parameter is not included in results #388

Closed
brendan-ward opened this issue Apr 10, 2024 · 3 comments · Fixed by #391
Labels
bug Something isn't working

Comments

@brendan-ward
Member

Observed with GDAL 3.8.3 on MacOS

from pyogrio import read_dataframe

filename = "pyogrio/tests/fixtures/naturalearth_lowres/naturalearth_lowres.shp"
df = read_dataframe(
    filename, where=""" "iso_a3" = 'CAN' """, use_arrow=True, columns=[]
)

yields

Empty GeoDataFrame
Columns: [geometry]
Index: []

when it should have one record.

Unclear if this is an error on our side or in GDAL.

@brendan-ward
Member Author

Reported to GDAL #9655

@brendan-ward
Member Author

Per further tests in GDAL #9655, the GDAL Python bindings do not give the same results without the Arrow API as we get here: they return 0 features both with and without the Arrow API.

In contrast here:

df = read_dataframe(
    filename, where=""" "iso_a3" = 'CAN' """, columns=["name"]
)

returns

     name                                           geometry
0  Canada  MULTIPOLYGON (((-122.84000 49.00000, -122.9742...

This suggests a possible error on our end, though I'm not yet sure how we'd get into a state where GDAL expects no features and yet we return some.


Per GDAL #9664, we should update our docs to indicate that, when both the where and columns parameters are provided, filtering with where on a column that is not listed in columns is not recommended.
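One way a caller could follow that recommendation is to request the filter column as well and drop it after reading. A minimal sketch; `columns_for_read` is a hypothetical helper, not part of pyogrio, and the caller has to supply the names of the columns used in the where clause:

```python
def columns_for_read(columns, where_columns):
    """Return `columns` plus any filter columns missing from it, order preserved."""
    extra = [c for c in where_columns if c not in columns]
    return list(columns) + extra


requested = ["name"]
to_read = columns_for_read(requested, ["iso_a3"])  # ["name", "iso_a3"]

# With pyogrio installed, the read could then look like this (not executed here):
#   df = read_dataframe(filename, where=""" "iso_a3" = 'CAN' """, columns=to_read)
#   df = df[requested + ["geometry"]]  # drop the column that was only needed for filtering
```

This reads one extra column per filter field, which costs a little more I/O but keeps the filter and the requested columns consistent.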

@brendan-ward
Member Author

Found our bug: we were computing the set of ignored fields after narrowing the list of fields down to those in columns, which meant the set of ignored fields was always empty and we never passed any ignored fields to GDAL.

Fix forthcoming...
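The ordering problem can be illustrated with a small standalone sketch. The function names are hypothetical and this is not pyogrio's actual internal code; the field list is assumed to match the naturalearth_lowres fixture.

```python
# Field names assumed for illustration (naturalearth_lowres fixture)
ALL_FIELDS = ["pop_est", "continent", "name", "iso_a3", "gdp_md_est"]


def ignored_fields_buggy(all_fields, columns):
    # Narrow to the requested columns first ...
    fields = [f for f in all_fields if f in columns]
    # ... then derive ignored fields from the already narrowed list:
    # every remaining field is in `columns`, so nothing is ever ignored.
    return [f for f in fields if f not in columns]


def ignored_fields_fixed(all_fields, columns):
    # Derive ignored fields from the full field list before any narrowing.
    return [f for f in all_fields if f not in columns]


print(ignored_fields_buggy(ALL_FIELDS, ["name"]))  # []
print(ignored_fields_fixed(ALL_FIELDS, ["name"]))
# ['pop_est', 'continent', 'iso_a3', 'gdp_md_est']
```

With the buggy ordering no ignored fields reach GDAL, so GDAL still reads every field, which would explain why the where filter on iso_a3 matched a record in the non-Arrow read even though iso_a3 was not in columns.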
