GMT_DATASET.to_dataframe: Return an empty DataFrame if a file contains no data #3131
Currently, it returns an empty DataFrame with no columns and no rows, but an empty DataFrame that has columns and no rows is also allowed. I guess either is fine. I think we can use `return pd.DataFrame()` for now and make changes if necessary.
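To make the two options concrete, here is a minimal pandas sketch. The column names `x`, `y`, `z` are placeholders chosen for illustration and are not taken from the PR.

```python
import pandas as pd

# Option 1: no columns and no rows (what `return pd.DataFrame()` gives).
empty_plain = pd.DataFrame()

# Option 2: no rows, but with named columns.
# The names below are hypothetical, for illustration only.
empty_with_columns = pd.DataFrame(columns=["x", "y", "z"])

print(empty_plain.shape)         # (0, 0)
print(empty_with_columns.shape)  # (0, 3)

# Both count as "empty" as far as pandas is concerned.
print(empty_plain.empty, empty_with_columns.empty)  # True True
```

Either form satisfies `df.empty`, so the choice mostly matters to callers that inspect `df.columns`.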
I think we should return the `column_names`, so that users who want to, e.g., run `pd.concat` on multiple `pd.DataFrame` outputs from an algorithm like `pygmt.select` in a for-loop can do so more easily. Note that we should also set the dtypes of the columns properly even if there are no rows; otherwise every column's dtype becomes `object`. See my other suggestion at #3131 (comment) about not returning an empty `pd.DataFrame()` early, before the dtype is set with `df.astype(dtype)` below.
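A small sketch of the dtype concern, using plain pandas only. The `column_names` and `dtype` values here are hypothetical stand-ins, and the non-empty frame is hand-built rather than produced by `pygmt.select`.

```python
import pandas as pd

# Hypothetical column names and dtypes, for illustration only.
column_names = ["x", "y", "z"]
dtype = {"x": "float64", "y": "float64", "z": "int64"}

# An empty frame built from column names alone: every column is object.
untyped = pd.DataFrame(columns=column_names)
print(untyped.dtypes)

# Applying astype on the empty frame gives the columns their intended dtypes.
typed = untyped.astype(dtype)
print(typed.dtypes)

# Stand-in for a non-empty result from another iteration of a loop.
nonempty = pd.DataFrame({"x": [1.0], "y": [2.0], "z": [3]}).astype(dtype)

# With matching columns and dtypes, concatenation keeps the dtypes intact.
combined = pd.concat([typed, nonempty], ignore_index=True)
print(combined.dtypes)
```

Because the empty frame already carries the same columns and dtypes as the non-empty one, the loop's results can be concatenated without special-casing the empty iterations.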
For reference, GMT provides two special/undocumented modules, `read` and `write` (their source code is in `gmt/src/gmtread.c` / `gmt/src/gmtwrite.c`), that can read a file into a GMT object (e.g., reading a tabular file as GMT_DATASET, or reading a grid as GMT_GRID). Currently, we frequently use the special `read` module in the doctests of the `pygmt.clib.session` module (similar to lines 46-50 below). We may want to make it public in the future, as already done in GMT.jl (https://www.generic-mapping-tools.org/GMT.jl/dev/#GMT.gmtread-Tuple{String} and https://www.generic-mapping-tools.org/GMT.jl/dev/#GMT.gmtwrite).