Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ineffectual PR to add my commits messages back #562

Closed
wants to merge 11 commits into from

Conversation

matentzn
Copy link
Collaborator

@matentzn matentzn commented Nov 9, 2024

No description provided.

Warning: ChainedAssignmentError: A value is trying to be set on a copy of a DataFrame or Series through chained assignment using an inplace method.
When using the Copy-on-Write mode, such inplace method never works to update the original DataFrame or Series, because the intermediate object on which we are setting values always behaves as a copy.

I just to just not use "inplace" here and the warning does not happen.
Previously this warning happened:

FutureWarning: Downcasting behavior in `replace` is deprecated and will be removed in a future version. To retain the old behavior, explicitly call `result.infer_objects(copy=False)`. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`
  df.replace("", np.nan, inplace=True)

setting the option stops Pandas from even trying to downcast (so it no longer has anything to warn about). I then force the downcast explicitly with infer_objects to ensure that the behaviour remains what it was, but not sure this is what I want here.
df.infer_objects(copy=False) does not work in a platform independent manner; and I dont even know if I need it
As @gouttegd suggested:

> I would rather put that somewhere when the library is initialized, so that we can be sure the option is enabled everywhere at all times and not only when the code path goes through this particular function.

I moved that code to __init__.py hopeing that is the right place for it.
To ensure the old behaviour is retained, @gouttegd convinced me to not be lazy and actually add the "df.infer_objects" functionality when replacing empty values with nan's. Should github still exist when you read this, you can find his feedback here: https://github.com/mapping-commons/sssom-py/pull/561/files#r1835148927

As he questions, it is not clear this is _strictly needed_, but its probably safer to so if we dont know if its needed.
The rationale is to not force this global option on all users of the library. The main point for the fix is so that sssom-py's CLI print to stdout functionality is not compromised by this output!
As explained by @gouttegd in #561 (comment), we do not actually benefit from the small performance gain enabled by copy=False here, so we decided to drop it (which means a considerable simplification of the code as well)
@matentzn matentzn requested a review from gouttegd November 9, 2024 15:29
Copy link
Contributor

@gouttegd gouttegd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a weird maneuver. Approved if you want, but frankly not convinced of the usefulness. It only makes the history even more complicated.

@matentzn
Copy link
Collaborator Author

matentzn commented Nov 9, 2024

Better not make things even worse than I already did..

@matentzn matentzn closed this Nov 9, 2024
@matentzn matentzn deleted the rm-no-warnings-no-squash branch November 9, 2024 16:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants