Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs(python): Rewrite pivot and melt docstrings #13693

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

Wainberg
Copy link
Contributor

pivot and merge are among the more complicated dataframe manipulation functions, so I thought I'd improve the documentation on them.

@github-actions github-actions bot added documentation Improvements or additions to documentation python Related to Python Polars labels Jan 13, 2024
@stinodego stinodego changed the title docs(python): rewrite pivot and merge docs docs(python): Rewrite pivot and melt docstrings Jan 14, 2024
@Wainberg
Copy link
Contributor Author

@stinodego love to get your thoughts on this one. I guess this can be the first "chunk" of the documentation rewrite strategy we were envisioning.

@Wainberg
Copy link
Contributor Author

@stinodego I wonder if another core developer might prefer to take a look at this and future documentation PRs?

@MarcoGorelli
Copy link
Collaborator

i'll take a look, but it would be good to get #14048 in first

@Wainberg
Copy link
Contributor Author

Wainberg commented Feb 1, 2024

Thanks so much @MarcoGorelli! Feel free to edit my edits to reflect your new pivot behavior - you should have commit privileges on this branch.

Copy link
Member

@stinodego stinodego left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is good information in here, but it's much too verbose for a docstring. Docstrings should be a one-line summary, maybe two or three sentences of extended summary, and then some examples with a one-sentence explanation.

We actually have a melt and pivot section in the user guide, and there currently is not much information in there. The information would be a welcome addition there.

So I would propose to update the user guide sections and keep only the most essential information the docstring. I can have another look then.

@Wainberg
Copy link
Contributor Author

@stinodego Good idea! I moved most of the info to the user guide. I also updated my changes to account for @MarcoGorelli's fixes in #14048. Note that my changes to the user guide will need to be updated for 1.0 to account for #14491.

@Wainberg
Copy link
Contributor Author

Wainberg commented Mar 3, 2024

@MarcoGorelli looks good? Once you're happy with it, I'll update to account for #14477 and fix the merge conflict.

@MarcoGorelli
Copy link
Collaborator

Overall, I think this is an improvement - in particular, the examples you provide for pivot make for easier teaching material than the current ones. Nice one @Wainberg 🙌 ! If you fetch and merge upstream/main then I'll check all the details

@Wainberg
Copy link
Contributor Author

Wainberg commented Mar 5, 2024

@MarcoGorelli thanks so much! ready for you to take a look :)

Copy link

codecov bot commented Mar 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.00%. Comparing base (baacf3d) to head (1ad84bf).
Report is 29 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #13693      +/-   ##
==========================================
+ Coverage   80.94%   81.00%   +0.05%     
==========================================
  Files        1327     1331       +4     
  Lines      172081   172786     +705     
  Branches     2453     2456       +3     
==========================================
+ Hits       139290   139964     +674     
- Misses      32320    32354      +34     
+ Partials      471      468       -3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only really got 3 minor comments, it's clear you've spent quite some time looking into pivot and melt. LMK if you don't have bandwidth to address them

Thanks for doing this, it'll help users!

Comment on lines 26 to 28
Thus, if there are `N` unique values in the `columns` column, there will be
`N + 1` columns in the pivoted `DataFrame`: one for the row names, the
remaining `N` for the values.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is only true if there's a single index column? Which is fine, you explain what happens with multiple index columns below, I just think it needs caveating here

know these unique values in advance, you can perform a "lazy pivot", as shown in
the "Examples" section below.

Warnings
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "Notes" rather than "Warnings"?

{'min', 'max', 'first', 'last', 'sum', 'mean', 'median', 'len'}
- An expression to do the aggregation.
A function to aggregate multiple `values` with the same `index` and
`columns` prior to pivoting, equivalent to using :func:`group_by` as a
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does group_by render correctly here? you may need to use

:meth:`DataFrame.group_by`

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

py-polars/polars/dataframe/frame.py Show resolved Hide resolved
@Wainberg
Copy link
Contributor Author

Wainberg commented Mar 7, 2024

Thanks so much @MarcoGorelli! I incorporated your suggestions.

I'm glad you appreciate the amount of work that went into this. I put a similar amount of care into #13402; it would be great if at least some of these changes could be incorporated at some point.

Copy link
Collaborator

@MarcoGorelli MarcoGorelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made some little fixes, as there were some things which still needed sorting out

In particular, the last example from pivot didn't render

With the regards to the other PR - may I offer some unsolicited advice?

  • as you can see, reviews generally take a lot of effort. It's easy to miss things. That's why you're being asked to only focus on a single topic per PR, reviewing 74 docs changes in one go is practically impossible
  • I'd really advise against tagging so many people in one message...that would only really be OK if you were reporting an absolutely critical issue which simply couldn't wait...
  • I'd suggest merging conflicts before asking for reviews, else content can get out of date (and another review will be needed after you've merged conflicts anyway)

Having said that, I really appreciate the work you've done here!

docs/user-guide/transformations/pivot.md Outdated Show resolved Hide resolved
docs/user-guide/transformations/pivot.md Show resolved Hide resolved
py-polars/polars/dataframe/frame.py Outdated Show resolved Hide resolved
py-polars/polars/dataframe/frame.py Outdated Show resolved Hide resolved
py-polars/polars/lazyframe/frame.py Outdated Show resolved Hide resolved
@Wainberg
Copy link
Contributor Author

Wainberg commented Mar 7, 2024

@MarcoGorelli thanks for the edits!

Re: the other PR, it would probably be best at this point if the core developers take that PR as a general guideline for things that could be improved in the docs, and manually incorporate whichever changes they like on their own timeframe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation python Related to Python Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants