Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(python): Right-align numerical headers in write_excel #17877

Closed
wants to merge 2 commits into from

Conversation

mcrumiller
Copy link
Contributor

Resolves #17852.

@alexander-beedie the logic is the following:

  • If a header is specified with an alignment, no logic is applied (respect user header format).
  • If a header is specified without an alignment, or no header logic is specified, then numerical columns are right-aligned.
  • If the table autofilter is enabled, then an indent is applied so that the headers are not hidden.

Here is my test to ensure it's working:

import polars as pl

data = {
    "id": ["a123", "b345", "c567", "d789"],
    "a": [99, 45, 50, 85],
    "b": [1.2, -3.4, 5.6, 7.8],
    "c": [1.2, -3.4, 5.6, 7.8],
    "d": [1.2, -3.4, 5.6, 7.8],
}

column_formats = {
    "b": {
        "font": "Consolas",
        "align": "center",
        "num_format": "0.000;-0.000;0",
        "valign": "top",
    },
    "c": {
        "num_format": "$#,##0.00;[red]-$#,##0.00;-",
        "font": "Consolas",
        "font_size": 10,
        "valign": "top",
    },
    "d": {
        "num_format": "$#,##0.00;[red]-$#,##0.00;-",
        "font": "Consolas",
        "font_size": 10,
        "valign": "top",
    },
}

header_format = {
    "bg_color": "#FFFFEE",
    "font_color": "blue",
}

# Write header with specified alignment. Header should not include new logic.
header_format["align"] = "left"
pl.DataFrame(data).write_excel(
    "output_left_align.xlsx",
    table_style={"style": "Table Style Medium 15"},
    column_formats=column_formats,
    header_format=header_format,
)

# Write header with no specified alignment. Header should right-align with indent.
header_format.pop("align")
pl.DataFrame(data).write_excel(
    "output_no_align.xlsx",
    table_style={"style": "Table Style Medium 15"},
    column_formats=column_formats,
    header_format=header_format,
)

# Write header no header formatting. Header should right-align with indent.
header_format["align"] = "left"
pl.DataFrame(data).write_excel(
    "output_no_fmt.xlsx",
    table_style={"style": "Table Style Medium 15"},
    column_formats=column_formats,
    header_format=None,
)

# Write with no formatting and autofilter off. Header should right-align without an indent.
header_format["align"] = "left"
pl.DataFrame(data).write_excel(
    "output_no_filter.xlsx",
    table_style={"style": "Table Style Medium 15"},
    column_formats=column_formats,
    header_format=None,
    autofilter=False,
)

Output:

Header with specified alignment. Header should not include new logic.

image

Header with no specified alignment. Header should right-align with indent.

image

No header formatting specified. Header should right-align with indent.

image

No header formatting and autofilter off. Header should right-align without an indent.

image

@github-actions github-actions bot added enhancement New feature or an improvement of an existing feature python Related to Python Polars labels Jul 25, 2024
@mcrumiller mcrumiller marked this pull request as ready for review July 25, 2024 17:55
Copy link

codecov bot commented Jul 25, 2024

Codecov Report

Attention: Patch coverage is 90.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 80.48%. Comparing base (2a6ebec) to head (c7cc745).

Files Patch % Lines
py-polars/polars/io/spreadsheet/_write_utils.py 90.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main   #17877   +/-   ##
=======================================
  Coverage   80.48%   80.48%           
=======================================
  Files        1507     1507           
  Lines      197610   197619    +9     
  Branches     2813     2815    +2     
=======================================
+ Hits       159040   159051   +11     
+ Misses      38049    38046    -3     
- Partials      521      522    +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@MarkRotchell
Copy link

MarkRotchell commented Aug 2, 2024

I came searching for an answer to this question and found your issue/PR, I hope you don't mind me contributing some thoughts.

I'm not necessarily a fan of using indents automatically if autofilter is applied. It would make things less easy to read in situations like the below:

Without Indent

image

With indent

image

Perhaps better to increase row height so that header text is above the autfilter buttons, rather than use indents to push them to the left?

I also think it would be good to add an additional level to your hiearchy to check whether the user has specified an alignment in the column_formats and then dtype_formats before assuming they want all their numerical columns right aligned. (I have worked for bosses that insist everything is center aligned despite my protests). Any alignments specified in those should be inherited by the header (including "center", which is often used sensibly for small categorical columns like flags), before using default alignments.

Ulitimately, I'd love to give users the ability to format each column of the header themselves individually, similar to how they can with column_formats.

@mcrumiller
Copy link
Contributor Author

mcrumiller commented Aug 2, 2024

@MarkRotchell you're right--column alignment and dtype alignment, if specified, should take precedence over the default logic here of right-alignment.

In your screenshot, I agree that in the contrived example the indent doesn't look great. I would bet, though, that in most cases the indentation will look better.

Copy link
Collaborator

@alexander-beedie alexander-beedie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ulitimately, I'd love to give users the ability to format each column of the header themselves individually, similar to how they can with column_formats.

@MarkRotchell: I'm also finding it fairly easy to create tables that don't look quite right, so I think I agree with you here 🤔

@mcrumiller I think the way to go is not to change the default, but to extend the header_format param so that it can do more; specifically, we could support something like {cs.numeric(): {"align": "right"}}. This would also open up additional formatting possibilities. Can take look at the other params that allow similar capabilities. Want to give it a go? ✌️😎 (apologies for the delay on this review!)

@mcrumiller
Copy link
Contributor Author

I may be able to get to this this weekend, sorry been busy with real job...

I'll close this for now since the new proposal would be a significant deviation in spirit from this PR.

It's too bad that Excel chooses to hide text behind the drop-down arrow, otherwise this would be cake to simply apply the column alignment to the header.

@alexander-beedie
Copy link
Collaborator

alexander-beedie commented Aug 7, 2024

I may be able to get to this this weekend, sorry been busy with real job...

No apologies required, lol... your contributions are always welcome irrespective of time/frequency, etc. It's your free time after all; choosing to spend it helping us out will always be appreciated, so if you want to have another run at it, feel free to do so whenever it works for you! 🫡

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or an improvement of an existing feature python Related to Python Polars
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Right-align numerical headers in write_excel
3 participants