-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Table of Contents: Try storing in post-meta #54224
Conversation
This pull request has changed or added PHP files. Please confirm whether these changes need to be synced to WordPress Core, and therefore featured in the next release of WordPress. If so, it is recommended to create a new Trac ticket and submit a pull request to the WordPress Core Github repository soon after this pull request is merged. If you're unsure, you can always ask for help in the #core-editor channel in WordPress Slack. Thank you! ❤️ View changed files❔ phpunit/blocks/render-block-table-of-contents-test.php ❔ lib/blocks.php |
Size Change: +63 B (0%) Total Size: 1.52 MB
ℹ️ View Unchanged
|
a0f7014
to
21f711f
Compare
cc @WordPress/gutenberg-core |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this work 👍 Looks solid at a first glance!
My biggest concern is around the alternative scenarios for editing a post's content, and whether in those scenarios we'll maintain an up-to-date ToC post meta with the most recent headings in the content. WDYT?
|
||
if ( ! empty( $heading['link'] ) ) { | ||
$pagelink = ! empty( $heading['page'] ) ? add_query_arg( 'page', $heading['page'], $permalink ) : $permalink; | ||
$content = '<a class="wp-block-table-of-contents__entry" href="' . esc_url( $pagelink . $heading['link'] ) . '">' . esc_html( $heading['content'] ) . '</a>'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way we're constructing the link $pagelink . $heading['link']
seems a bit fragile. Should we add some validation that $heading['link']
is correct just in case (for example, has no leading/trailing space, starts with #
and contains at least one alphanumeric character afterwards)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's an autogenerated value and has validation on the client side. Usually, blocks don't revalidate passed values, only checking if a value exists.
get_block_wrapper_attributes(), | ||
block_core_table_of_contents_build_list( $tree ) | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are various ways to edit the content of a post through the API, like using the REST API for example (POST to a /wp/v2/posts/<id>
), or a raw wp_update_post()
call from a plugin. There the user can provide content
(or post_content
) with changes to the headings, and from what I understand, these changes won't be reflected in the relevant post meta field. Is that something that's being considered?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same argument could be made for the Footnotes block. It's hard to do anything for similar updates when observers are on the client side.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I definitely don't want to block this PR, however, to be fair, this isn't a good reason IMHO. This only means that we need to consider the same scenarios for the Footnotes block.
Updating a post through:
- REST API
- WP CLI
save_post
hook andwp_update_post()
and family- SQL query
is a valid post-updating scenario and we should not be skipping any of them. Otherwise, we'll the metadata can end up being obsolete, and I'd bet we don't want that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My preferences has been saving it in the post content, however @youknowriad and @mtias pointed out that would make it much harder to render them outside of the post content in a template. That was the reason for moving them to post meta. Honestly, if you want to save a post with footnotes through another editor, it will have to know how to handle, insert, order footnotes anyway. ToC block may be different since it’s purely a derivative of the content.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ToC takes post content as input only? Why does it even need post meta? Is that just for caching? Maybe invalidate the cache on save?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if there are other blocks with headings? Maybe use the tag processor to just get all the headings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if there are other blocks with headings? Maybe use the tag processor to just get all the headings?
I don't have a historical context on this one, but I think it was a safer assumption that Heading blocks will always be a part of the content structure while other headings that are part of blocks aren't. cc @ZebulanStanphill
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that would make it much harder to render them outside of the post content in a template
why is this? if the table of contents is a block it seems like we could quickly grab it with parse_blocks()
and handle it elsewhere. that should be faster than making a DB call.
would the [HTML] API be suitable for this use case?
yes and no! I'd love to explore this together. it depends on what we're wanting to do, specifically. if it's simply to discover every H1
through H6
and link to them in order, that should be easy. if it's looking more in depth to determine what should be a header, who knows. let's figure it out.
the HTML Processor is not ready yet for it because it's currently lacking support for list elements and other common tags, and it bails when it encounters unsupported elements, but the Tag Processor can help.
the Tag Processor could even build a map of all id
values that are linked within the page with a corresponding <a href="#{id}">
if it mattered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if there are other blocks with headings? Maybe use the tag processor to just get all the headings?
I don't have a historical context on this one, but I think it was a safer assumption that Heading blocks will always be a part of the content structure while other headings that are part of blocks aren't. cc @ZebulanStanphill
Actually, here is the reasoning that led to the current implementation lacking support for headings outside of Heading blocks: #29560
I think the idea was that eventually we'd add a dedicated API for handling document heading structure, which would be abstracted from the actual blocks/content used. Combined with the complexities of parsing HTML (my implementation back then relied on libxml), it seemed easier to just support Heading blocks at the time.
Of course, the abstract headings API remains an unimplemented concept, and now we have the HTML Tag Processor to make parsing HTML easier.
With regard to "what should count as a heading", I think it's fairly simple, conceptually: any h2-h6 tag inside the post content is a heading, unless it is inside a nested <article>
or <aside>
. This would align the most with actual HTML semantics. (As a side note, an <h1>
in the content would be ignored since that should only ever be used for the post title, which would be excluded from the table of contents anyway.)
Perhaps a slightly more difficult question would be "how do we determine where the post content is". Presumably, we could just look for the parent <article>
to determine the boundaries of where to look for heading elements, or just look at the entire rendered document if on an old theme that doesn't use HTML5.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Tag Processor should be able to find all H1-H6 elements, but it's not technically capable of knowing if they are inside a nested article or aside. it can be used to approximate that though in the meantime until the HTML Processor is ready for that.
places where it could get off track are not only when we have missing closing tags, but also when we encounter boundaries like the TEMPLATE element, tables, and others, which implicitly close tags that might still be open.
This comment was marked as resolved.
This comment was marked as resolved.
I was suspicious of this move but then reviewing how it works and noting we've done the same with footnotes it's 🎉 Most of my comments would be the same with @tyxla in terms of naming etc.
One would think given the blocks architecture a rest endpoint would generate the structure on the fly. Nevertheless it is interesting, though, that one would not expect that the full blown JS app we're runniing in the browser, benefitting of the whole blocks architecture, cannot handle the structure of the currently edited document, and for a measly undo issue we have to move to the serverside. The question is wether we're not crucially sidestepping something solveable (really just asking didn't explore solutions) by building yet another core block that depends on WordPress to work. Stale permalinks are a pervasive problem across how WordPress operates its block editing. Again, instead of figuring out the way to handle dynamic content we move to make blocks dynamic - which solves the problem but in a very WordPress-y way, not a Gutenberg-y one. Which is fine, but we should really start splitting that block library package in WP Dependent vs WP Agnostic blocks. |
Co-authored-by: Marin Atanasov <8436925+tyxla@users.noreply.github.com>
2aa5d26
to
c79c8c3
Compare
Thanks for the concern, @fabiankaegy! The Those are good questions, @draganescu. Thanks for voicing them! We need a better way to work with derived values in the editor. The |
Flaky tests detected in 825578e. 🔍 Workflow run URL: https://github.com/WordPress/gutenberg/actions/runs/6169745903
|
Thanks for the detailed review, @tyxla!
Are there any other blocks besides this issue? The block is still experimental and won't automatically ship with the following WP release. I could do a follow-up once there's a clear requirement for the ToC/Footnotes block and syncing values when content is programmatically updated. |
Although it's a big one, I can't think of any other blocker, and I'd be happy to have it addressed in a follow-up, as long as this block remains experimental. Before moving forward though, let's take a step back and evaluate all other alternatives. I understand that we can't parse the blocks on the server side (is that confirmed, could you share a bit more details about it?), but there might be other opportunities to grab all the headings from the content when needed. The idea of using the HTML API could be one alternative to explore. I just think we should try out everything before creating a second source of truth that we'll need extra work to maintain and keep in sync. |
@aaronrobertshaw @glendaviesnz Just to note that this is conceptually very similar to what we want to achieve for partially synced patterns in #53705. We both want to propagate/derive updates from other blocks to another block's attributes. Perhaps we can draw some inspirations from here when we go back to work on it. |
Hey @Mamaduka, what do you mean by not automatically ship? Will it be opt-in in 6.4? |
@priethor, blocks marked as experiments aren't included in WP core builds. The ToC will stay experimental until we resolve the main issues. |
Maybe a look in already working TOC plugins will lead to some inspiration: Or generating TOC on the fly when building the webpage: |
What?
Part of #42229.
Resolves #41031.
Resolves #43595.
PR changes the storing mechanism for ToC block headings from attribute to post meta. This opens up the possibility of rendering the ToC block outside of post content (see #41173) and fixes a block's "undo quirk" issue.
The dynamic rendering allows us to skip storing permalinks in DB, which resolves #43595 and should make migrating content with the ToC block easier. You don't need to replace domains in serialized content.
Note
I've intentionally omitted post-meta revision handling from this PR. I hope that WP#4859 will also land in WP 6.4.
Why?
How?
undoIgnore
option flag.Testing Instructions
Screenshots or screencast
Before
CleanShot.2023-09-06.at.21.45.29.mp4
After
CleanShot.2023-09-06.at.21.46.56.mp4