[Yjs Collab] Reliable sync with the backend #68483

Open: dmonad wants to merge 20 commits into base: trunk

@dmonad dmonad commented Jan 3, 2025

What?

This PR implements a reliable sync protocol for collaborative editing.

Why?

Previously, Gutenberg implemented a basic Yjs editor binding. Changes were synced using y-webrtc when collaborative editing was enabled. However, that implementation did not reliably sync the backend state with the Yjs document. The collaborative document was not preserved across sessions, which could lead to content duplication and even loss of content.

Furthermore, even in non-collaborative sessions, it is quite easy to overwrite content from other users. Changes that are created by different users at the same time are not reconciled.

Related discussion on the considerations to improve the current sync approach: #65012

How?

This PR implements a reliable sync protocol for syncing backend state with a Yjs document.

  • We save the Yjs document in an HTML comment <!-- y:gutenberg .. -->, which is embedded in the HTML document that WordPress stores.
  • When the client notices that the Yjs document stored in the comment does not reflect the state of the HTML document, we assume that the changes were generated by the backend. We reconcile the changes and incorporate them into the Yjs document. Then we save the rebased Yjs document to the <!-- y:gutenberg --> comment again and send it to the backend.
  • The changes from the backend are applied as a simulated "system user". This ensures that every client applies the changes in the same way, which avoids content duplication.
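
The round trip described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the attribute names follow the regex quoted in the review thread, while `YInfo`, `Renderer`, `withYComment`, `stripYComment`, and `backendChanged` are hypothetical names, and placing the comment at the end of the content is an assumption.

```typescript
// Hypothetical shape of the metadata kept in the comment. The attribute
// names follow the regex quoted in this PR's review thread.
interface YInfo {
  version: string;
  state: string; // encoded Yjs document state
  newContentClientId: string;
}

// Assumption: a function that decodes `state` and renders the HTML that the
// Yjs document represents. A real client would use Yjs for this step.
type Renderer = (state: string) => string;

const Y_COMMENT = /\n?<!-- y:gutenberg [^>]*-->/g;

function serializeYComment(info: YInfo): string {
  return `<!-- y:gutenberg version="${info.version}" state="${info.state}" new-content-clientid="${info.newContentClientId}" -->`;
}

// Remove any embedded Yjs comment so only the visible post HTML remains.
function stripYComment(postContent: string): string {
  return postContent.replace(Y_COMMENT, "");
}

// Replace (or add) the embedded comment. Appending it at the end is an
// assumption made for this sketch.
function withYComment(postContent: string, info: YInfo): string {
  return `${stripYComment(postContent)}\n${serializeYComment(info)}`;
}

// True when the stored Yjs state no longer reflects the post HTML, which the
// protocol interprets as "the backend changed the content".
function backendChanged(postContent: string, info: YInfo, render: Renderer): boolean {
  return stripYComment(postContent) !== render(info.state);
}
```

When `backendChanged` returns true, a client would reconcile the HTML into the Yjs document as the "system user" and then call something like `withYComment` with the rebased state before saving.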

Furthermore, this PR improves on the general user-experience of collaborative editing.

  • When collaborative editing is enabled, the autosave interval is set to 5 seconds.
  • The banners mentioning that there is a newer version of a document have been disabled in collaborative editing.
  • Unnecessary attributes, such as selection state, are no longer synced.
  • Improved granularity of changes for better conflict resolution.
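
To illustrate why finer-grained changes help conflict resolution, here is a small sketch of one common technique: turning a whole-string replacement into a single insert/delete by trimming the common prefix and suffix. Whether this PR does exactly this is an assumption; `TextDelta`, `simpleDiff`, and `applyDelta` are hypothetical names.

```typescript
// A single edit: delete `remove` characters at `index`, then insert `insert`.
interface TextDelta {
  index: number;
  remove: number;
  insert: string;
}

// Compute a minimal single-span edit by trimming the common prefix and suffix.
function simpleDiff(a: string, b: string): TextDelta {
  let left = 0;
  while (left < a.length && left < b.length && a[left] === b[left]) {
    left++;
  }
  let right = 0;
  while (
    right + left < a.length &&
    right + left < b.length &&
    a[a.length - right - 1] === b[b.length - right - 1]
  ) {
    right++;
  }
  return {
    index: left,
    remove: a.length - left - right,
    insert: b.slice(left, b.length - right),
  };
}

// Apply a delta to a plain string (a shared text type would receive the same
// insert/delete operations instead).
function applyDelta(text: string, d: TextDelta): string {
  return text.slice(0, d.index) + d.insert + text.slice(d.index + d.remove);
}
```

Because only the changed span is touched, two users editing different parts of the same paragraph produce operations that do not overwrite each other.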

I documented the sync protocol in greater detail in packages/sync/CODE.md.

Testing Instructions

  • Follow the instructions to enable collaborative editing.
  • It should be possible to edit the same post in different browser windows. Changes made in one browser window will eventually show up in the other browser windows.
  • Conflicts are resolved automatically.
  • It should be possible to modify the HTML content directly. The collaborative clients will reconcile the changes and sync up.

Things to note

  • Syncing through y-webrtc has been disabled by default. After review, I'm concerned about security risks, because the access control (using a shared password) could be circumvented. We may want to improve the approach in a separate PR.

TODOs

  • add a separate experiments flag to enable y-webrtc with a password.

I hope that the sync approach implemented in this PR will be a good foundation for future work on making different parts of Gutenberg more collaborative. If we want to make Gutenberg / WordPress even more collaborative in the future, we need to share knowledge and make collaborative editing a priority for all future work. It is quite hard to add collaborative editing on top of an existing codebase. Ideally, future work is built with collaborative editing in mind.

In this regard, I want to host a review & demo session on Monday, January 6, at 7pm UTC. This would be a good place to ask questions. We can also try to break this implementation and measure the overhead of decreasing the autosave interval.

github-actions bot commented Jan 3, 2025

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Unlinked Accounts

The following contributors have not linked their GitHub and WordPress.org accounts: @dmonad.

Contributors, please read how to link your accounts to ensure your work is properly credited in WordPress releases.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Unlinked contributors: dmonad.

Co-authored-by: aaronjorbin <jorbin@git.wordpress.org>
Co-authored-by: szepeviktor <szepeviktor@git.wordpress.org>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

github-actions bot commented Jan 3, 2025

👋 Thanks for your first Pull Request and for helping build the future of Gutenberg and WordPress, @dmonad! In case you missed it, we'd love to have you join us in our Slack community.

If you want to learn more about WordPress development in general, check out the Core Handbook full of helpful information.

@github-actions github-actions bot added the First-time Contributor Pull request opened by a first-time contributor to Gutenberg repository label Jan 3, 2025
@annezazu annezazu added Needs Technical Feedback Needs testing from a developer perspective. [Feature] Real-time Collaboration Phase 3 of the Gutenberg roadmap around real-time collaboration [Type] Technical Prototype Offers a technical exploration into an idea as an example of what's possible labels Jan 3, 2025
@szepeviktor

@aaronjorbin aaronjorbin left a comment

This is a great start, thank you! A couple of questions and small suggestions on my first read-through.

I would like to see some automated tests before this gets too far, especially around places where others will extend this.

lib/experimental/synchronization.php (outdated, resolved)
lib/experimental/synchronization.php (outdated, resolved)
Comment on lines 148 to 150
// @todo the following regex doesn't catch the ygutenberg comment
// preg_match('/<!-- y:gutenberg version="([a-zA-Z0-9]*)" state="([a-zA-Z0-9+\/]*={0,3})" new-content-clientid="([0-9]*)" -->/', $postcontent, $yinfo);
preg_match('/<!-- y:gutenberg version=\"(.*)\" state=\"(.*)\" new-content-clientid=\"(.*)\" -->/', $postcontent, $yinfo);
Member

It looks like this is used both here and in filter_post_content_ydoc. I wonder if this should be abstracted into its own get_yinfo function that can then have automated tests to ensure it correctly handles any necessary edge cases.

Author

I fixed the issue with the proper regex and abstracted it into a gutenberg_get_yinfo function.

Where would I locate tests for the sync package? I'm pretty sure that this would be easy to implement.
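
For illustration, a helper of this kind and a few edge-case checks could look like the sketch below. It is written in TypeScript rather than the PHP used in lib/experimental/synchronization.php, and `parseYInfo` and `YInfo` are hypothetical names, not the PR's `gutenberg_get_yinfo`. One edge case worth testing: the greedy `(.*)` groups in the quoted regex can span more than one attribute or comment when two comments share a line, whereas `[^"]*` keeps each capture inside its own quotes.

```typescript
interface YInfo {
  version: string;
  state: string;
  newContentClientId: string;
}

// [^"]* instead of .* so each capture stops at its closing quote.
const Y_INFO =
  /<!-- y:gutenberg version="([^"]*)" state="([^"]*)" new-content-clientid="([^"]*)" -->/;

// Return the first embedded y:gutenberg metadata, or null if none is present.
function parseYInfo(postContent: string): YInfo | null {
  const match = Y_INFO.exec(postContent);
  if (match === null) {
    return null;
  }
  return {
    version: match[1],
    state: match[2],
    newContentClientId: match[3],
  };
}
```

Checks could then cover content without a comment, a well-formed comment, a malformed comment with a missing attribute, and two comments on the same line.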

dmonad and others added 16 commits January 9, 2025 00:32
Previously, Gutenberg implemented a basic Yjs editor binding. Changes
were synced using y-webrtc when collaborative editing is enabled.
However, the previous implementation did not reliably sync the
backend-state with the Yjs document. Hence, the previous implementation
could result in content duplication, or loss of changes that were
created in the collaborative session.

This commit implements the initial approach to sync the backend document
(HTML encoded) with the Yjs document without the server understanding
Yjs.

The following things were implemented:

- We save the Yjs document in an HTML comment, that is stored in the
  HTML document stored in WordPress
- When the client notices that the Yjs document stored in the comment
  does not reflect the state of the HTML document, we assume that the
  changes were generated by the backend. Hence we update the document
  that is stored in the HTML comment, and merge the Yjs update to the
  Yjs document that is used in the collaborative session. Then we save
  the merged document once again to the backend.

The changes from a backend are applied as a simulated "system user".
This should guarantee that everyone applies the changes in the same way,
so we avoid content duplication. The
approach in this commit is still insufficient. We need to store
additional information in the backend to ensure that certain edge-cases
are handled.
Co-authored-by: Aaron Jorbin <aaronjorbin@users.noreply.github.com>
Co-authored-by: Aaron Jorbin <aaronjorbin@users.noreply.github.com>
As @aaronjorbin rightfully pointed out, the new filter approach doesn't
need to run at the end anymore.
dmonad commented Jan 9, 2025

I fixed all tests except the E2E tests. Playwright - 4 seems to be failing because of a timeout issue. I can't figure out how to fix it and would greatly appreciate help with this.

I also wanted to implement an integration test for a collaborative session. We would need to enable the experimental feature in the e2e test and then perform some actions on two different peers. But I would also need help to implement this.

Labels
  • [Feature] Real-time Collaboration: Phase 3 of the Gutenberg roadmap around real-time collaboration
  • First-time Contributor: Pull request opened by a first-time contributor to Gutenberg repository
  • Needs Technical Feedback: Needs testing from a developer perspective.
  • [Type] Technical Prototype: Offers a technical exploration into an idea as an example of what's possible
4 participants