Feat/feed #27

Merged
merged 9 commits into from
Aug 26, 2024


Nuhvi
Collaborator

@Nuhvi Nuhvi commented Aug 20, 2024

For early review and testing with Pubky Nexus, this PR adds a feed of events (PUT and DELETE) for added or deleted files.

The API works as follows:

  1. Call GET <homeserver base URL>/events/ (note the trailing /)
  2. You should get a text/plain response, made up of lines separated by \n characters.
  3. The last line contains the cursor to use for the next request. You should be able to parse it by taking the last 13 characters of the body.
  4. Every other line starts with PUT or DEL, followed by a space character, then the pubky URL of the resource.
  5. You can use the limit and cursor query params; reverse is not supported.
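
As a rough illustration of the response format described above, here is a minimal parsing sketch in Python. The names `Event` and `parse_feed` are hypothetical, and the URLs and cursor value in the usage note are made-up placeholders, not real data. It assumes the body has no trailing newline, so the last line is exactly the 13-character cursor.

```python
from typing import List, NamedTuple, Optional, Tuple


class Event(NamedTuple):
    kind: str  # "PUT" or "DEL"
    url: str   # pubky URL of the resource


def parse_feed(body: str) -> Tuple[List[Event], Optional[str]]:
    """Split a text/plain feed body into (events, cursor).

    Assumption: the last line is the cursor and every other line is
    '<PUT|DEL> <pubky URL>'. An empty body yields no events and no cursor.
    """
    if not body:
        return [], None

    # Everything before the last "\n" is event lines; the rest is the cursor.
    *event_lines, cursor = body.split("\n")

    events = []
    for line in event_lines:
        if not line:
            continue  # tolerate stray blank lines
        kind, _, url = line.partition(" ")
        if kind not in ("PUT", "DEL") or not url:
            raise ValueError(f"unexpected feed line: {line!r}")
        events.append(Event(kind, url))
    return events, cursor
```

For example, `parse_feed("PUT pubky://example-user/pub/a.txt\n0000000000001")` would return one `PUT` event and the placeholder cursor `"0000000000001"`.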

The assumption is that you will keep making requests until you get a response with a content-length header of zero. Then you back off and make the next request after some interval. If you don't, we will eventually add rate limiting anyway.
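
The polling pattern described here could look roughly like the following sketch. The function names, the default `limit` of 100, the 5-second interval, and the `max_idle` stop condition are all assumptions for illustration; only the `/events/` path and the `limit` and `cursor` query params come from the description. As a defensive assumption, the loop also backs off when a response carries a cursor but no events.

```python
import time
from typing import Callable, Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_events(base_url: str, cursor: Optional[str], limit: int = 100) -> str:
    """GET <base_url>/events/ with limit and cursor, returning the body text."""
    params = {"limit": limit}
    if cursor:
        params["cursor"] = cursor
    url = f"{base_url.rstrip('/')}/events/?{urlencode(params)}"
    with urlopen(url) as resp:
        return resp.read().decode("utf-8")


def sync(
    fetch: Callable[[Optional[str]], str],
    on_line: Callable[[str], None],
    cursor: Optional[str] = None,
    interval: float = 5.0,
    max_idle: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Drain the feed, then back off for `interval` seconds between polls.

    Runs until `max_idle` consecutive polls return nothing new
    (forever when max_idle is None). Returns the last cursor seen.
    """
    idle = 0
    while max_idle is None or idle < max_idle:
        body = fetch(cursor)
        lines = body.split("\n") if body else []
        if lines:
            cursor = lines[-1]  # last line is the cursor for the next request
        events = [line for line in lines[:-1] if line]
        if not events:
            # Empty body (content-length zero) or nothing new: back off.
            idle += 1
            sleep(interval)
            continue
        idle = 0
        for line in events:
            on_line(line)
    return cursor
```

A caller might run `sync(lambda c: fetch_events(base_url, c), print)` with `base_url` set to the homeserver base URL, and persist the returned cursor so a restart resumes where it left off.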

Things left to discuss:

  1. Should we return the cursor in an HTTP header, pubky-cursor, instead of the last line? I am trying to avoid JSON because I dislike it as a dependency, especially when it is not needed.
  2. If your backoff/interval is too long, for example minutes, then we might need to add a subscription API so that the server notifies you once it has something new before that interval passes, but there is no urgent need for that, I think.
  3. What else?

This is a stopgap, or at least an incomplete solution: it works for realtime sync, but it fails in two situations:

  1. If you start a new Indexer from scratch, you will have to list A LOT of events to sync with a homeserver. Most of these events are redundant (PUT events that are later negated by a DEL event), and even worse, homeservers will be expected to delete events older than X days, which makes syncing incomplete.
  2. If a user migrates from one homeserver to another, you don't want the indexer to treat that as a bunch of new events if it has already seen all that information.

For both these cases, we simply have to use set-reconciliation, and for that, we need a Merkle tree of sorts.
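
To make the redundancy in the first case concrete, here is a small sketch of how an indexer might collapse a long event list to the latest event per URL. This is only an illustration of the problem, not something this PR implements, and it is not the set-reconciliation approach: it does not help once old events have been deleted from the homeserver. The function name `compact` and the tuple representation are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def compact(events: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep only the most recent (kind, url) event for each URL.

    Earlier PUT/DEL events for the same URL are superseded by later ones,
    so replaying the compacted list yields the same final state.
    """
    latest: Dict[str, str] = {}
    for kind, url in events:
        # Remove and re-insert so dict order follows the latest occurrence.
        latest.pop(url, None)
        latest[url] = kind
    return [(kind, url) for url, kind in latest.items()]
```

For instance, a PUT followed later by a DEL of the same URL collapses to the single DEL event, which is all a fresh indexer needs to know about that resource.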

@Nuhvi Nuhvi requested a review from SHAcollision August 20, 2024 19:05
@Nuhvi Nuhvi mentioned this pull request Aug 26, 2024
Contributor

@SHAcollision SHAcollision left a comment

Can be merged as it currently is! It is fully functional.

I will create issues as needs get discovered :)

@Nuhvi Nuhvi marked this pull request as ready for review August 26, 2024 15:45
@Nuhvi Nuhvi merged commit 9131cef into main Aug 26, 2024
1 check passed
@Nuhvi Nuhvi deleted the feat/feed branch November 15, 2024 12:36
2 participants