Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gossip: more reliable publishing #387

Merged
merged 9 commits into from
Oct 22, 2024
Merged

Gossip: more reliable publishing #387

merged 9 commits into from
Oct 22, 2024

Conversation

Nashatyrev
Copy link
Collaborator

@Nashatyrev Nashatyrev commented Oct 18, 2024

The quote from the Gossip spec:

  • If the router is subscribed to the topic, it will send the message to all peers in mesh[topic].
  • If the router is not subscribed to the topic ...

According to spec we should forward the message we are publishing to the mesh peers only.

However the mesh could be not that stable (for example a peer may be evicted from mesh due to slight gossip downscorring (any score <0)). Thus it could lead to the situation, when the mesh is empty but there are still peers subscribed to the topic outside of the mesh. According to the spec we are not able to publish.

The situation could not be that dramatic due to IHAVE/IWANT mechanism which should finally make the message published, but that could induce significant publish latency.

To overcome this it does make sense to take extra peers outside of the mesh when the mesh size is < D.

This is also the subject for a PR to the Gossip spec

@tbenr
Copy link
Collaborator

tbenr commented Oct 19, 2024

Can't find the definition of "direct peer" in the spec, but per my mental model two direct peers mutually configured as "direct peer" always exchange messages among each other.
If we always send messages direct peers when dealing with broadcastInbound, I don't see why we shouldn't do the same when publishing locally produced messages via broadcastOutbound.

Then I also wonder why direct peers are blindly considered for outbound even if they don't subscribe to the specific topic (but this is another question :-), and maybe it's fine)

Copy link
Collaborator

@tbenr tbenr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@StefanBratanov StefanBratanov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM to me as well

if (topicMeshPeers != null) {
// we are subscribed to the topic
val addFromNonMeshCount = max(0, params.D - topicMeshPeers.size)
val nonMeshTopicPeers = (getTopicPeers(topic) - topicMeshPeers)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we care about the score of the topic peers in this case?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we care about the score of the topic peers in this case?

I think the scoring is already "captured" by the mesh. Outside that, seems reasonable to me to pick random peers.

On Friday I had a similar comment suggesting "why don't we select by sorting them by score" and Anton had a good reply to it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good question. Here is the quotation from the spec:

PublishThreshold: when a peer's score drops below this threshold, self-published messages are not propagated towards this peer when (flood) publishing. This threshold should be negative, and less than or equal to the gossip threshold.

I believe this mean that both 'flood' and 'regular' publishing should consider this threshold.
It's probably counter eclipse attack measure.
So I believe we still need to truncate those with the score < PublishThreshold but still consider them when there are few peers to publish

Does it make sense?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that seems like a convincing interpretation to me.

@Nashatyrev
Copy link
Collaborator Author

Can't find the definition of "direct peer" in the spec, but per my mental model two direct peers mutually configured as "direct peer" always exchange messages among each other. If we always send messages direct peers when dealing with broadcastInbound, I don't see why we shouldn't do the same when publishing locally produced messages via broadcastOutbound.

Then I also wonder why direct peers are blindly considered for outbound even if they don't subscribe to the specific topic (but this is another question :-), and maybe it's fine)

@tbenr you are right. There is a mess with direct peers. Just created a separate issue for this: #389

@Nashatyrev
Copy link
Collaborator Author

Addressed the peer score question when publishing.
@StefanBratanov @tbenr could you please take another look please?

Copy link
Collaborator

@tbenr tbenr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM


test.mockRouters.forEach {
it.subscribe(topic)
// it.sendToSingle(createGraftMessage(topic))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can delete this :-)

@Nashatyrev Nashatyrev merged commit 1cde874 into develop Oct 22, 2024
4 checks passed
@Nashatyrev Nashatyrev deleted the publish-to-d-peers branch October 24, 2024 12:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants