Filecoin Gossipsub Monitoring #1757

Open
yiannisbot opened this issue Jun 13, 2024 · 0 comments

Open Grant Proposal: Filecoin Gossipsub Monitoring

Project Name: Filecoin Gossipsub Monitoring

Proposal Category: Developer and data tooling

Individual or Entity Name: ProbeLab (https://probelab.io/) / Interplanetary Shipyard (https://ipshipyard.com/)

Proposer: @yiannisbot

Project Repo(s) Please list Github repos used for this project work.

(Optional) Filecoin ecosystem affiliations:

The ProbeLab team was part of Protocol Labs until January 2024.

(Optional) Technical Sponsor:

@smagdali

Do you agree to open source all work you do on behalf of this RFP under the MIT/Apache-2 dual-license?: Please respond with "Yes" or "No".

Yes.

Project Summary

This proposal puts forward a plan to carry out a measurement study on the operation of Gossipsub in the Filecoin network. In particular, we’re proposing to build and deploy infrastructure to collect data from the protocol’s operation in the network and visualise it, in order to arrive at actionable results. The target is to get insights on a set of important metrics that can confirm the correct operation of the protocol and offer a direct, actionable plan forward to optimise protocol performance.

The project's focus is on developing the capability to measure the performance of the Gossipsub protocol in the Filecoin network by building the right tools and infrastructure for data collection, storage, analysis and visualisation.

Impact

libp2p’s Gossipsub protocol plays a central role in the Filecoin blockchain: it is used for message and block propagation. The correct operation of the protocol is therefore of critical importance to the network. Although there haven’t been any reports of malfunction or problematic behaviour, there is also no monitoring in place to confirm correct operation. Given the ~$3B value that the Filecoin network carries, it’s imperative to keep this important piece of its networking layer secure and resilient, and to continuously optimise it for performance. As a recent example, it is not clear whether the Peer Exchange mechanism of Gossipsub is working as expected [link], or whether the protocol relies only on the DHT to find new peers when needed. The important point is that there has been no monitoring to verify correct operation and no alerting to flag failures, so it is still not clear which is the case.

Outcomes

See list of 10 metrics in Milestone 3. Individual deliverables are listed under each milestone as well as here below.

  • Deliverable 1: Public repository that includes the monitoring infrastructure to be deployed in the Filecoin network.
  • Deliverable 2: Public repository that includes the Gossipsub listener and tracer.
  • Deliverable 3: Plots for each metric.
  • Deliverable 4: One or more reports detailing the methodology and findings for each metric that includes plots (from Deliverable 3) and the corresponding description.

With this set of metrics and deliverables we will be able to answer questions, such as:

  • How long does it take for messages to propagate throughout the network?
  • What is the bandwidth requirement for control messages in the network?
  • What is the overhead of gossip messages in the network and what’s the effectiveness of the gossip mechanism as a whole?
  • Are misbehaving nodes excluded from the Gossipsub mesh as per the protocol’s score function?

The metrics that we are expecting to get out of this study will be included in one or several reports and optionally will also be continuously published at https://probelab.io/, depending on the infrastructure cost and maintenance commitment from ProbeLab and the FF. This is subject to a separate agreement that will come at the end of this project.

Adoption, Reach, and Growth Strategies

The target audience of this developer tooling project is the core maintainers of the Filecoin protocol and the developers of the Filecoin ecosystem. Storage Providers are also going to benefit from having deeper insights into the operation and correct functioning of Filecoin's Gossipsub network.

Development Roadmap

Milestone 1: Design, development and implementation of monitoring architecture and data collection infrastructure

Delivery estimate (not effort estimate): 0.5 month

Description: This milestone focuses on the design of the monitoring infrastructure that is needed in order to plug into the right places in the network to collect data related to the operation of Gossipsub. Together with the monitoring infrastructure we will set up the database to collect the incoming data from the network.

This Milestone focuses on the infrastructure side of things and can move in parallel to Milestone 2, which focuses on the protocol side of things.

[Figure: filecoin-infra]

Deliverables: Public repository that includes the monitoring infrastructure to be deployed in the Filecoin network.

Milestone 2: Design and implementation of a GossipSub listener and tracer

Delivery estimate (not effort estimate): 1 month

Description: The target of this milestone is to integrate Filecoin and Gossipsub specifics into the specialised data collection infrastructure. For this reason, we will design and implement a GossipSub listener and tracer. The tool will subscribe to all relevant pubsub topics and will trace all protocol interactions, in order to be able to later filter and query the right items for the corresponding metrics.

As part of this milestone we will create a very lightweight Filecoin node that will be responsible for maintaining connections to multiple peers in the Filecoin network and collecting data from them. We will also need to set up and run a second Filecoin node that keeps up with the latest chain state and runs tracecatcher [link], a tool previously developed by this team, in order to validate the correct operation of the “listener and tracer” tool developed as part of this Milestone.

The architecture proposed here won’t rely on Filecoin SPs to get the data, which is a big advantage compared to past approaches considered by this and other teams.

This Milestone focuses on the protocol side of things and can move in parallel to Milestone 1, which focuses on the infrastructure side of things.

[Figure: filecoin-infra-2]

Deliverable: Public repository that includes the Gossipsub listener and tracer.

Milestone 3: Data Analysis & Visualisation

Delivery estimate (not effort estimate): 1.5-2 months

Description: Once the architecture is correctly set up and can collect data from the network, this Milestone will focus on querying and filtering the data in order to get results for the right metrics. An initial list of metrics we will be focusing on is the following. These are subject to change as we get deeper into the details.

Metric 1: Session time duration and variability per implementation

  • Time difference between CONNECTED and REMOVE PEER.
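As a rough illustration of how this metric could be derived from collected trace events, the sketch below pairs each connection event with the following removal event for the same peer and groups the resulting durations by implementation. The record layout `(timestamp, event_type, peer_id, agent)`, the event labels `CONNECTED` and `REMOVE_PEER`, and the sample data are assumptions made for the example, not the schema of the tooling proposed here.

```python
from collections import defaultdict
from statistics import mean, pstdev

def session_durations(events):
    """Pair each CONNECTED with the next REMOVE_PEER for the same peer.

    `events` holds (timestamp_seconds, event_type, peer_id, agent) tuples.
    This record layout is hypothetical. Returns {agent: [duration, ...]}.
    """
    open_sessions = {}              # peer_id -> (start timestamp, agent)
    durations = defaultdict(list)
    for ts, kind, peer, agent in sorted(events):
        if kind == "CONNECTED":
            # Keep the earliest start if a duplicate CONNECTED arrives.
            open_sessions.setdefault(peer, (ts, agent))
        elif kind == "REMOVE_PEER" and peer in open_sessions:
            start, start_agent = open_sessions.pop(peer)
            durations[start_agent].append(ts - start)
    return dict(durations)

def summarise(durations):
    """Mean and population standard deviation of session length per agent."""
    return {agent: (mean(d), pstdev(d)) for agent, d in durations.items()}

# Made-up sample data: two peers, one of which reconnects.
events = [
    (0.0, "CONNECTED", "peerA", "lotus"),
    (5.0, "CONNECTED", "peerB", "venus"),
    (60.0, "REMOVE_PEER", "peerA", "lotus"),
    (65.0, "REMOVE_PEER", "peerB", "venus"),
    (70.0, "CONNECTED", "peerA", "lotus"),
    (100.0, "REMOVE_PEER", "peerA", "lotus"),
]
durations = session_durations(events)
summary = summarise(durations)
```

Sessions that are still open when the trace ends are ignored in this sketch; a real analysis would need to decide whether to drop or censor them.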

Metric 2: Network Dynamicity

  • GRAFT and PRUNE frequency, as well as distribution and standard deviation.
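One possible way to express this metric is to count GRAFT and PRUNE events in fixed time windows and then summarise the per-window counts. The sketch below assumes the events are available as plain lists of timestamps in seconds; the window size and the sample values are illustrative only.

```python
from collections import Counter
from statistics import mean, pstdev

def events_per_window(timestamps, window_seconds, start, end):
    """Count events in consecutive fixed-size windows starting at `start`.

    Only whole windows that fit before `end` are reported, and empty
    windows are included so that quiet periods lower the mean.
    """
    n_windows = int((end - start) // window_seconds)
    covered_end = start + n_windows * window_seconds
    counts = Counter(
        int((ts - start) // window_seconds)
        for ts in timestamps
        if start <= ts < covered_end
    )
    return [counts.get(i, 0) for i in range(n_windows)]

# Made-up timestamps (seconds) of GRAFT and PRUNE events seen by one node.
graft_times = [1, 2, 61, 130, 131, 132]
prune_times = [10, 70, 75]

graft_counts = events_per_window(graft_times, 60, 0, 180)
prune_counts = events_per_window(prune_times, 60, 0, 180)
graft_mean, graft_std = mean(graft_counts), pstdev(graft_counts)
```

The list of per-window counts is also the input for a distribution plot (for example a histogram or CDF).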

Metric 3: Gossip Effectiveness

  • Count IWANT messages, as these show how many messages have propagated through gossip.

Metric 4: Number of messages per topic

  • Including number of Rejected messages and number of Duplicated messages.
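A minimal sketch of the per-topic tally follows. It assumes each trace record has already been reduced to an `(event_type, topic)` pair; the event labels are modelled on pubsub trace event names, while the topic labels `blocks` and `msgs` are placeholders rather than the actual Filecoin topic strings.

```python
from collections import Counter, defaultdict

# Event types counted for this metric; other events are ignored.
TRACKED = ("DELIVER_MESSAGE", "REJECT_MESSAGE", "DUPLICATE_MESSAGE")

def messages_per_topic(events):
    """Return {topic: {event_type: count}} for the tracked event types."""
    tally = defaultdict(Counter)
    for kind, topic in events:
        if kind in TRACKED:
            tally[topic][kind] += 1
    # Report every tracked type, using 0 where nothing was observed.
    return {topic: {k: c[k] for k in TRACKED} for topic, c in tally.items()}

# Made-up sample records; "blocks" and "msgs" are placeholder topic labels.
events = [
    ("DELIVER_MESSAGE", "blocks"),
    ("DUPLICATE_MESSAGE", "blocks"),
    ("DUPLICATE_MESSAGE", "blocks"),
    ("DELIVER_MESSAGE", "msgs"),
    ("REJECT_MESSAGE", "msgs"),
    ("GRAFT", "msgs"),
]
per_topic = messages_per_topic(events)
```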

Metric 5: Node bandwidth requirement

  • Aggregate bandwidth required over a day per node, or per topic
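The aggregation described here could look roughly like the sketch below, which sums message sizes per UTC day, grouped either by node or by topic. The record layout `(timestamp, node_id, topic, size_bytes)` and all sample values are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone

def daily_bytes(records, key):
    """Sum bytes per (UTC day, group), where group is the node or the topic.

    `records` holds (unix_timestamp, node_id, topic, size_bytes) tuples.
    This layout is hypothetical. `key` must be "node" or "topic".
    """
    if key not in ("node", "topic"):
        raise ValueError("key must be 'node' or 'topic'")
    totals = defaultdict(int)
    for ts, node, topic, size in records:
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
        group = node if key == "node" else topic
        totals[(day, group)] += size
    return dict(totals)

# Made-up sample records spanning two UTC days.
records = [
    (0, "n1", "blocks", 1000),
    (3600, "n1", "msgs", 500),
    (7200, "n2", "blocks", 200),
    (86400, "n1", "blocks", 300),
]
per_node = daily_bytes(records, "node")
per_topic = daily_bytes(records, "topic")
```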

Metric 6: Message Propagation Latency

  • Throughout the whole network
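One way to approximate propagation latency from observed arrivals is sketched below. It assumes that several vantage points record when each message ID arrives, and it measures every later arrival relative to the earliest observed arrival of the same message. This is a simplification: the earliest observation is not necessarily the publish time, and the record layout `(msg_id, observer, timestamp_ms)` is hypothetical.

```python
import math
from collections import defaultdict

def propagation_delays(arrivals):
    """Delays of each arrival relative to the first sighting of the message.

    `arrivals` holds (msg_id, observer, timestamp_ms) tuples.
    Returns a sorted list of delays in milliseconds.
    """
    by_msg = defaultdict(list)
    for msg_id, _observer, ts in arrivals:
        by_msg[msg_id].append(ts)
    delays = []
    for stamps in by_msg.values():
        stamps.sort()
        # Skip the first sighting itself; it is the reference point.
        delays.extend(ts - stamps[0] for ts in stamps[1:])
    return sorted(delays)

def percentile(sorted_values, p):
    """Nearest-rank percentile of a sorted, non-empty list, 0 < p <= 100."""
    if not sorted_values:
        raise ValueError("no values")
    rank = math.ceil(p / 100 * len(sorted_values))
    return sorted_values[rank - 1]

# Made-up sample arrivals for two messages seen by several observers.
arrivals = [
    ("m1", "obs1", 0), ("m1", "obs2", 120), ("m1", "obs3", 300),
    ("m2", "obs1", 1000), ("m2", "obs2", 1050),
    ("m2", "obs3", 1400), ("m2", "obs4", 1900),
]
delays = propagation_delays(arrivals)
p50 = percentile(delays, 50)
p95 = percentile(delays, 95)
```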

Metric 7: Gossip Geographic Diversity [Low Priority - Nice to have]

  • Verify whether or not there is geographic correlation among nodes that receive messages through gossip.

Metric 8: Responsiveness to malicious peers [Low Priority - Nice to have]

  • Time to PRUNE when disseminating INVALID messages.

Metric 9: Score function variability [Low Priority - Nice to have]

  • Gather and plot the variability of score function parameters.

Metric 10: Peer Exchange behaviour [Low Priority - Nice to have]

  • Validate that Peer Exchange works as expected and verify that there is variability in the number of peers exchanged.

Deliverable: Plots for each metric.

Milestone 4: Recommendations & Final Report

Delivery estimate (not necessarily effort estimate): 0.5 month

Description: The data analysis of Milestone 3 will result in a report with insights and recommended next steps for the Filecoin developer team or for libp2p engineers. To the extent possible, the ProbeLab team can assist either by developing optimisations or by consulting for other engineering teams.

Deliverable: One or more reports detailing the methodology and findings for each metric that includes plots (from Milestone 3) and the corresponding description.

Total Budget Requested

| Milestone # | Description | Deliverables | Delivery Estimate | Funding |
| --- | --- | --- | --- | --- |
| Milestone 1 | Monitoring and data collection infrastructure | Repository | 2-3 weeks | $22.5k |
| Milestone 2 | Gossipsub listener and tracer | Repository | 1 month | $45k |
| Milestone 3 | Metrics collection | Plots | 2 months | $90k |
| Milestone 4 | Recommendations & Final Report | Report | 0.5 months | $22.5k |

Total budget requested: $180k

Maintenance and Upgrade Plans

It is suggested that the study is repeated at regular time intervals (at least once a quarter) to make sure that the protocol is operating correctly over the long term. Repeating the study will not incur the entire cost or require the full duration of the present project. It is expected that the metrics and final report can be reproduced within 1 month at a cost of $30k.

Running the infrastructure continuously is a separate option, but we do not have an estimate of the cost at this point. This option will be considered at the end of the present project.

Team

Team Members

  • Team Member 1: Yiannis Psaras, @yiannisbot (Team Lead)
  • Team Member 2: Guillaume Michel, @guillaumemichel (Software Engineer)
  • Team Member 3: Mikel Cortes, @cortze (Software Engineer)
  • Team Member 4: Steph Samson, @kasteph (Infrastructure Engineer)

Team Member LinkedIn Profiles

Yiannis Psaras LinkedIn profile
Guillaume Michel LinkedIn profile
Mikel Cortes LinkedIn profile

Team Website

https://probelab.io/

Relevant Experience

The ProbeLab team was part of Protocol Labs for multiple years (until January 2024) and has focused on monitoring and measurement studies for IPFS and libp2p-based networks throughout. The team has extensive experience in building tooling for monitoring and measurement, as well as the relevant infrastructure. Apart from the metrics and tools that the team maintains, which can be found at https://probelab.io/, the team has carried out detailed studies for both IPFS and libp2p. These studies can be found at: https://github.com/probe-lab/network-measurements/tree/master/results.

The team is currently running a project to monitor the operation of Gossipsub in the Ethereum network. Here are two sample reports that have resulted as part of that project:

Last but not least, ProbeLab's Team Lead was part of the team that redesigned Gossipsub to include security measures (namely the score function, flood publishing etc.) and is an author of the relevant reports:

Team code repositories

Additional Information

Contact email: yiannis@probelab.io
