
bug (hubble): Hub out of sync #2332

Closed
flyingrabbit-lab opened this issue Sep 25, 2024 · 10 comments
Labels
s-noop Cannot be worked on

Comments

@flyingrabbit-lab

What is the bug?
My hub has missing messages although Grafana shows 100% synced.
I have tried everything on the troubleshooting page.
Random messages (casts, reactions) are missing.

This is my Grafana dashboard:
[Grafana dashboard screenshots]

Resetting the DB works, but after some time it has missing messages again.

@flyingrabbit-lab changed the title from "bug (hubble):" to "bug (hubble): Hub out of sync" on Sep 25, 2024
@github-actions bot added the s-triage (Needs to be reviewed, designed and prioritized) label on Sep 25, 2024
@AX1S99 commented Oct 1, 2024

I think it is because of logs. The node stores 2 gp logs, so when it is showing 100%, maybe it is considering the success logs.

@flyingrabbit-lab (Author)

> I think it is because of logs. The node stores 2 gp logs, so when it is showing 100%, maybe it is considering the success logs.

What can I do about it so that it doesn't get out of sync?

@Z3R013x commented Oct 3, 2024

Exactly the same bug on my side too... it was working just fine. A few days ago I had to reinstall the OS on the server, and after installing the node via the script that runs it via Docker, this annoying bug occurs. I tried resetting the DB, pruning Docker and all volumes, and installing the node again, then reinstalled the OS again; nothing works :/ Even on a fresh install I get 1-2 peers, and after some time I get 0 peers and no gossip.

@alexchenzl

> Exactly the same bug on my side too... it was working just fine. A few days ago I had to reinstall the OS on the server, and after installing the node via the script that runs it via Docker, this annoying bug occurs. I tried resetting the DB, pruning Docker and all volumes, and installing the node again, then reinstalled the OS again; nothing works :/ Even on a fresh install I get 1-2 peers, and after some time I get 0 peers and no gossip.

Same issue happens on my server.

@0x330a (Contributor) commented Nov 7, 2024

Same thing happens to me. Logs show errors when trying to connect to all bootstrap peers.

@sds (Member) commented Nov 7, 2024

Could you share some example error messages?

In the original screenshot for this issue (the middle of the Grafana dashboard) it shows there were no inbound gossip connections. You can't discover peers if they can't communicate with you.

This is likely an error with your local or cloud provider networking, as we are not seeing any issues with our production hubs.

Also, please make sure you are running the latest hub version.
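
One quick way to check both the version and the sync state is to query the hub's HTTP API. The following is just a minimal sketch, assuming the default HTTP API port (2281) and the /v1/info endpoint; adjust the host to wherever your hub is reachable:

```ts
// Rough sketch (not official tooling): query the hub's HTTP API to confirm
// the running version and sync state. Assumes the default HTTP API port 2281
// and the /v1/info endpoint; change HUB_HOST to match your deployment.
const HUB_HOST = process.env.HUB_HOST ?? "127.0.0.1";

async function checkHubInfo(): Promise<void> {
  const res = await fetch(`http://${HUB_HOST}:2281/v1/info?dbstats=1`);
  if (!res.ok) {
    throw new Error(`Hub info request failed: ${res.status} ${res.statusText}`);
  }
  const info = await res.json();
  // The response includes fields such as version, isSyncing, and dbStats.
  console.log("Hub version:", info.version);
  console.log("Still syncing:", info.isSyncing);
  console.log("DB stats:", info.dbStats);
}

checkHubInfo().catch((err) => {
  console.error("Could not reach the hub HTTP API:", err);
  process.exit(1);
});
```

If the reported version is behind the latest release, or isSyncing never becomes false, that is worth addressing before digging into networking.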

@0x330a (Contributor) commented Nov 7, 2024

Running 1.16.2, trying to figure out what's going wrong or if it just needs more time.

Bootstrap peers can be discovered and sometimes connected to, but I never receive inbound connections (probably fine), and after a few hours the connected peers seem to drop to 0; all message gossip seems to stop after that point.

The log messages also say that it doesn't run the sync health job because the first sync hasn't been completed.

Going to leave it running to see if more time helps.

@sds added the s-noop (Cannot be worked on) label and removed the s-triage (Needs to be reviewed, designed and prioritized) label on Nov 7, 2024
@sds (Member) commented Nov 7, 2024

If peers can't connect to you, they can't sync. Incoming gossip not working (the Grafana dashboard error above) indicates that your hub is not reachable on the public internet, likely because it's behind a NAT gateway.

We highly recommend avoiding running a hub yourself and using a provider (such as Neynar) to expose a hub API for you.

I'm going to close this, but if you have more information that changes the story, let's open a new ticket and discuss there. Thank you!

@sds closed this as completed on Nov 7, 2024
@Z3R013x commented Nov 7, 2024

The TCP ports required for the server are reachable; I can reach the open ports externally without any issue. New nodes are experiencing the given issue. Mine was also working fine before I reinstalled it; that's why you don't see the issue on your production hubs.

@sds

@sds (Member) commented Nov 7, 2024

I just redeployed our production hubs. We're not seeing this issue. There must be something specific about the way you're running your hubs.

Remember, both ports 2282 and 2283 need to be reachable from the public internet. It's possible your provider might be blocking traffic, or something else unexpected outside of Hubble is occurring.
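
A simple way to sanity-check that is to test those two ports from a machine outside your network. A minimal TypeScript sketch (the IP address is a placeholder; substitute your hub's public IP):

```ts
// Rough sketch (not official tooling): run this from a machine OUTSIDE your
// network to check that the hub's two required ports (2282 and 2283 in a
// standard Hubble setup) accept TCP connections. HUB_IP is a placeholder.
import net from "node:net";

const HUB_IP = process.env.HUB_IP ?? "203.0.113.10"; // placeholder address

function checkPort(host: string, port: number, timeoutMs = 5000): Promise<boolean> {
  return new Promise((resolve) => {
    const socket = net.createConnection({ host, port });
    const finish = (ok: boolean) => {
      socket.destroy();
      resolve(ok);
    };
    socket.setTimeout(timeoutMs);
    socket.once("connect", () => finish(true));
    socket.once("timeout", () => finish(false));
    socket.once("error", () => finish(false));
  });
}

(async () => {
  for (const port of [2282, 2283]) {
    const reachable = await checkPort(HUB_IP, port);
    console.log(`${HUB_IP}:${port} is ${reachable ? "reachable" : "NOT reachable"}`);
  }
})();
```

If either port is unreachable from the outside while the hub is listening locally, the problem sits in NAT, firewall, or provider rules rather than in Hubble itself.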

If you'd like to receive more support on this, open a new ticket and:

  • Provide clear error logs demonstrating the problem
  • Provide proof that your hub is reachable externally (you can send me an IP address via direct cast on Warpcast if you like so I can independently verify it's open)
  • Include the configuration you are using. Are you using the hubble.sh script? Which OS are you using?

We strongly suggest running hubs via a provider like Neynar for just this reason. Hubs are sometimes challenging to run, and given our current efforts to migrate to Snapchain, we likely aren't going to be making much investment in the current hub implementation until after that migration is complete, since Snapchain solves many known problems (mostly performance-related) with sync.

I'm going to lock this thread. Thank you for your understanding.

@farcasterxyz locked as resolved and limited conversation to collaborators on Nov 7, 2024