Router API cannot connect to Mongo 2.6 #533

huwd · 2021-12-03T16:46:47Z

While looking into how publishing-api tries to put things onto RabbitMQ, we hit a problem further down the chain of services, which we've traced as:
Publishing API -> Content Store -> Router API -> Router

The problem seems to be that router-api cannot find a MongoDB server.

To replicate:

➜  router-api git:(main) govuk-docker-run bundle exec rails c
docker-compose -f [...] run router-api-lite bundle exec rails c
Creating govuk-docker_router-api-lite_run ... done
Loading development environment (Rails 6.0.3.7)
irb(main):001:0> Route.count
Traceback (most recent call last):
        1: from (irb):1
Mongo::Error::NoServerAvailable (No primary server is available in cluster: #<Cluster topology=Unknown[mongo-2.6:27017] servers=[#<Server address=mongo-2.6:27017 GHOST>]> with timeout=30, LT=0.015)

The mongo container does run, and you can watch its logs, though it is stuck in a big loop of opening and closing connections, punctuated by the following fairly suspect message:

2021-12-03T16:06:57.809+0000 [rsStart] warning: getaddrinfo("48703775aaf0") failed: Name or service not known
2021-12-03T16:06:57.846+0000 [rsStart] getaddrinfo("48703775aaf0") failed: Name or service not known
2021-12-03T16:06:57.846+0000 [rsStart] replSet info Couldn't load config yet. Sleeping 20sec and will try again.
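The `getaddrinfo` failures above suggest the replica set config refers to a hostname that cannot be resolved. A minimal sketch for checking that from a Ruby console follows; `resolvable?` and its `resolver:` parameter are hypothetical names for illustration, not part of router-api or the Mongo driver.

```ruby
require "socket"

# Hypothetical helper: true if `hostname` resolves from where this code runs.
# The resolver is injectable so the check can be exercised without real DNS.
def resolvable?(hostname, resolver: ->(name) { Addrinfo.getaddrinfo(name, nil) })
  !resolver.call(hostname).empty?
rescue SocketError
  # Raised for failures such as "Name or service not known".
  false
end
```

Calling `resolvable?("48703775aaf0")` inside the container would indicate whether the name from the log lines resolves there.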

@kevindew spotted that if we comment out this line things start working again.

That line seems to have been introduced during work to resolve differences in how rs.status responds between Mongo v2.6 (which router runs in prod) and more modern versions.

#499

This may have been an attempt to resolve this issue: alphagov/router#210

Questions to answer: what was L46 trying to resolve? Does it still serve that purpose? Can we replace it with something that doesn't block local dev, or remove it altogether?


karlbaker02 · 2021-12-03T16:59:53Z

L46 is necessary as we have been running MongoDB as a replica set since around April 2021, in order to enable the app to be replatformed. Previously, Router API knew about all running Router instances and would, upon a request to update a route, update said route and then call the /reload endpoint on each and every Router instance in order to ensure each instance's routes were up-to-date.

Replatforming changed this behaviour. Instead of Router API needing to know about individual Router instances (hardcoded instances, which was not translatable into the Kubernetes world into which we're now moving), Router instances now poll MongoDB for any new changes every few seconds. We enabled this through the use of a replica set and the db.stats() method: an instance determines whether it has an up-to-date copy of the current routes from MongoDB by comparing the current optime to its cached optime, and reloads if changes have occurred.
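The polling approach described above can be sketched roughly as follows. This is an illustration only, not Router's actual code: `RouteReloader`, `fetch_optime` and `reload` are hypothetical names, and how the optime is read from MongoDB is deliberately left to the caller.

```ruby
# Hypothetical poller: reloads routes only when the optime has changed.
class RouteReloader
  # fetch_optime: callable returning the database's current optime (assumed)
  # reload:       callable that reloads the in-memory routes (assumed)
  def initialize(fetch_optime:, reload:)
    @fetch_optime = fetch_optime
    @reload = reload
    @cached_optime = nil
  end

  # Runs one poll. Returns true if a reload happened, false otherwise.
  def poll_once
    current = @fetch_optime.call
    return false if current == @cached_optime

    @reload.call
    @cached_optime = current
    true
  end
end
```

In a real process, `poll_once` would be called in a loop with a sleep of a few seconds between calls.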

huwd self-assigned this Dec 3, 2021

karlbaker02 mentioned this issue Dec 15, 2021

Update MongoDB to work for Router #549

Merged
