Performance: Custom properties for tuning for Formplayer service Hikari DB connection pool #6256

Merged — 1 commit, Apr 10, 2024

Conversation

@snopoke (Contributor) commented on Apr 9, 2024

Manual review of long-request timings from APM surfaced very out-of-spec request timings for what should be a nearly constant-time cache check.

Further investigation confirmed that the actual DB lookups complete at expected latencies (~10 ms), but that under certain load conditions the DB connection pool starves prematurely, before server load is affected (likely exacerbated by occasionally bursty request load per machine; note the asymmetry in backlogging below).

This change increases the availability of the connection pool at peak and reduces reserved connections. After deployment we will monitor the same metrics to confirm that we are no longer developing backlogs for low-latency timings.

This change won't impact any average baseline request timings, but should eliminate a significant source of variance in otherwise performant requests.

Further details in change request: https://dimagi.atlassian.net/browse/SAAS-15447

New DB connection pool configuration for production:

- `maximum-pool-size`: changed from the default of 10 to 20 to better handle peak usage
- `minimum-idle`: changed from the default of 10 to 5 to reduce off-peak DB connections and connections held by instances receiving fewer requests

Config options docs: https://github.com/brettwooldridge/HikariCP?tab=readme-ov-file#gear-configuration-knobs-baby
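The two settings above map onto HikariCP's standard configuration knobs; a minimal sketch of what the production override might look like, assuming Formplayer exposes them via the usual Spring Boot property prefix (the exact prefix and config file location in Formplayer are assumptions here):

```properties
# Sketch only: assumes the standard Spring Boot Hikari property prefix;
# Formplayer's actual configuration mechanism may differ.

# Allow up to 20 connections at peak (HikariCP default: 10)
spring.datasource.hikari.maximum-pool-size=20

# Keep only 5 idle connections off-peak
# (HikariCP default: same as maximum-pool-size)
spring.datasource.hikari.minimum-idle=5
```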

Environments Affected

Production

@gherceg (Contributor) commented on Apr 9, 2024

Can you talk more about what sparked the decrease in `minimum-idle`? Based on the docs:

> for maximum performance and responsiveness to spike demands, we recommend not setting this value and instead allowing HikariCP to act as a fixed size connection pool. Default: same as maximumPoolSize

I can understand that it is a waste to maintain unused DB connections when usage is lower, but if the goal is to better handle peak usage, why decrease the minimum number of idle connections?

@snopoke (Contributor, Author) commented on Apr 9, 2024

> Can you talk more about what sparked the decrease in `minimum-idle`? Based on the docs:
>
> > for maximum performance and responsiveness to spike demands, we recommend not setting this value and instead allowing HikariCP to act as a fixed size connection pool. Default: same as maximumPoolSize
>
> I can understand that it is a waste to maintain unused DB connections when usage is lower, but if the goal is to better handle peak usage, why decrease the minimum number of idle connections?

The guidance is aimed at applications requiring very low latency. For Formplayer I do not expect the occasional overhead of creating a new connection to be an issue. New connections remain in the pool until they have been idle for 10 minutes, so during peak there should only be about 10 requests that bear the cost of creating a new connection.

I'm also cautious about increasing the total number of connections across all 20 FP machines in prod. Traffic is not spread evenly across the nodes, so it doesn't make sense to have all of them hold the maximum number of connections open, which could impact the DB's performance since each connection gets memory assigned to it.
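The fleet-wide trade-off described above can be sketched as a back-of-envelope calculation (machine count taken from the comment; per-connection memory cost not modeled, and these are illustrative totals, not measured values):

```python
MACHINES = 20  # FP machines in prod, per the comment above

# Old defaults: Hikari acts as a fixed pool of 10 per machine,
# so 200 connections are held fleet-wide even off-peak.
old_floor = MACHINES * 10

# New settings: minimum-idle=5 lowers the off-peak floor to 100,
# while maximum-pool-size=20 raises the ceiling to 400 -- but the
# ceiling is only reached if every node peaks simultaneously,
# which uneven traffic distribution makes unlikely.
new_floor = MACHINES * 5
new_ceiling = MACHINES * 20

print(old_floor, new_floor, new_ceiling)  # 200 100 400
```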

@snopoke snopoke merged commit a28c95f into master Apr 10, 2024
2 checks passed
@snopoke snopoke deleted the sk/formplayer-custom-props branch April 10, 2024 07:51
@ctsims ctsims changed the title add custom properties to fomplayer config for DB connection pool Performance: Custom properties for tuning for Formplayer service Hikari DB connection pool Apr 18, 2024