Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase max resolution for jitter from 1min to 2min #5862

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

saponifi3d
Copy link

@saponifi3d saponifi3d commented May 3, 2024

Description

The goal of this PR is to enable the jitter code path here:

if resolution > settings.MAX_RESOLUTION_FOR_JITTER:
if timestamp % resolution == 0:
self.__count += 1
self.__count_max_resolution += 1
return ScheduledSubscriptionTask(
datetime.fromtimestamp(timestamp), subscription_with_metadata
)
else:
return None
jitter = subscription.identifier.uuid.int % resolution
if timestamp % resolution == jitter:
self.__count += 1
return ScheduledSubscriptionTask(
datetime.fromtimestamp(timestamp - jitter), subscription_with_metadata
)
else:
return None
for queries with a resolution up to 30 minutes.

Here's what the load profile on metric alert subscriptions looks like that we're trying to fix: https://app.datadoghq.com/dashboard/byg-ici-a72/snuba-subscriptions?fromUser=true&fullscreen_end_ts=1714765439441&fullscreen_paused=false&fullscreen_refresh_mode=sliding&fullscreen_section=overview&fullscreen_start_ts=1714751039441&fullscreen_widget=3556554584510136&tpl_var_consumer_group%5B0%5D=*&tpl_var_sentry_region%5B0%5D=us&view=spans&from_ts=1714750247173&to_ts=1714764647173&live=true

Technical Spec: https://www.notion.so/sentry/Metric-Alert-Load-Shedding-cba94dd0fa434f3bb56a23f920586eb2

@saponifi3d saponifi3d requested a review from a team as a code owner May 3, 2024 21:25
@saponifi3d saponifi3d requested a review from wedamija May 3, 2024 21:26
@wedamija
Copy link
Member

wedamija commented May 3, 2024

We can have up to 30 minute resolutions when using comparison intervals, so maybe bump to 30?

@saponifi3d saponifi3d changed the title Increase max resolution for jitter from 1min to 15min Increase max resolution for jitter from 1min to 30min May 3, 2024
Copy link

codecov bot commented May 3, 2024

Test Failures Detected: Due to failing tests, we cannot provide coverage reports at this time.

❌ Failed Test Results:

Completed 640 tests with 1 failed, 638 passed and 1 skipped.

View the full list of failed tests
Test Description Failure message
Testsuite:
pytest

Test name:
tests.clickhouse.optimize.test_optimize_scheduler::test_get_next_schedule[non parallel]

Envs:
- default
Traceback (most recent call last):
File ".../clickhouse/optimize/test_optimize_scheduler.py", line 254, in test_get_next_schedule
assert optimize_scheduler.get_next_schedule(partitions) == expected
AssertionError: assert OptimizationSchedule(partitions=[["(90,'2022-06-27')",\n "(90,'2022-06-20')",\n "(90,'2022-06-13')",\n "(90,'2022-06-08')"]],\n cutoff_time=datetime.datetime(2024, 5, 5, 0, 0),\n start_time_jitter_minutes=None) == OptimizationSchedule(partitions=[["(90,'2022-06-27')",\n "(90,'2022-06-20')",\n "(90,'2022-06-13')",\n "(90,'2022-06-08')"]],\n cutoff_time=datetime.datetime(2024, 5, 4, 0, 0),\n start_time_jitter_minutes=None)

Matching attributes:
['partitions', 'start_time_jitter_minutes']
Differing attributes:
['cutoff_time']

Drill down into differing attribute cutoff_time:
cutoff_time: datetime.datetime(2024, 5, 5, 0, 0) != datetime.datetime(2024, 5, 4, 0, 0)

@saponifi3d saponifi3d changed the title Increase max resolution for jitter from 1min to 30min Increase max resolution for jitter from 1min to 2min May 3, 2024
@saponifi3d
Copy link
Author

Updated the max resolution based on this convo: https://sentry.slack.com/archives/CLTE78L73/p1714771659170539 -- will see how 2min affects things and increase to size.

I'm still looking into the test failures.

@evanh evanh requested a review from ayirr7 May 6, 2024 14:40
@MeredithAnya
Copy link
Member

@saponifi3d @wedamija is this PR still relevant?

@saponifi3d
Copy link
Author

saponifi3d commented Dec 9, 2024

@MeredithAnya i think so, we are still seeing some sawtooth in the graph for processing, this should help smooth when we execute snuba subscriptions

I can try to revive this PR and get it green, if it looks okay otherwise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants