-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move scheduling logic to the mlpop lua command #14
base: skroutz
Are you sure you want to change the base?
Conversation
35421dd
to
89b4f03
Compare
# Pops a job off a set of queues. Queue names should be strings. | ||
# | ||
# Returns an array of [queue, Ruby object] or nil. | ||
def pop_with_queue(*queues) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a newly introduced method, we could use the plural in the name. We could also use from
instead of with
.
def pop_with_queue(*queues) | |
def pop_from_queues(*queues) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's different with pop_with_queue()
is that it returns a pair of [queue, job]
instead of just the job
like pop()
does, thus the _with_queue
suffix. It's not that clear though.
end | ||
|
||
def pop_from_queue(*queues) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] We could use the plural here for queues
and implement the pop_from_queue
calling this one to avoid breaking the api.
def pop_from_queue(*queues) | |
def pop_from_queues(*queues) | |
... | |
end | |
def pop_from_queue(queue) | |
pop_from_queues(queue) | |
end |
lib/resque/data_store.rb
Outdated
# Pop whatever is on queue | ||
def pop_from_queue(queue) | ||
@redis.lpop(redis_key_for_queue(queue)) | ||
# MLPOP key [key ..] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[super minor] missing a . in [κey ..]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall nice approach, have we benchmarked this against the multiple lpop
s from the client?
@@ -271,11 +271,23 @@ def push(queue, item) | |||
data_store.push_to_queue(queue,encode(item)) | |||
end | |||
|
|||
# Pops a job off a queue. Queue name should be a string. | |||
# Pops a job off a set of queues. Queue names should be a string. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Queue names should be a string.
That comment seems untrue, your test case below uses symbols, so let's drop it for sanity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, I have adjusted the original comment. We could change it.
# is restarted) reload it. | ||
if e.message.include?("NOSCRIPT") | ||
mlpop_sha! | ||
retry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add some backoff here, failovers take some time to be completed.
Also a timeout (and/or a fallback to the old scheduling) would be nice to avoid busy loops in case of a fatal error (e.g. all redises are dead) here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The backoff logic in case of a server restart is the responsibility of the redis client.
As far as mlpop
is concerned, it gets a NOSCRIPT
error and tries to reload the script. If there is a connectivity issue (e.g. a server restart) the client should kick in, backoff and possibly retry the command internally.
We could add a hidden retry
argument that decrements itself to guard against busy loops, by I am sure it's needed in this case.
# Pops a job off a set of queues. Queue names should be strings. | ||
# | ||
# Returns an array of [queue, Ruby object] or nil. | ||
def pop_with_queue(*queues) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Do we pass the queues
list ordered from resquepool here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, keeping the queue list order is critical.
It's not so easy to get exact metrics, it depends on which queues are non-empty at that specific moment. The expected speedup is also a function of the roundtrip between the worker and the server, for our case the gain would be around 0.7ms for each worker poll. A test run in our own environment (switching lpop with llen)
|
Resque finds jobs by iterating on a list of queues, for each queue it tries to lpop a job until successful. This patch moves that logic to the server by implementing mlpop, that approach reduces the required round-trips especially for workers watching a large set of queues. The API is not altered, we keep the same signature while accepting variadic arguments.
Co-authored-by: Lazarus Lazaridis <9477868+iridakos@users.noreply.github.com>
Travis builds are working now. |
Resque finds jobs by iterating on a list of queues, for each queue it
tries to lpop a job until successful.
This patch moves that logic to the server by implementing mlpop,
that approach reduces the required round-trips especially for workers
watching a large set of queues.
The API is not altered, we keep the same signature while accepting
variadic arguments.