Skip to content

Commit

Permalink
Push rescuer down into internal/maintenance + JobRow to rivertype
Browse files Browse the repository at this point in the history
Backstory: We've made some attempt to modularize River so that it's
entire implementation doesn't sit in one top-level package. We had some
success with this approach (see `internal/`), but after making some
progress we eventually hit a brick wall with most of the core types, and
even regressed as one of the queue maintainers (rescuer) had to live in
the top-level package so it could access worker-related information.

Modularity aside: I realized the other day that we also have another
problem on the horizon. We have a new driver system, but current drivers
like `riverpgxv5` cheat by implementing a trivial wrapper around pgx.
Long term, these drives are going to have to look more like how
`DBAdapter` looks today. They'll have functions like `JobInsert` and
`JobComplete` specific to the particular driver implementation, and
those functions will need to take and return types representing things
like insert parameters and job return values. It might be possible to
use types from `river` for this, but we'd have to invert the dependency
requirement from `river` -> `riverpgxv5` and change a lot of tests.
Being able to depend on a shared submodule (which `rivertype` could
become) would be a lot easier.

 ## Approach

I wanted to try and fix that and pursue further modularization, but it
wasn't easy. I tried every trick in the book to cast types from
something internal to public-facing etc., but as implemented, it's
either not possible, or only possible with considerable downsides (see
bottom).

However, there is a change we can make that unlocks everything: `JobRow`
is the core type that needs to be shared amongst everything. If it could
live in a shared package, internal packages get the ability to reference
it and whole puzzle unjumbles into crystal clarity.

Previously, I'd convinced you that we could put core types into a
separate `rivertype` package, which we did, and then later undid after
finding a few techniques to share some internally referenced types.

Here, we bring that idea back to some degree, but only for one type:
`JobRow` (and okay, `AttemptError` too). This doesn't produce anywhere
near the same degree of required API adjustments because `JobRow` is
generally something that's returned from functions, and therefore
`rivertype` never needs to be referenced. It's also embedded on `Job`,
so its properties are available there again without importing
`rivertype`. `Job` stays in the top-level package so worker `Work`
signatures are not affected.

Signatures of the error handler and a few other types are affected, but
it's certainly the less common case.

With that move in place we go on to demonstrate that further
encapsulation becomes possible. We move the new work unit primitives
into a shared internal package, and then make use of them in the rescuer
so that it can decouple itself from the concept of a top-level worker.
With that done, we can push it into `internal/maintenance` where it was
supposed to be originally.

Another benefit: when introducing `rivertest`, we had to duplicate the
`jobRowFromInternal` function because of the same type problems
mentioned above. With `JobRow` not in a shared package, we can push this
helper into `dbsqlc` and have it shared amongst `river` and `rivertest`,
eliminating the necessity for this extra function.

And from here all the other pieces can fall too: job executor, producer,
etc. could all be encapsulated separately if we'd like them to be.

 ## Alternatives considered

 ### Struct conversion by removing `AttemptError`

Go allows struct to easily be converted from one to another so you could
potentially have `river.JobRow(internalJobRow)`, but the limitation is
that you can't do it as soon as substructs (e.g. `AttempError`) are
introduced, even if the substructs are also identical to each other.

One thing that we could potentially do here is change `AttemptError`
back to a plain set of `[]byte`s and add an unmarshaling helper like
`UnmarshalAttemptErrors() []AttemptError`. I think it'd work, but also
has downsides of its own:

* Prevents us from having adding other substructs to the job row in the
  future.

* Caching unmarshaled attempt errors is difficult because if either
  struct picked up an unexported member, it can no longer be converted
  to the other.

* The conversions and transform wrappers that'd need to happen to pass
  back and forth with interfaces like `ErrorHandler` would be tricky and
  add probably cruft.

 ### Use of unsafe

Another trick you can do similar to the above: if you're sure two
structs really are identical, you can convert one to the other with
`unsafe.Pointer`, even if they include substructs like `AttemptError`.

I made a prototype of this and it definitely works, so I really
considered this as a good possible way forward.

However, use of the `unsafe` package is frowned upon, and it's also
banned from some hosted providers like Google App Engine, which seems
bad for our use case.
  • Loading branch information
brandur committed Nov 20, 2023
1 parent b35571f commit f531209
Show file tree
Hide file tree
Showing 22 changed files with 450 additions and 439 deletions.
31 changes: 19 additions & 12 deletions client.go
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,9 @@ import (
"github.com/riverqueue/river/internal/rivercommon"
"github.com/riverqueue/river/internal/util/sliceutil"
"github.com/riverqueue/river/internal/util/valutil"
"github.com/riverqueue/river/internal/workunit"
"github.com/riverqueue/river/riverdriver"
"github.com/riverqueue/river/rivertype"
)

const (
Expand Down Expand Up @@ -286,7 +288,7 @@ type clientTestSignals struct {
jobCleaner *maintenance.JobCleanerTestSignals
periodicJobEnqueuer *maintenance.PeriodicJobEnqueuerTestSignals
reindexer *maintenance.ReindexerTestSignals
rescuer *rescuerTestSignals
rescuer *maintenance.RescuerTestSignals
scheduler *maintenance.SchedulerTestSignals
}

Expand Down Expand Up @@ -366,9 +368,9 @@ func NewClient[TTx any](driver riverdriver.Driver[TTx], config *Config) (*Client
// For convenience, in case the user's specified a large JobTimeout but no
// RescueStuckJobsAfter, since RescueStuckJobsAfter must be greater than
// JobTimeout, set a reasonable default value that's longer thah JobTimeout.
rescueAfter := defaultRescueAfter
rescueAfter := maintenance.DefaultRescueAfter
if config.JobTimeout > 0 && config.RescueStuckJobsAfter < 1 && config.JobTimeout > config.RescueStuckJobsAfter {
rescueAfter = config.JobTimeout + defaultRescueAfter
rescueAfter = config.JobTimeout + maintenance.DefaultRescueAfter
}

// Create a new version of config with defaults filled in. This replaces the
Expand Down Expand Up @@ -512,10 +514,15 @@ func NewClient[TTx any](driver riverdriver.Driver[TTx], config *Config) (*Client
}

{
rescuer := newRescuer(archetype, &rescuerConfig{
rescuer := maintenance.NewRescuer(archetype, &maintenance.RescuerConfig{
ClientRetryPolicy: retryPolicy,
RescueAfter: config.RescueStuckJobsAfter,
Workers: config.Workers,
WorkUnitFactoryFunc: func(kind string) workunit.WorkUnitFactory {
if workerInfo, ok := config.Workers.workersMap[kind]; ok {
return workerInfo.workUnitFactory
}
return nil
},
}, driver.GetDBPool())
maintenanceServices = append(maintenanceServices, rescuer)
client.testSignals.rescuer = &rescuer.TestSignals
Expand Down Expand Up @@ -758,7 +765,7 @@ func (c *Client[TTx]) Subscribe(kinds ...EventKind) (<-chan *Event, func()) {
}

// Distribute a single job into any listening subscriber channels.
func (c *Client[TTx]) distributeJob(job *JobRow, stats *JobStatistics) {
func (c *Client[TTx]) distributeJob(job *rivertype.JobRow, stats *JobStatistics) {
c.subscriptionsMu.Lock()
defer c.subscriptionsMu.Unlock()

Expand Down Expand Up @@ -809,7 +816,7 @@ func (c *Client[TTx]) distributeJobCompleterCallback(update jobcompleter.Complet
c.statsNumJobs++
}()

c.distributeJob(jobRowFromInternal(update.Job), jobStatisticsFromInternal(update.JobStats))
c.distributeJob(dbsqlc.JobRowFromInternal(update.Job), jobStatisticsFromInternal(update.JobStats))
}

// Dump aggregate stats from job completions to logs periodically. These
Expand Down Expand Up @@ -963,7 +970,7 @@ func insertParamsFromArgsAndOptions(args JobArgs, insertOpts *InsertOpts) (*dbad
insertParams.UniqueByArgs = uniqueOpts.ByArgs
insertParams.UniqueByQueue = uniqueOpts.ByQueue
insertParams.UniqueByPeriod = uniqueOpts.ByPeriod
insertParams.UniqueByState = sliceutil.Map(uniqueOpts.ByState, func(s JobState) dbsqlc.JobState { return dbsqlc.JobState(s) })
insertParams.UniqueByState = sliceutil.Map(uniqueOpts.ByState, func(s rivertype.JobState) dbsqlc.JobState { return dbsqlc.JobState(s) })
}

if !insertOpts.ScheduledAt.IsZero() {
Expand All @@ -986,7 +993,7 @@ var errInsertNoDriverDBPool = fmt.Errorf("driver must have non-nil database pool
// if err != nil {
// // handle error
// }
func (c *Client[TTx]) Insert(ctx context.Context, args JobArgs, opts *InsertOpts) (*JobRow, error) {
func (c *Client[TTx]) Insert(ctx context.Context, args JobArgs, opts *InsertOpts) (*rivertype.JobRow, error) {
if c.driver.GetDBPool() == nil {
return nil, errInsertNoDriverDBPool
}
Expand All @@ -1005,7 +1012,7 @@ func (c *Client[TTx]) Insert(ctx context.Context, args JobArgs, opts *InsertOpts
return nil, err
}

return jobRowFromInternal(res.Job), nil
return dbsqlc.JobRowFromInternal(res.Job), nil
}

// InsertTx inserts a new job with the provided args on the given transaction.
Expand All @@ -1022,7 +1029,7 @@ func (c *Client[TTx]) Insert(ctx context.Context, args JobArgs, opts *InsertOpts
// This variant lets a caller insert jobs atomically alongside other database
// changes. An inserted job isn't visible to be worked until the transaction
// commits, and if the transaction rolls back, so too is the inserted job.
func (c *Client[TTx]) InsertTx(ctx context.Context, tx TTx, args JobArgs, opts *InsertOpts) (*JobRow, error) {
func (c *Client[TTx]) InsertTx(ctx context.Context, tx TTx, args JobArgs, opts *InsertOpts) (*rivertype.JobRow, error) {
if err := c.validateJobArgs(args); err != nil {
return nil, err
}
Expand All @@ -1037,7 +1044,7 @@ func (c *Client[TTx]) InsertTx(ctx context.Context, tx TTx, args JobArgs, opts *
return nil, err
}

return jobRowFromInternal(res.Job), nil
return dbsqlc.JobRowFromInternal(res.Job), nil
}

// InsertManyParams encapsulates a single job combined with insert options for
Expand Down
39 changes: 20 additions & 19 deletions client_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ import (
"github.com/riverqueue/river/internal/util/sliceutil"
"github.com/riverqueue/river/internal/util/valutil"
"github.com/riverqueue/river/riverdriver/riverpgxv5"
"github.com/riverqueue/river/rivertype"
)

func waitForClientHealthy(ctx context.Context, t *testing.T, statusUpdateCh <-chan componentstatus.ClientSnapshot) {
Expand Down Expand Up @@ -1062,7 +1063,7 @@ func Test_Client_ErrorHandler(t *testing.T) {
return client, &testBundle{SubscribeChan: subscribeChan}
}

requireInsert := func(ctx context.Context, client *Client[pgx.Tx]) *JobRow {
requireInsert := func(ctx context.Context, client *Client[pgx.Tx]) *rivertype.JobRow {
job, err := client.Insert(ctx, callbackArgs{}, nil)
require.NoError(t, err)
return job
Expand All @@ -1078,7 +1079,7 @@ func Test_Client_ErrorHandler(t *testing.T) {

var errorHandlerCalled bool
config.ErrorHandler = &testErrorHandler{
HandleErrorFunc: func(ctx context.Context, job *JobRow, err error) *ErrorHandlerResult {
HandleErrorFunc: func(ctx context.Context, job *rivertype.JobRow, err error) *ErrorHandlerResult {
require.Equal(t, handlerErr, err)
errorHandlerCalled = true
return &ErrorHandlerResult{}
Expand All @@ -1100,7 +1101,7 @@ func Test_Client_ErrorHandler(t *testing.T) {

var errorHandlerCalled bool
config.ErrorHandler = &testErrorHandler{
HandleErrorFunc: func(ctx context.Context, job *JobRow, err error) *ErrorHandlerResult {
HandleErrorFunc: func(ctx context.Context, job *rivertype.JobRow, err error) *ErrorHandlerResult {
var unknownJobKindErr *UnknownJobKindError
require.ErrorAs(t, err, &unknownJobKindErr)
require.Equal(t, *unknownJobKindErr, UnknownJobKindError{Kind: "RandomWorkerNameThatIsNeverRegistered"})
Expand Down Expand Up @@ -1132,7 +1133,7 @@ func Test_Client_ErrorHandler(t *testing.T) {

var panicHandlerCalled bool
config.ErrorHandler = &testErrorHandler{
HandlePanicFunc: func(ctx context.Context, job *JobRow, panicVal any) *ErrorHandlerResult {
HandlePanicFunc: func(ctx context.Context, job *rivertype.JobRow, panicVal any) *ErrorHandlerResult {
require.Equal(t, "panic val", panicVal)
panicHandlerCalled = true
return &ErrorHandlerResult{}
Expand Down Expand Up @@ -1179,7 +1180,7 @@ func Test_Client_Maintenance(t *testing.T) {
errorsBytes := make([][]byte, errorCount)
for i := 0; i < errorCount; i++ {
var err error
errorsBytes[i], err = json.Marshal(AttemptError{
errorsBytes[i], err = json.Marshal(rivertype.AttemptError{
At: time.Now(),
Error: "mocked error",
Num: i + 1,
Expand Down Expand Up @@ -1359,7 +1360,7 @@ func Test_Client_Maintenance(t *testing.T) {
startClient(ctx, t, client)

client.testSignals.electedLeader.WaitOrTimeout()
svc := maintenance.GetService[*rescuer](client.queueMaintainer)
svc := maintenance.GetService[*maintenance.Rescuer](client.queueMaintainer)
svc.TestSignals.FetchedBatch.WaitOrTimeout()
svc.TestSignals.UpdatedBatch.WaitOrTimeout()

Expand Down Expand Up @@ -1449,7 +1450,7 @@ func Test_Client_RetryPolicy(t *testing.T) {

ctx := context.Background()

requireInsert := func(ctx context.Context, client *Client[pgx.Tx]) *JobRow {
requireInsert := func(ctx context.Context, client *Client[pgx.Tx]) *rivertype.JobRow {
job, err := client.Insert(ctx, callbackArgs{}, nil)
require.NoError(t, err)
return job
Expand Down Expand Up @@ -1527,7 +1528,7 @@ func Test_Client_RetryPolicy(t *testing.T) {
// how it would've looked after being run through the queue.
originalJob.Attempt += 1

expectedNextScheduledAt := client.config.RetryPolicy.NextRetry(jobRowFromInternal(originalJob))
expectedNextScheduledAt := client.config.RetryPolicy.NextRetry(dbsqlc.JobRowFromInternal(originalJob))

t.Logf("Attempt number %d scheduled %v from original `attempted_at`",
originalJob.Attempt, finishedJob.ScheduledAt.Sub(*originalJob.AttemptedAt))
Expand Down Expand Up @@ -1571,7 +1572,7 @@ func Test_Client_Subscribe(t *testing.T) {
})
}

requireInsert := func(ctx context.Context, client *Client[pgx.Tx], jobName string) *JobRow {
requireInsert := func(ctx context.Context, client *Client[pgx.Tx], jobName string) *rivertype.JobRow {
job, err := client.Insert(ctx, callbackArgs{Name: jobName}, nil)
require.NoError(t, err)
return job
Expand Down Expand Up @@ -1599,7 +1600,7 @@ func Test_Client_Subscribe(t *testing.T) {
jobFailed1 := requireInsert(ctx, client, "failed1")
jobFailed2 := requireInsert(ctx, client, "failed2")

expectedJobs := []*JobRow{
expectedJobs := []*rivertype.JobRow{
jobCompleted1,
jobCompleted2,
jobFailed1,
Expand Down Expand Up @@ -1663,7 +1664,7 @@ func Test_Client_Subscribe(t *testing.T) {
jobCompleted := requireInsert(ctx, client, "completed1")
requireInsert(ctx, client, "failed1")

expectedJobs := []*JobRow{
expectedJobs := []*rivertype.JobRow{
jobCompleted,
}

Expand Down Expand Up @@ -1704,7 +1705,7 @@ func Test_Client_Subscribe(t *testing.T) {
requireInsert(ctx, client, "completed1")
jobFailed := requireInsert(ctx, client, "failed1")

expectedJobs := []*JobRow{
expectedJobs := []*rivertype.JobRow{
jobFailed,
}

Expand Down Expand Up @@ -2395,7 +2396,7 @@ func Test_NewClient_Validations(t *testing.T) {
config.JobTimeout = 23 * time.Hour
},
validateResult: func(t *testing.T, client *Client[pgx.Tx]) { //nolint:thelper
require.Equal(t, 23*time.Hour+defaultRescueAfter, client.config.RescueStuckJobsAfter)
require.Equal(t, 23*time.Hour+maintenance.DefaultRescueAfter, client.config.RescueStuckJobsAfter)
},
},
{
Expand Down Expand Up @@ -2740,7 +2741,7 @@ func TestInsert(t *testing.T) {
name string
args noOpArgs
opts *InsertOpts
assert func(t *testing.T, args *noOpArgs, opts *InsertOpts, insertedJob *JobRow)
assert func(t *testing.T, args *noOpArgs, opts *InsertOpts, insertedJob *rivertype.JobRow)
}{
{
name: "all options specified",
Expand All @@ -2752,7 +2753,7 @@ func TestInsert(t *testing.T) {
ScheduledAt: now.Add(time.Hour).In(time.FixedZone("UTC-5", -5*60*60)),
Tags: []string{"tag1", "tag2"},
},
assert: func(t *testing.T, args *noOpArgs, opts *InsertOpts, insertedJob *JobRow) {
assert: func(t *testing.T, args *noOpArgs, opts *InsertOpts, insertedJob *rivertype.JobRow) {
t.Helper()

require := require.New(t)
Expand All @@ -2768,14 +2769,14 @@ func TestInsert(t *testing.T) {
require.Equal(JobStateScheduled, insertedJob.State)
require.Equal("noOp", insertedJob.Kind)
// default state:
require.Equal([]byte("{}"), insertedJob.metadata)
// require.Equal([]byte("{}"), insertedJob.metadata)
},
},
{
name: "all defaults",
args: noOpArgs{Name: "testJob"},
opts: nil,
assert: func(t *testing.T, args *noOpArgs, opts *InsertOpts, insertedJob *JobRow) {
assert: func(t *testing.T, args *noOpArgs, opts *InsertOpts, insertedJob *rivertype.JobRow) {
t.Helper()

require := require.New(t)
Expand All @@ -2790,7 +2791,7 @@ func TestInsert(t *testing.T) {
// Default comes from database now(), and we can't know the exact value:
require.WithinDuration(time.Now(), insertedJob.ScheduledAt, 2*time.Second)
require.Equal([]string{}, insertedJob.Tags)
require.Equal([]byte("{}"), insertedJob.metadata)
// require.Equal([]byte("{}"), insertedJob.metadata)
},
},
}
Expand Down Expand Up @@ -2872,7 +2873,7 @@ func TestUniqueOpts(t *testing.T) {

uniqueOpts := UniqueOpts{
ByPeriod: 24 * time.Hour,
ByState: []JobState{JobStateAvailable, JobStateCompleted},
ByState: []rivertype.JobState{JobStateAvailable, JobStateCompleted},
}

job0, err := client.Insert(ctx, noOpArgs{}, &InsertOpts{
Expand Down
10 changes: 7 additions & 3 deletions error_handler.go
Original file line number Diff line number Diff line change
@@ -1,6 +1,10 @@
package river

import "context"
import (
"context"

"github.com/riverqueue/river/rivertype"
)

// ErrorHandler provides an interface that will be invoked in case of an error
// or panic occurring in the job. This is often useful for logging and exception
Expand All @@ -10,13 +14,13 @@ type ErrorHandler interface {
//
// Context is descended from the one used to start the River client that
// worked the job.
HandleError(ctx context.Context, job *JobRow, err error) *ErrorHandlerResult
HandleError(ctx context.Context, job *rivertype.JobRow, err error) *ErrorHandlerResult

// HandlePanic is invoked in case of a panic occurring in a job.
//
// Context is descended from the one used to start the River client that
// worked the job.
HandlePanic(ctx context.Context, job *JobRow, panicVal any) *ErrorHandlerResult
HandlePanic(ctx context.Context, job *rivertype.JobRow, panicVal any) *ErrorHandlerResult
}

type ErrorHandlerResult struct {
Expand Down
3 changes: 2 additions & 1 deletion event.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@ import (
"time"

"github.com/riverqueue/river/internal/jobstats"
"github.com/riverqueue/river/rivertype"
)

// EventKind is a kind of event to subscribe to from a client.
Expand Down Expand Up @@ -45,7 +46,7 @@ type Event struct {
Kind EventKind

// Job contains job-related information.
Job *JobRow
Job *rivertype.JobRow

// JobStats are statistics about the run of a job.
JobStats *JobStatistics
Expand Down
5 changes: 3 additions & 2 deletions example_error_handler_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -11,16 +11,17 @@ import (
"github.com/riverqueue/river/internal/riverinternaltest"
"github.com/riverqueue/river/internal/util/slogutil"
"github.com/riverqueue/river/riverdriver/riverpgxv5"
"github.com/riverqueue/river/rivertype"
)

type CustomErrorHandler struct{}

func (*CustomErrorHandler) HandleError(ctx context.Context, job *river.JobRow, err error) *river.ErrorHandlerResult {
func (*CustomErrorHandler) HandleError(ctx context.Context, job *rivertype.JobRow, err error) *river.ErrorHandlerResult {
fmt.Printf("Job errored with: %s\n", err)
return nil
}

func (*CustomErrorHandler) HandlePanic(ctx context.Context, job *river.JobRow, panicVal any) *river.ErrorHandlerResult {
func (*CustomErrorHandler) HandlePanic(ctx context.Context, job *rivertype.JobRow, panicVal any) *river.ErrorHandlerResult {
fmt.Printf("Job panicked with: %v\n", panicVal)

// Either function can also set the job to be immediately cancelled.
Expand Down
4 changes: 3 additions & 1 deletion insert_opts.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,8 @@ import (
"fmt"
"slices"
"time"

"github.com/riverqueue/river/rivertype"
)

// InsertOpts are optional settings for a new job which can be provided at job
Expand Down Expand Up @@ -103,7 +105,7 @@ type UniqueOpts struct {
// With this setting, any jobs of the same kind that have been completed or
// discarded, but not yet cleaned out by the system, won't count towards the
// uniqueness of a new insert.
ByState []JobState
ByState []rivertype.JobState
}

// isEmpty returns true for an empty, uninitialized options struct.
Expand Down
Loading

0 comments on commit f531209

Please sign in to comment.