Regression in `Rows.Scan`: incorrect handling of `sql.Scanner` implementation in v5.7.2 #2204

gazerro · 2024-12-23T10:20:25Z

The Rows.Scan method of the pgx package is documented as follows:

// Scan reads the values from the current row into dest values positionally.
// dest can include pointers to core types, values implementing the Scanner
// interface, and nil. nil will skip the value entirely. It is an error to
// call Scan without first calling Next() and checking that it returned true.

According to the documentation, it accepts as arguments:

pointers to core types,
values implementing the Scanner interface,
nil.

Starting from version v5.7.2, point (2) is no longer true because:

It no longer accepts non-pointer values implementing the sql.Scanner interface.
It accepts pointer values that do not implement sql.Scanner but point to types that do.

Both changes contradict the method's documentation. In particular:

The first change breaks existing programs that relied on the documented behavior.
The second change allows invalid values to pass through, potentially hiding parameter-passing errors.

These issues were introduced in commit 5c9b565. The issue #2146, which this commit intended to resolve, did not need fixing in the first place because the original behavior was correct. The example code in the issue is incorrect. Specifically:

var r Row  
err := rows.Scan(&r.ID, &r.Data)  
...  
type Row struct {  
	ID   int        `json:"id"`  
	Data *Timestamp `json:"data"` // <- this is a pointer  
}  
...  
func (t *Timestamp) Scan(value interface{}) error { ... }

The second argument to rows.Scan has type **T, but **T does not implement sql.Scanner. Therefore, the Scan method has no Scan implementation to invoke.

The example code should be corrected as follows:

err := rows.Scan(&r.ID, r.Data)

Here, r.Data must not be passed as a pointer.

As a result, commit 5c9b565 should be reverted to restore the correct behavior, which was consistent with both the documentation and expected usage.

The text was updated successfully, but these errors were encountered:

jackc · 2024-12-24T17:38:21Z

dest can include pointers to core types, values implementing the Scanner interface, and nil

It looks like the docs for Rows.Scan that you mention are somewhat out of date. It looks like those comments were added for v3. They are still true, but there are many more things that Rows.Scan handles now.

It no longer accepts non-pointer values implementing the sql.Scanner interface.

Do you have an example of this?

It accepts pointer values that do not implement sql.Scanner but point to types that do.

It think this matches current database/sql behavior.

As far as **T vs *T in general, pgx supports **T. This allows NULL being scanned into a nil. If not NULL, then pgx allocates a T and follows the normal scanning logic.

moukoublen · 2024-12-29T18:34:43Z

Hello,

I faced this issue after upgrading from v5.7.1 to v5.7.2.

Do you have an example of this?

My use case:

I use this wrapper in specific cases to customize the scan functionality for some db columns.

type FieldWrapper struct {
  ScanFn  func(any) error
  ValueFn func() (driver.Value, error)
}

func (f FieldWrapper) Scan(src any) error           { return f.ScanFn(src) }
func (f FieldWrapper) Value() (driver.Value, error) { return f.ValueFn() }

Example:

var customType Foo

err := row.Scan(
  FieldWrapper{ ScanFn: func(src any) error { return specialJSONUnmarshall(&customType) },
  // ...
)

In such case, for example, the element that implements sql.Scanner does not need to be a pointer since it operates on a different pointer.

(The example is simplified by a lot but the principle is that the wrapper does not need to be pointer.)

I think the commit 5c9b565 (lines) left out the check that was previously in place where was: "Does the destination -as it is- implement the sql.Scanner if so call that".

The result is that the Scan returns an error (in my case, after the upgrade) because it tries to json unmarshal to the wrapper directly.

moukoublen · 2024-12-29T19:07:43Z

After carefully reading the issue and the code committed, I found that @gazerro point is valid.

The functionality is introduced in that commit is:
-> Given a value as a destination (in rows Scan) that is a (single or multi-level) pointer and any of the pointed types in the chain, implement the sql.Scanner interface, call that.

This means that what is happening now is:

// lets say type (*Foo) implements sql.Scanner
var foo ****Foo = ... 
_ = row.Scan(foo)

It will "dereference" up to the (*Foo) level and call the Scan function of that type. So, eventually, it will operate on a different type / value than the one given in the row.Scan function, and I wonder if this should be the expected behavior.

Meanwhile, the current implementation of the isSQLScanner function ignores the case where the given destination value implements sql.Scanner directly without being a pointer, like in my case.

For example, it will return false (while it shouldn't) in cases like this:

// lets say type (*Foo) implements sql.Scanner
fw := FieldWrapper{ ... }
isSQLScanner(fw) // false

As I understand it, the decision should be one of either keeping the new behavior but fixing the isSQLScanner missing cases (I could open a PR with the fix /1/) or reverting the commit 5c9b565

/1/ Adding these lines at the beginning of the isSQLScanner function should be sufficient.

if _, is := v.(sql.Scanner); is {
	return true
}

Adding a sample testcase that shows the issue existing currently:

type Foo struct{}

func (i Foo) Scan(src any) error           { return nil }
func (i Foo) Value() (driver.Value, error) { return nil, nil }

func TestIsSQLScanner(t *testing.T) {
	f := Foo{}                      // implements sql.Scanner with non pointer receiver
	assert.True(t, isSQLScanner(f)) // fails

	pf := &f
	assert.True(t, isSQLScanner(pf)) // passes

	ppf := &pf
	assert.True(t, isSQLScanner(ppf)) // passes
}

gazerro · 2024-12-30T12:47:37Z

@jackc thank you for the quick response.

This is a test case that demonstrates the issue with the ENUM and JSONB types:

package main

import (
	"context"
	"fmt"
	"log"
	"reflect"

	"github.com/jackc/pgx/v5/pgxpool"
)

func main() {
	pool, err := pgxpool.New(context.Background(), "postgres://user:pass@localhost:5432/db")
	if err != nil {
		log.Fatal(err)
	}
	defer pool.Close()

	// Create the enum type.
	_, err = pool.Exec(context.Background(), `DROP TYPE IF EXISTS test_enum_type`)
	if err != nil {
		log.Print(err)
		return
	}
	_, err = pool.Exec(context.Background(), `CREATE TYPE test_enum_type AS ENUM ('a', 'b')`)
	if err != nil {
		log.Print(err)
		return
	}

	err = testQuery(pool, "SELECT 'a'", "a")
	if err != nil {
		log.Printf("test TEXT error: %s\n", err)
	}

	err = testQuery(pool, "SELECT 'a'::test_enum_type", "a")
	if err != nil {
		log.Printf("test ENUM error: %s\n", err)
	}

	err = testQuery(pool, "SELECT '{}'::jsonb", "{}")
	if err != nil {
		log.Printf("test JSONB error: %s\n", err)
	}
}

// T implements the sql.Scanner interface.
type T struct {
	v *any
}

func (t T) Scan(v any) error {
	*t.v = v
	return nil
}

// testQuery executes the query and checks if the scanned value matches
// the expected result.
func testQuery(pool *pgxpool.Pool, query string, expected any) error {
	rows, err := pool.Query(context.Background(), query)
	if err != nil {
		return err
	}
	var got any
	t := T{v: &got}
	for rows.Next() {
		if err := rows.Scan(t); err != nil {
			return err
		}
	}
	if err = rows.Err(); err != nil {
		return err
	}
	if !reflect.DeepEqual(got, expected) {
		return fmt.Errorf("expected %#v; got %#v", expected, got)
	}
	return nil
}

With commit 2ec9004 (the parent of commit 5c9b565), the test passes without errors. However, with commit 5c9b565, the tests for ENUM and JSONB fail. Specifically, the JSONB test sometimes hangs and occasionally results in a deadlock:

test ENUM error: can't scan into dest[0]: cannot scan into main.T
fatal error: all goroutines are asleep - deadlock!

goroutine 1 [semacquire]:
sync.runtime_Semacquire(0x140000e8190?)
	runtime/sema.go:71 +0x2c
sync.(*WaitGroup).Wait(0x140000c00c0)
	sync/waitgroup.go:118 +0x74
github.com/jackc/puddle/v2.(*Pool[...]).Close(0x1009c2e60)
	github.com/jackc/puddle/v2@v2.2.2/pool.go:195 +0x27c
github.com/jackc/pgx/v5/pgxpool.(*Pool).Close.func1()
	github.com/jackc/pgx/v5@v5.7.2/pgxpool/pool.go:387 +0x3c
sync.(*Once).doSlow(0x1004e2360?, 0x14000002380?)
	sync/once.go:76 +0xf8
sync.(*Once).Do(...)
	sync/once.go:67
github.com/jackc/pgx/v5/pgxpool.(*Pool).Close(0x14000002380?)
	github.com/jackc/pgx/v5@v5.7.2/pgxpool/pool.go:385 +0x44
panic({0x10091c000?, 0x1400000c030?})
	runtime/panic.go:785 +0x124
reflect.Value.Elem({0x10094a460?, 0x1400001e0c0?, 0x1009b5460?})
	reflect/value.go:1262 +0x198
github.com/jackc/pgx/v5/pgtype.(*scanPlanJSONToJSONUnmarshal).Scan(0x1400005e020, {0x1400022e02f, 0x2, 0x2}, {0x10094a460, 0x1400001e0c0})
	github.com/jackc/pgx/v5@v5.7.2/pgtype/json.go:215 +0x9c
github.com/jackc/pgx/v5.(*baseRows).Scan(0x14000230480, {0x1400001e0d0, 0x1, 0x100910160?})
	github.com/jackc/pgx/v5@v5.7.2/rows.go:273 +0x584
github.com/jackc/pgx/v5/pgxpool.(*poolRows).Scan(0x14000243410, {0x1400001e0d0?, 0x100cba9e0?, 0x10083368f?})
	github.com/jackc/pgx/v5@v5.7.2/pgxpool/rows.go:70 +0x34
main.testQuery(0x140000a04e0?, {0x10083368f?, 0x2?}, {0x10090cbc0, 0x1009b29c8})
	pgx-test/main.go:67 +0xfc
main.main()
	pgx-test/main.go:41 +0x320

Note that the first test on the TEXT type never fails simply because commit 5c9b565 did not modify the scanPlanCodecSQLScanner.Scan method, but only the scanPlanSQLScanner.Scan method, which seems inconsistent to me in the context of the changes made by the commit.

moukoublen · 2024-12-30T20:12:35Z

@gazerro, regarding the hanging behaviour you are facing, these are two separate things, I think.

Let me get into details:

Hanging

The reason that it hangs is that we have a panic from scanPlanJSONToJSONUnmarshal.Scan (I will get to that later; regarding why it panics), but you haven't safely (with defer) closed the already used connection from rows in the function testQuery, to return them to the pool. So when the test panics the connection stays at "used" state in pull.

then, in your main, you have defer pool.Close()

pgx/pgxpool/pool.go

Lines 384 to 389 in bcf3fbd

    
           func (p *Pool) Close() { 
        
           	p.closeOnce.Do(func() { 
        
           		close(p.closeChan) 
        
           		p.p.Close() 
        
           	}) 
        
           }

where the puddle pool close says:

https://github.com/jackc/puddle/blob/bd09d14bd4018b6d65a9d7770e2f3ddf8b00af1c/pool.go#L177-L179

// Close destroys all resources in the pool and rejects future Acquire calls.
// Blocks until all resources are returned to pool and destroyed.

So essentially, it blocks forever.

It can be fixed like this by changing testQuery function like this:

func testQuery(pool *pgxpool.Pool, query string, expected any) error {
	rows, err := pool.Query(context.Background(), query)
	if err != nil {
		return err
	}
	defer rows.Close()  // <- this is needed

That way, it will not hang, and the panic that occurs will be revealed.

Panic

After the mentioned change on the commit 5c9b565 and due to the bug/missing check of isSQLScanner, during JSONCodec.PlanScan the codec no longer returns the scanPlanSQLScanner and instead it returns the scanPlanJSONToJSONUnmarshal.

So what eventually will happen is that it will try to directly json.Unmarshall into the destination value that is given (as I said in the cases both you and I are describing, this is not wanted since the items we pass as destination value are actually sql.Scanner, just not pointer receivers.)

But why the panic?

scanPlanJSONToJSONUnmarshal is a bit aggressive on assuming that the destination value is a pointer

pgx/pgtype/json.go

Lines 215 to 219 in bcf3fbd

    
           	elem := reflect.ValueOf(dst).Elem() 
        
           	elem.Set(reflect.Zero(elem.Type())) 
        
           	return s.unmarshal(src, dst) 
        
           }

A testcase like this can reveal the panic:

func TestJSONCodecScanToNonPointerValues(t *testing.T) {
	defaultConnTestRunner.RunTest(context.Background(), t, func(ctx context.Context, t testing.TB, conn *pgx.Conn) {
		n := 44
		err := conn.QueryRow(ctx, "select '42'::jsonb").Scan(n)
		require.Error(t, err) // I would expect an error like json.Unmarshal does.
	})
}

For example, in simillar case, json.Unmarshal returns an error: https://cs.opensource.google/go/go/+/master:src/encoding/json/decode.go;l=171-175

@jackc I think there are two different issues here

Missing check on isSQLScanner; I could open a pr to fix as I suggested
Avoid panicking on scanPlanJSONToJSONUnmarshal.Scan when no pointer is provided as destination; This is also a trivial fix, the testcase is already above I could add it to the same or a different PR.

PS What happens in isSQLScanner (missing cases where the value is already a scanner without being a pointer receiver) also occurs in getSQLScanner. So most probably it needs the same fix as well.

Fix #2204

jackc · 2024-12-31T02:36:21Z

Just merged #2213 which should resolve the issue.

gazerro added the bug label Dec 23, 2024

gazerro closed this as completed Dec 30, 2024

gazerro reopened this Dec 30, 2024

moukoublen added a commit to moukoublen/pgx that referenced this issue Dec 30, 2024

fix jackc#2204

5a50d4a

moukoublen mentioned this issue Dec 30, 2024

Fix #2204 #2213

Merged

jackc closed this as completed in 6e9fa42 Dec 31, 2024

jackc added a commit that referenced this issue Dec 31, 2024

Merge pull request #2213 from moukoublen/fix_2204

c2175fe

Fix #2204

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Regression in `Rows.Scan`: incorrect handling of `sql.Scanner` implementation in v5.7.2 #2204

Regression in `Rows.Scan`: incorrect handling of `sql.Scanner` implementation in v5.7.2 #2204

gazerro commented Dec 23, 2024

jackc commented Dec 24, 2024

moukoublen commented Dec 29, 2024 •

edited

Loading

moukoublen commented Dec 29, 2024 •

edited

Loading

gazerro commented Dec 30, 2024

moukoublen commented Dec 30, 2024 •

edited

Loading

jackc commented Dec 31, 2024

Regression in Rows.Scan: incorrect handling of sql.Scanner implementation in v5.7.2 #2204

Regression in Rows.Scan: incorrect handling of sql.Scanner implementation in v5.7.2 #2204

Comments

gazerro commented Dec 23, 2024

jackc commented Dec 24, 2024

moukoublen commented Dec 29, 2024 • edited Loading

moukoublen commented Dec 29, 2024 • edited Loading

gazerro commented Dec 30, 2024

moukoublen commented Dec 30, 2024 • edited Loading

Hanging

Panic

jackc commented Dec 31, 2024

Regression in `Rows.Scan`: incorrect handling of `sql.Scanner` implementation in v5.7.2 #2204

Regression in `Rows.Scan`: incorrect handling of `sql.Scanner` implementation in v5.7.2 #2204

moukoublen commented Dec 29, 2024 •

edited

Loading

moukoublen commented Dec 29, 2024 •

edited

Loading

moukoublen commented Dec 30, 2024 •

edited

Loading