Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

V7 Select before Where no longer working #3009

Open
CptWesley opened this issue Mar 4, 2024 · 12 comments
Open

V7 Select before Where no longer working #3009

CptWesley opened this issue Mar 4, 2024 · 12 comments
Labels

Comments

@CptWesley
Copy link
Contributor

CptWesley commented Mar 4, 2024

Hi. We are currently working on upgrading from V6 to V7 but are running into the following issue.

Given the following document definition:

public sealed class Foo
{
    public string Id { get; set; }

    public Bar Inner { get; set; }
}

public sealed class Bar
{
    public int Value { get; set; }
}

We then run the following queries:

using (var storeSession = store.LightweightSession())
{
    storeSession.Store(new Foo
    {
        Id = "foo",
        Inner = new()
        {
            Value = 100,
        },
    });

    await storeSession.SaveChangesAsync();
}

using var querySession1 = store.QuerySession();
using var querySession2 = store.QuerySession();

var r1 = await querySession1
    .Query<Foo>()
    .Where(x => x.Inner.Value < 50)
    .ToListAsync();

var r2 = await querySession2
    .Query<Foo>()
    .Select(x => x.Inner)
    .Where(x => x.Value < 50)
    .ToListAsync();

Now r1 is an empty list as expected, however r2 contains the inserted document with id foo. In practice we find that this form of querying returns all documents in the document store of that type. When we examine the generated query we find:

select d.data -> 'inner' as data from public.mt_doc_repository_specs_foo as d;

It appears the .Where clause after the .Select is ignored.

In the previous version of Marten (6.4.1) this way of querying worked fine. I am aware that this major version has changed the way Linq queries are constructed, so I'm not sure if this change was intentional.

For completion sake, this is the query that was generated in 6.4.1:

select d.data ->> 'inner' as data from public.mt_doc_repository_specs_foo as d where CAST(d.data -> 'inner' ->> 'value' as integer) < :p0
@mysticmind
Copy link
Member

Just as a test, can you please switch the order of Where and Select so that select is after where in the second query and let me know what happens?

@CptWesley
Copy link
Contributor Author

@mysticmind That one does appear to be working:

var r1 = await querySession
    .Query<Foo>()
    .Where(x => x.Inner.Value < 50)
    .ToListAsync();

// []

var cmd1 = querySession
    .Query<Foo>()
    .Where(x => x.Inner.Value < 50)
    .ToCommand(FetchType.FetchMany).CommandText;

// select d.data from public.mt_doc_repository_specs_foo as d where CAST(d.data -> 'inner' ->> 'value' as integer) < :p0;

var r2 = await querySession
    .Query<Foo>()
    .Select(x => x.Inner)
    .Where(x => x.Value < 50)
    .ToListAsync();

// ["foo"]

var cmd2 = querySession
    .Query<Foo>()
    .Select(x => x.Inner)
    .Where(x => x.Value < 50)
    .ToCommand(FetchType.FetchMany).CommandText;

// select d.data -> 'inner' as data from public.mt_doc_repository_specs_foo as d;

var r3 = await querySession
    .Query<Foo>()
    .Where(x => x.Inner.Value < 50)
    .Select(x => x.Inner)
    .ToListAsync();

// []

var cmd3 = querySession
    .Query<Foo>()
    .Where(x => x.Inner.Value < 50)
    .Select(x => x.Inner)
    .ToCommand(FetchType.FetchMany).CommandText;

// select d.data -> 'inner' as data from public.mt_doc_repository_specs_foo as d where CAST(d.data -> 'inner' ->> 'value' as integer) < :p0;

@jeremydmiller
Copy link
Member

It's not intentional, but no, just no. We're not going to support a Where() chained after a Select(). I think it's a happy accident that worked at all in < 7.0. We'll "fix" this with a validation error saying "don't do that"

@CptWesley
Copy link
Contributor Author

I'm not entirely sure why you wouldn't want to support such queries. We work a lot with (deep) nested documents. Being able to "unpeel" the first few layers is very useful for making understandable queries. I'm also not really sure of any technical/performance objections to supporting this. So if you could explain to me why you wouldn't want to support this, that would be appreciated.

@jeremydmiller
Copy link
Member

The Where() after the Select() applies to after the transformation. We could technically do that, but it would throw you into using CTEs that are pretty slow. You'd be much, much better off using the Where() on the initial document collection

This is a lot more work than it might look like from the outside.

@CptWesley
Copy link
Contributor Author

Couldn't the Selects be "hoisted" (or more accurately, whatever the opposite of hoisting is) to the end of the query by some form of query planner?

Given some definition:

class A {
  B B { get; set; }
  int Value1 { get; set; }
}

class B {
  C C { get; set; }
  int Value2 { get; set; }
}

class C {
  int Value3 { get; set; }
}

An IQueryable of the form:

query
  .Where(a => a.Value1 > 5)
  .Select(a => a.B)
  .Where(b => b.Value2 > 10)
  .Select(b => b.C)
  .Where(c => c.Value3 > 20)

Can be rewritten as:

query
  .Where(a => a.Value1 > 5)
  .Where(a => a.B.Value2 > 10)
  .Where(a => a.B.C.Value3 > 20)
  .Select(a => a.B)
  .Select(b => b.C)

or more condensed:

query
  .Where(a => a.Value1 > 5 && a.B.Value2 > 10 && a.B.C.Value3 > 20)
  .Select(a => a.B.C)

Semantically these are all equivalent (I think).

@jeremydmiller
Copy link
Member

If you want to have the relatively inefficient CTE query style support to make this work, that's not horrible to do. If you want this to generate more efficient SQL by "hoisting" the query planning somehow, that's a ton more work and I think you're veering into some pretty serious "I take pull requests" territory here. I'd be happy to talk about a paid JasperFx engagement to do that, but I don't think you'd get enough value out of it to justify the cost.

@CptWesley
Copy link
Contributor Author

For further context:

If I look at the query generated by 6.4.1 for the first example above:

query
    .Where(a => a.Value1 > 5)
    .Select(a => a.B)
    .Where(b => b.Value2 > 10)
    .Select(b => b.C)
    .Where(c => c.Value3 > 20)

The query generated by 6.4.1 is:

select d.data -> 'b' ->> 'c' as data from public.mt_doc_repository_specs_a as d where (CAST(d.data ->> 'value1' as integer) > :p0 and  CAST(d.data -> 'b' ->> 'value2' as integer) > :p1 and  CAST(d.data -> 'b' -> 'c' ->> 'value3' as integer) > :p2)

Which seems to do the hoisting (albeit potentially unintentionally).

@CptWesley
Copy link
Contributor Author

If this would be something you would want to support, I could potentially look into it in the future (no promises though).

@jeremydmiller
Copy link
Member

It might have been some kind of automatic thing that Relinq did for us that I wasn't aware of. It's not a set of use cases I would have even thought to have tested.

As for supporting it, sure, it's just not something I can prioritize anytime soon

@SamGorbo
Copy link

SamGorbo commented Nov 22, 2024

Hiya!
While migrating from Marten 6 to 7, we also countered this kind of issues:

        var queryString = session.Query<CaseDocument>()
            .Where(x => x.Id.IsOneOf(caseDocumentIds))

Produces sql select d.id, d.data, d.mt_deleted, d.mt_deleted_at from public.mt_doc_casedocument as d where (d.mt_deleted = False and d.id = ANY(:p0));

        var query2String = session.Query<CaseDocument>()
            .Select(x => x.Id)
            .Where(x => x.IsOneOf(caseDocumentIds))

Produces select d.id from public.mt_doc_casedocument as d where d.mt_deleted = False;

Reading this, I understand that this is a difficult feature to support. Just to inform people using the library, would it be possible to add some documentation explaining how queries are constructed?

@CptWesley
Copy link
Contributor Author

CptWesley commented Nov 22, 2024

@SamGorbo we ended up working around this issue by writing our own query builder class (with identical API) which collects all these operations irrespective of order and when performing the actual query, put them in the order required by Marten. Unfortunately the code is propietary so I can't share it, but I hope it can help a bit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants