[BUG] Sort values in the Hit type are wrongly always deserialized to string #1128

Closed
djivko opened this issue Aug 6, 2024 · 3 comments · Fixed by #1225
Labels
bug Something isn't working

Comments


djivko commented Aug 6, 2024

What is the bug?

Executing a search request that includes sorting produces SearchHits results in which each Hit has its sort array populated. However, the sort array is always deserialized as an array of strings. If you pass this value as search_after in a follow-up request, the server can reject it when the field type is not actually a string.

How can one reproduce the bug?

Create an index which has the following mapping

"creationDate": {
    "type": "date",
    "format": "date_time"
}

Java field definition

@Field(name = "creationDate", type = FieldType.Date, format = DateFormat.date_time)
private Date creationDate;

// Setters/Getters

Add a few documents that get indexed and then execute a search request followed by another search that utilizes search_after. For example

NativeQueryBuilder queryBuilder = NativeQuery.builder();
queryBuilder.withQuery(q -> q.matchAll(m -> m));
queryBuilder.withSort(List.of(
    (new SortOptions.Builder()).field(fb -> fb.field("creationDate").order(SortOrder.Asc))
        .build()));
queryBuilder.withPageable(PageRequest.of(0, 1));

List<Object> searchAfter = null;
boolean done = false;
while (!done) {
  queryBuilder.withSearchAfter(searchAfter);

  SearchHits<T> hits = operations.search(queryBuilder.build(), entityClass, getIndexCoordinates());
  List<SearchHit<T>> searchHits = hits.getSearchHits();

  boolean hasResults = searchHits != null && !searchHits.isEmpty();
  if (hasResults) {
    // This will always return List of strings
    searchAfter = searchHits.getLast().getSortValues();
  }

  // Stop when the page is empty or the sort values are missing, empty, or all null
  if (!hasResults || searchAfter == null || searchAfter.stream().allMatch(Objects::isNull)) {
    done = true;
  }
}

When executing the above code, the server returns something like:

{
    // metadata
    "hits": {
        // metadata
        "hits": [
            {
                // _source, _index, _id etc
                "sort": [
                    1720256355885
                ]
            }
        ]
    }
}

However, during deserialization the client turns the sort field into:

"sort": [
    "1720256355885"
]

Repeating the request with search_after therefore produces an error such as:

"root_cause": [
    {
        "type": "parse_exception",
        "reason": "failed to parse date field [1720256355885] with format [date_time]: [failed to parse date field [1720256355885] with format [date_time]]"
    }
]
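
Until the client preserves the original types, one possible stopgap is to turn numeric-looking strings back into numbers before handing them to search_after. The following is a minimal sketch under that assumption, not the library's fix: `SortValueCoercion` and `coerce` are hypothetical names, and the digit-only heuristic is an assumption that would wrongly convert keyword fields whose values happen to look numeric, so it should only be applied to sort positions known to be date or numeric fields.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of any client library.
final class SortValueCoercion {

  private SortValueCoercion() {}

  // Converts strings that look like whole numbers back into Long values.
  // Everything else, including null, is passed through unchanged.
  static List<Object> coerce(List<?> sortValues) {
    List<Object> result = new ArrayList<>(sortValues.size());
    for (Object value : sortValues) {
      // At most 18 digits, so the value always fits into a long.
      if (value instanceof String s && s.matches("-?\\d{1,18}")) {
        result.add(Long.parseLong(s));
      } else {
        result.add(value);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Object> fromClient = List.of("1720256355885", "doc-42");
    System.out.println(coerce(fromClient)); // prints [1720256355885, doc-42]
  }
}
```

In the loop from the reproduction, the result of `getSortValues()` would be passed through `coerce` before being assigned to `searchAfter`.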

What is the expected behavior?

The returned value types of the sort field must be maintained as returned from the server.

What is your host/environment?

SpringBoot application

Do you have any additional context?

  • The issue seems to be related to the generated Hit class and the deserializer associated with its sort member - check here
  • The Elastic Java client seems to have addressed the same issue by changing the type of the sort member from List<String> to List<FieldValue>. Check this commit
@djivko djivko added bug Something isn't working untriaged labels Aug 6, 2024
@Xtansia Xtansia removed the untriaged label Aug 7, 2024
Collaborator

Xtansia commented Aug 7, 2024

Thanks for the report @djivko! This is related to #755, and will require similar fixes:

Is this something you'd be interested in possibly contributing yourself?

Author

djivko commented Aug 7, 2024

@Xtansia sure, I can give it a shot. It might take a bit more time though, as I will be doing it in my spare time.
Thanks for all the pointers!

Collaborator

Xtansia commented Oct 23, 2024

A workaround for this was released in version 2.15.0: you can use sortVals (instead of sort) to get the values as List<FieldValue> rather than List<String>.
In version 3.0.0, sort itself will be changed to return List<FieldValue> directly.

This mirrors the workaround and future change for #755 which introduced a matching searchAfterVals on SearchRequest.
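
Why a typed list fixes the round trip can be illustrated with a small stand-alone model. This is only a sketch: `SortValue`, `Num`, `Text`, and `searchAfterJson` are hypothetical names invented for the illustration and are not the opensearch-java `FieldValue` API. The point is that a value which remembers its JSON kind is written back as a number, while a plain string is written back quoted, which is what triggers the date parse error.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for a typed sort value; NOT opensearch-java's FieldValue.
interface SortValue {
  record Num(long value) implements SortValue {}
  record Text(String value) implements SortValue {}

  // Renders the value as a JSON token, keeping numbers unquoted.
  default String toJson() {
    if (this instanceof Num n) {
      return Long.toString(n.value());
    }
    Text t = (Text) this;
    return "\"" + t.value().replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
  }
}

final class SearchAfterDemo {

  // Builds the JSON array that would be sent as search_after.
  static String searchAfterJson(List<SortValue> values) {
    return values.stream()
        .map(SortValue::toJson)
        .collect(Collectors.joining(", ", "[", "]"));
  }

  public static void main(String[] args) {
    // Kind preserved: the date sort value stays a number.
    List<SortValue> typed = List.of(new SortValue.Num(1720256355885L));
    System.out.println(searchAfterJson(typed)); // prints [1720256355885]

    // Kind lost: the same value as a string is sent quoted.
    List<SortValue> stringly = List.of(new SortValue.Text("1720256355885"));
    System.out.println(searchAfterJson(stringly)); // prints ["1720256355885"]
  }
}
```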

@Xtansia Xtansia closed this as completed Oct 23, 2024