Fix flaky test in testApproximateRangeWithSizeOverDefault by adjusting totalHits assertion logic #16434

inpink · 2024-10-22T17:50:16Z

Description

This PR addresses an issue in the testApproximateRangeWithSizeOverDefault method of ApproximatePointRangeQueryTests, where the test would occasionally fail due to how Lucene handles total hits.
By default, search() in Lucene's IndexSearcher counts hits accurately only up to 1000.
Beyond this threshold, Lucene may return a lower bound, marked with the GREATER_THAN_OR_EQUAL_TO relation, for performance reasons (see the Lucene IndexSearcher documentation).

In testApproximateRangeWithSizeOverDefault, the search range includes 12,001 documents, and the test would sometimes fail when the search reported GREATER_THAN_OR_EQUAL_TO for its total hits.

Changes:

If totalHits.relation is EQUAL_TO, the test asserts an exact count of 11000.
If totalHits.relation is GREATER_THAN_OR_EQUAL_TO, the test asserts that the hit count is at least 11000 and no more than the upper bound (maxHits).
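
The branch on totalHits.relation can be sketched in plain Java as follows. This is a minimal illustration, not the test code from this PR: the `Relation` enum is a hypothetical stand-in for Lucene's `TotalHits.Relation`, `isAcceptable` is an invented helper name, and the `maxHits` value of 12001 is an assumption taken from the document count mentioned above.

```java
// Minimal sketch of the assertion logic; names here are illustrative only.
final class TotalHitsAssertionSketch {

    // Hypothetical stand-in for Lucene's TotalHits.Relation.
    enum Relation { EQUAL_TO, GREATER_THAN_OR_EQUAL_TO }

    // Returns true when the reported count is consistent with the relation:
    // an exact match for EQUAL_TO, otherwise a value in [expected, maxHits].
    static boolean isAcceptable(Relation relation, long value, long expected, long maxHits) {
        if (relation == Relation.EQUAL_TO) {
            return value == expected;
        }
        return value >= expected && value <= maxHits;
    }

    public static void main(String[] args) {
        long expected = 11000; // exact count named in the description
        long maxHits = 12001;  // assumed upper bound (documents in range)

        System.out.println(isAcceptable(Relation.EQUAL_TO, 11000, expected, maxHits));
        System.out.println(isAcceptable(Relation.GREATER_THAN_OR_EQUAL_TO, 12001, expected, maxHits));
        System.out.println(isAcceptable(Relation.GREATER_THAN_OR_EQUAL_TO, 10999, expected, maxHits));
    }
}
```

Under these assumptions, the first two calls are accepted and the third is rejected, since a lower bound below 11000 would contradict the expected count.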

This issue is similar to OpenSearch PR #4270. I resolved it in a similar way. Special thanks to @dbwiddis for the valuable guidance.

Related Issues

Related #15807

Check List

Functionality includes testing.
~~API changes companion pull request created, if applicable.~~
~~Public documentation issue/PR created, if applicable.~~

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

github-actions · 2024-10-22T19:05:49Z

✅ Gradle check result for 9230ea4: SUCCESS

codecov · 2024-10-22T19:06:13Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.94%. Comparing base (ca40ba4) to head (bfb040a).
Report is 7 commits behind head on main.

Additional details and impacted files

@@             Coverage Diff              @@
##               main   #16434      +/-   ##
============================================
- Coverage     72.09%   71.94%   -0.15%     
+ Complexity    65013    64943      -70     
============================================
  Files          5313     5313              
  Lines        303315   303315              
  Branches      43888    43888              
============================================
- Hits         218661   218211     -450     
- Misses        66721    67207     +486     
+ Partials      17933    17897      -36

msfroh

Oh! This is probably because the approximation tries to chop off at size results for each segment. If the documents are split across multiple segments, we can end up returning all of the docs (since the approximation collects everything for each).
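
A rough way to picture this, as a plain-Java model rather than the actual ApproximatePointRangeQuery code: if each segment independently stops after `size` matches, the total collected is the sum of `min(matchesInSegment, size)` across segments. The class and method names below are hypothetical, and the segment splits are made-up inputs; only the 11000 and 12,001 figures come from the PR description.

```java
// Simplified model of per-segment early termination; not OpenSearch code.
final class PerSegmentCapSketch {

    // Each segment contributes at most `size` matches, so the total is
    // the sum of min(matches, size) over all segments.
    static long collected(long[] matchesPerSegment, long size) {
        long total = 0;
        for (long matches : matchesPerSegment) {
            total += Math.min(matches, size);
        }
        return total;
    }

    public static void main(String[] args) {
        long size = 11000;
        // One segment holding all 12,001 matches: the cap applies once.
        System.out.println(collected(new long[] {12001}, size));
        // Two segments (hypothetical split): neither reaches the cap.
        System.out.println(collected(new long[] {6000, 6001}, size));
    }
}
```

In this model a single segment yields 11000, while the same matches split across two segments yield all 12001, which is consistent with the test seeing counts anywhere between 11000 and maxHits.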

Thanks for fixing this @inpink!

dbwiddis

Good to see a test for an "approximate" query has "approximate" tests ;)

…ing totalHits assertion logic (opensearch-project#15807)

- Updated the test to account for Lucene's behavior where `IndexSearcher.search()` may return `GREATER_THAN_OR_EQUAL_TO` for totalHits when the number of matches exceeds 1000.
- Added logic to check if `totalHits.relation` is `EQUAL_TO`. If so, assert that the count is exactly 11000. Otherwise, ensure the count is at least 11000 and within the allowed upper limit (`maxHits`).
- This change prevents intermittent test failures caused by Lucene’s performance optimizations.

Signed-off-by: inpink <inpink@kakao.com>

github-actions · 2024-10-23T06:36:33Z

✅ Gradle check result for bfb040a: SUCCESS

inpink · 2024-10-23T07:53:28Z

@msfroh Glad I could help! Thanks for clarifying the behavior with segments :) It makes a lot more sense now. Also, it’s ready for merge!

inpink · 2024-10-23T07:56:40Z

Thank you, @dbwiddis ! I’m glad the test now appropriately reflects the “approximate” nature of the query. XD
I really appreciate your guidance throughout this process—it was crucial in finding the right solution.
I’ll continue to stay engaged with open-source projects and learn more along the way.
Thanks again for your valuable help!

…ing totalHits assertion logic (#15807) (#16434)

- Updated the test to account for Lucene's behavior where `IndexSearcher.search()` may return `GREATER_THAN_OR_EQUAL_TO` for totalHits when the number of matches exceeds 1000.
- Added logic to check if `totalHits.relation` is `EQUAL_TO`. If so, assert that the count is exactly 11000. Otherwise, ensure the count is at least 11000 and within the allowed upper limit (`maxHits`).
- This change prevents intermittent test failures caused by Lucene’s performance optimizations.

Signed-off-by: inpink <inpink@kakao.com>
(cherry picked from commit 66f0110)
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

…ing totalHits assertion logic (#15807) (#16434) (#16459)

- Updated the test to account for Lucene's behavior where `IndexSearcher.search()` may return `GREATER_THAN_OR_EQUAL_TO` for totalHits when the number of matches exceeds 1000.
- Added logic to check if `totalHits.relation` is `EQUAL_TO`. If so, assert that the count is exactly 11000. Otherwise, ensure the count is at least 11000 and within the allowed upper limit (`maxHits`).
- This change prevents intermittent test failures caused by Lucene’s performance optimizations.

(cherry picked from commit 66f0110)
Signed-off-by: inpink <inpink@kakao.com>
Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

inpink requested review from anasalkouz, andrross, ashking94, Bukhtawar, CEHENKLE, dblock, dbwiddis, gbbafna, jed326, kotwanikunal, mch2, msfroh, nknize, owaiskazi19, reta, Rishikesh1159, sachinpkale, saratvemulapalli, shwetathareja, sohami and VachaShah as code owners October 22, 2024 17:50

inpink mentioned this pull request Oct 22, 2024

Fix flaky test in testApproximateRangeWithSizeOverDefault by adjusting totalHits assertion logic #16433

Closed

3 tasks

inpink force-pushed the 15807-solve branch from 9df28f6 to 9230ea4 Compare October 22, 2024 17:52

msfroh approved these changes Oct 22, 2024

dbwiddis approved these changes Oct 23, 2024

inpink force-pushed the 15807-solve branch from 9230ea4 to 7533e97 Compare October 23, 2024 05:14

inpink requested review from jainankitk and linuxpi as code owners October 23, 2024 05:14

inpink force-pushed the 15807-solve branch from 7533e97 to bfb040a Compare October 23, 2024 05:16

dbwiddis added the backport 2.x Backport to 2.x branch label Oct 23, 2024

dbwiddis approved these changes Oct 23, 2024

dbwiddis merged commit 66f0110 into opensearch-project:main Oct 23, 2024
41 of 42 checks passed

opensearch-trigger-bot bot mentioned this pull request Oct 23, 2024

[Backport 2.x] Fix flaky test in testApproximateRangeWithSizeOverDefault by adjusting totalHits assertion logic #16459

Merged

This was referenced Oct 23, 2024

[AUTOCUT] Gradle Check Flaky Test Report for SegmentReplicationAllocationIT #14327

Open

[AUTOCUT] Gradle Check Flaky Test Report for ChildQuerySearchIT #15907

Open
