Optimize slice calculation in IndexSearcher a little #13860

Open, 1 commit to merge into base: main

original-brownbear
Member

Fewer volatile reads, less indirection, and a fast path for when there's no executor. Also saves some copies, sorts an array instead of a list, and saves allocations all around. No single change here is a big win, but in aggregate the effect is quite measurable, and it mostly addresses tiny regressions introduced recently. So I'm opening this as a suggestion for dealing with that boiling frog :)

Luceneutil over 40 rounds does show small but significant improvements:

                            Task  QPS baseline    StdDev  QPS my_modified_version    StdDev                Pct diff p-value
                      AndHighLow     1495.00      (5.8%)     1481.40      (5.5%)   -0.9% ( -11% -   11%) 0.471
                    OrNotHighLow     1303.33      (5.9%)     1303.23      (7.1%)   -0.0% ( -12% -   13%) 0.996
           BrowseMonthTaxoFacets       12.49     (25.8%)       12.53     (30.2%)    0.3% ( -44% -   76%) 0.959
                      AndHighMed      331.01      (6.7%)      333.83      (5.6%)    0.9% ( -10% -   14%) 0.537
               HighTermTitleSort       61.03      (3.6%)       61.57      (3.5%)    0.9% (  -5% -    8%) 0.261
                         LowTerm      670.36      (3.9%)      677.32      (4.2%)    1.0% (  -6% -    9%) 0.253
                          Fuzzy1       90.76      (3.3%)       91.82      (3.9%)    1.2% (  -5% -    8%) 0.149
           BrowseMonthSSDVFacets        4.57     (12.5%)        4.63      (7.2%)    1.3% ( -16% -   24%) 0.563
                         MedTerm      481.17      (8.8%)      487.77      (6.1%)    1.4% ( -12% -   17%) 0.416
                       OrHighLow      779.19      (4.3%)      790.11      (4.0%)    1.4% (  -6% -   10%) 0.132
                         Prefix3      410.59      (6.4%)      417.03      (7.0%)    1.6% ( -11% -   15%) 0.296
                       OrHighMed      211.85      (5.2%)      215.23      (6.0%)    1.6% (  -9% -   13%) 0.201
                    OrNotHighMed      376.36      (4.2%)      382.45      (4.3%)    1.6% (  -6% -   10%) 0.089
                   OrHighNotHigh      266.72      (5.5%)      271.13      (5.1%)    1.7% (  -8% -   12%) 0.161
                    HighSpanNear       18.28      (7.7%)       18.59      (8.0%)    1.7% ( -13% -   18%) 0.332
         AndHighMedDayTaxoFacets       47.77      (4.3%)       48.60      (3.3%)    1.7% (  -5% -    9%) 0.042
                          Fuzzy2       72.70      (2.8%)       74.05      (2.9%)    1.9% (  -3% -    7%) 0.004
            HighTermTitleBDVSort       22.00      (7.1%)       22.41      (6.7%)    1.9% ( -11% -   16%) 0.228
                        PKLookup      240.31      (2.0%)      244.88      (2.2%)    1.9% (  -2% -    6%) 0.000
                     AndHighHigh       99.80      (4.0%)      101.75      (2.7%)    2.0% (  -4% -    8%) 0.010
                          IntNRQ       94.21      (5.0%)       96.05      (4.8%)    2.0% (  -7% -   12%) 0.075
                 MedSloppyPhrase       61.87      (3.5%)       63.11      (4.0%)    2.0% (  -5% -    9%) 0.017
                    OrHighNotLow      522.07      (8.3%)      533.24      (8.1%)    2.1% ( -13% -   20%) 0.243
                       LowPhrase       51.26      (3.7%)       52.36      (3.7%)    2.1% (  -5% -    9%) 0.009
                      OrHighHigh       53.23      (8.3%)       54.52      (8.1%)    2.4% ( -12% -   20%) 0.189
            MedTermDayTaxoFacets       13.21      (4.6%)       13.54      (4.9%)    2.5% (  -6% -   12%) 0.020
             MedIntervalsOrdered       48.53      (5.4%)       49.82      (6.3%)    2.7% (  -8% -   15%) 0.042
                HighSloppyPhrase       23.58      (6.1%)       24.22      (7.6%)    2.7% ( -10% -   17%) 0.081
                    OrHighNotMed      422.74      (4.9%)      434.52      (5.2%)    2.8% (  -6% -   13%) 0.014
     BrowseRandomLabelTaxoFacets        4.46      (9.2%)        4.59      (9.0%)    2.8% ( -14% -   23%) 0.168
           HighTermDayOfYearSort      403.10      (7.0%)      414.37      (6.8%)    2.8% ( -10% -   17%) 0.071
       BrowseDayOfYearSSDVFacets        4.55      (8.2%)        4.68      (7.2%)    2.8% ( -11% -   19%) 0.104
                       MedPhrase      382.95      (9.3%)      393.79      (9.2%)    2.8% ( -14% -   23%) 0.172
                         Respell       49.12      (1.2%)       50.54      (2.2%)    2.9% (   0% -    6%) 0.000
               HighTermMonthSort     1514.02      (6.1%)     1557.67      (6.3%)    2.9% (  -8% -   16%) 0.038
                     MedSpanNear       41.96      (5.2%)       43.21      (5.6%)    3.0% (  -7% -   14%) 0.013
                        HighTerm      591.81      (7.2%)      609.51      (6.6%)    3.0% ( -10% -   18%) 0.053
                      HighPhrase       82.89      (6.9%)       85.37      (6.0%)    3.0% (  -9% -   16%) 0.038
                        Wildcard      160.07      (3.4%)      165.02      (3.5%)    3.1% (  -3% -   10%) 0.000
                   OrNotHighHigh      500.91      (7.0%)      516.89      (7.2%)    3.2% ( -10% -   18%) 0.045
     BrowseRandomLabelSSDVFacets        3.31      (3.6%)        3.41      (5.6%)    3.3% (  -5% -   12%) 0.002
             LowIntervalsOrdered        8.36      (8.6%)        8.65      (7.6%)    3.5% ( -11% -   21%) 0.056
                     LowSpanNear       37.83      (2.5%)       39.17      (2.6%)    3.5% (  -1% -    8%) 0.000
        AndHighHighDayTaxoFacets       13.87      (4.8%)       14.39      (6.0%)    3.7% (  -6% -   15%) 0.002
                 LowSloppyPhrase       19.46      (3.5%)       20.20      (5.3%)    3.8% (  -4% -   13%) 0.000
                      TermDTSort      173.31      (6.5%)      180.12      (6.5%)    3.9% (  -8% -   18%) 0.007
            HighIntervalsOrdered       19.36      (6.3%)       20.13      (6.0%)    4.0% (  -7% -   17%) 0.004
       BrowseDayOfYearTaxoFacets        5.46     (11.7%)        5.70     (10.9%)    4.4% ( -16% -   30%) 0.080
            BrowseDateSSDVFacets        1.23      (7.5%)        1.29      (7.1%)    4.7% (  -9% -   20%) 0.004
            BrowseDateTaxoFacets        5.36     (11.4%)        5.62     (11.1%)    4.7% ( -16% -   30%) 0.061
          OrHighMedDayTaxoFacets        4.77      (8.8%)        5.02      (8.9%)    5.2% ( -11% -   25%) 0.008
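
For readers less familiar with this kind of micro-optimisation, here is a minimal sketch of two of the ideas named in the description: reading a volatile field once into a local, and sorting an array copy rather than a list. It is not the code in this PR. `VolatileReadSketch`, `Leaf`, and all field and method names are hypothetical stand-ins chosen for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical stand-in for illustration; not Lucene's IndexSearcher. */
final class VolatileReadSketch {

  /** Hypothetical leaf descriptor: a name plus a document count. */
  static final class Leaf {
    final String name;
    final int maxDoc;

    Leaf(String name, int maxDoc) {
      this.name = name;
      this.maxDoc = maxDoc;
    }
  }

  // Assumed to be written rarely and read on every search.
  private volatile Leaf[] leaves;

  VolatileReadSketch(Leaf[] leaves) {
    this.leaves = leaves;
  }

  /** Performs a volatile read for the loop bound and for each element access. */
  long totalDocsNaive() {
    long total = 0;
    for (int i = 0; i < leaves.length; i++) {
      total += leaves[i].maxDoc;
    }
    return total;
  }

  /** Reads the volatile field once, then works on the local reference. */
  long totalDocsHoisted() {
    final Leaf[] local = leaves; // the only volatile read
    long total = 0;
    for (Leaf leaf : local) {
      total += leaf.maxDoc;
    }
    return total;
  }

  /** Sorts an array copy, largest leaf first, without building an intermediate List. */
  Leaf[] sortedBySizeDescending() {
    final Leaf[] copy = leaves.clone(); // one volatile read, one allocation
    Arrays.sort(copy, Comparator.comparingInt((Leaf l) -> l.maxDoc).reversed());
    return copy;
  }
}
```

Both `totalDocs` variants return the same value; the hoisted one simply gives the JIT less to work around. Whether that shows up in a benchmark depends on the workload, which is why the table above matters more than the sketch.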

.map(LeafReaderContextPartition::createForEntireSegment)
.toList()));
for (List<LeafReaderContextPartition> currentGroup : groupedLeafPartitions) {
slices[upto] = new LeafSlice(currentGroup);
Contributor

I am fine with extracting the partitioning bit to a separate method. Ideally that is made as a single mechanical change though. The diff is hard to read otherwise.

return res;
}

private synchronized LeafSlice[] computeAndCacheSlices() {
Contributor

This looks ok to me, and potentially simpler than the supplier. Nitpicking, I would prefer that this be made on its own in a separate PR. This does not affect when the error below gets thrown compared to before, right? It is still thrown the first time the slices are retrieved.

Member Author

Right, behavior is completely unchanged. It still gives the same guarantees around ordering as before :)

Contributor

Cool, can you open a separate PR for it? That would be easier to review.

Member Author

Sure thing, finally got around to it in #13893 :)
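
To make the thread above concrete, here is a minimal sketch of a lazily computed, cached value guarded by a `synchronized` compute method, where a validation error surfaces on first retrieval rather than at construction. Only the method name `computeAndCacheSlices` and the `return res;` shape come from the diff context; the class `LazySlicesSketch`, its fields, the `String[]` element type, and the validation rule are assumptions made for illustration and are not Lucene's implementation.

```java
/** Hypothetical stand-in for illustration; not Lucene's IndexSearcher. */
final class LazySlicesSketch {
  private final int requestedSlices; // hypothetical configuration
  private final int maxSlices;       // hypothetical limit

  // null until the first call to getSlices() succeeds
  private volatile String[] slices;
  private int computeCount; // guarded by "this"

  LazySlicesSketch(int requestedSlices, int maxSlices) {
    // Deliberately no validation here: the check is deferred to first use.
    this.requestedSlices = requestedSlices;
    this.maxSlices = maxSlices;
  }

  String[] getSlices() {
    String[] res = slices; // single volatile read on the hot path
    if (res == null) {
      res = computeAndCacheSlices();
    }
    return res;
  }

  private synchronized String[] computeAndCacheSlices() {
    String[] res = slices; // re-check: another thread may have computed it already
    if (res == null) {
      if (requestedSlices > maxSlices) {
        // Thrown the first time the slices are retrieved, not in the constructor.
        throw new IllegalStateException("too many slices: " + requestedSlices);
      }
      res = new String[requestedSlices];
      for (int i = 0; i < res.length; i++) {
        res[i] = "slice-" + i;
      }
      computeCount++;
      slices = res; // publish via the volatile write
    }
    return res;
  }

  /** Number of times the slices were actually computed. */
  synchronized int computeCount() {
    return computeCount;
  }
}
```

In this sketch a failed computation caches nothing, so every later call retries and throws again. Whether that matches the real behavior is not something the thread states.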

final C collector = collectorManager.newCollector();
collectors.add(collector);
final Weight weight = createWeight(rewrite(query, scoreMode.needsScores()), scoreMode, 1);
if (leafSlices.length == 1) {
Contributor

I don't know how I feel about specializing the single slice codepath and having a searchMultipleSlices method. I think I would prefer to avoid that and share the same codepath between these two scenarios.

Member Author

I think I'm a big fan of having separate paths here. Single slice vs. multiple slice execution simply has an extreme performance impact depending on the query.
With wikimedium in Luceneutil, this is running 8 queries in parallel across 8 threads, slicing into up to 8 slices vs always slicing into a single slice:

                            Task  QPS baseline    StdDev  QPS my_modified_version    StdDev                Pct diff p-value
            HighIntervalsOrdered        2.64      (5.0%)        0.64      (1.3%)  -75.7% ( -78% -  -73%) 0.000
                    HighSpanNear        5.97      (4.4%)        2.01      (0.9%)  -66.4% ( -68% -  -63%) 0.000
                 MedSloppyPhrase        7.31      (5.7%)        2.76      (1.6%)  -62.2% ( -65% -  -58%) 0.000
            HighTermTitleBDVSort       12.58      (6.3%)        4.90      (2.4%)  -61.1% ( -65% -  -55%) 0.000
                     MedSpanNear       14.89      (6.4%)        6.93      (1.2%)  -53.4% ( -57% -  -48%) 0.000
             LowIntervalsOrdered       20.75     (11.2%)       10.13      (1.8%)  -51.2% ( -57% -  -43%) 0.000
                          IntNRQ       39.41     (15.3%)       23.34      (8.5%)  -40.8% ( -56% -  -19%) 0.000
             MedIntervalsOrdered       32.83      (8.3%)       19.60      (3.9%)  -40.3% ( -48% -  -30%) 0.000
                HighSloppyPhrase       29.22      (7.4%)       18.36      (2.9%)  -37.2% ( -44% -  -28%) 0.000
          OrHighMedDayTaxoFacets        8.60      (5.6%)        5.63      (3.8%)  -34.5% ( -41% -  -26%) 0.000
        AndHighHighDayTaxoFacets       15.40      (8.5%)       10.55      (2.7%)  -31.5% ( -39% -  -22%) 0.000
         AndHighMedDayTaxoFacets       27.94      (7.9%)       22.57      (2.4%)  -19.2% ( -27% -   -9%) 0.000
                       LowPhrase       45.52     (12.6%)       38.08      (1.8%)  -16.4% ( -27% -   -2%) 0.000
                     AndHighHigh       36.50     (14.6%)       30.68      (5.2%)  -15.9% ( -31% -    4%) 0.000
            MedTermDayTaxoFacets       15.19      (8.4%)       13.38      (5.1%)  -11.9% ( -23% -    1%) 0.000
                      OrHighHigh       48.10     (14.5%)       43.38      (8.4%)   -9.8% ( -28% -   15%) 0.019
            BrowseDateSSDVFacets        1.20      (6.7%)        1.11     (12.1%)   -7.3% ( -24% -   12%) 0.035
           BrowseMonthTaxoFacets       11.02     (38.3%)       10.28     (40.7%)   -6.7% ( -61% -  117%) 0.634
       BrowseDayOfYearTaxoFacets        5.24     (10.1%)        5.03      (7.0%)   -3.9% ( -19% -   14%) 0.200
            BrowseDateTaxoFacets        5.17     (10.2%)        4.96      (7.0%)   -3.9% ( -19% -   14%) 0.207
                     LowSpanNear       62.43      (5.1%)       60.97      (3.1%)   -2.3% ( -10% -    6%) 0.117
                        Wildcard       78.35      (5.8%)       76.52      (2.3%)   -2.3% (  -9% -    6%) 0.137
                       MedPhrase       42.79      (8.5%)       42.43      (2.3%)   -0.8% ( -10% -   10%) 0.702
     BrowseRandomLabelTaxoFacets        4.27      (4.3%)        4.28      (5.3%)    0.4% (  -8% -   10%) 0.824
                        PKLookup      242.91      (1.7%)      245.39      (2.7%)    1.0% (  -3% -    5%) 0.203
                         Respell       45.76      (0.9%)       46.34      (2.1%)    1.3% (  -1% -    4%) 0.025
                         Prefix3      304.37      (1.8%)      308.85      (3.6%)    1.5% (  -3% -    7%) 0.145
                      HighPhrase       32.28      (9.5%)       32.90      (2.5%)    1.9% (  -9% -   15%) 0.434
       BrowseDayOfYearSSDVFacets        4.37      (5.2%)        4.48     (10.1%)    2.5% ( -12% -   18%) 0.371
           BrowseMonthSSDVFacets        4.40      (9.0%)        4.51     (18.2%)    2.6% ( -22% -   32%) 0.612
                          Fuzzy1       92.79      (0.5%)       95.29      (2.3%)    2.7% (   0% -    5%) 0.000
                          Fuzzy2       88.78      (0.7%)       91.62      (2.2%)    3.2% (   0% -    6%) 0.000
                 LowSloppyPhrase       78.35      (3.8%)       80.93      (3.7%)    3.3% (  -4% -   11%) 0.013
                      AndHighMed       55.83      (6.4%)       58.27      (5.7%)    4.4% (  -7% -   17%) 0.041
     BrowseRandomLabelSSDVFacets        3.23      (4.8%)        3.39      (7.4%)    4.8% (  -7% -   17%) 0.028
                       OrHighMed      101.82      (5.0%)      108.12      (6.6%)    6.2% (  -5% -   18%) 0.003
                      AndHighLow      604.08      (3.4%)      654.10      (4.8%)    8.3% (   0% -   17%) 0.000
                       OrHighLow      461.81      (4.5%)      516.22      (4.8%)   11.8% (   2% -   22%) 0.000
                        HighTerm      361.41      (6.6%)      449.94     (12.5%)   24.5% (   5% -   46%) 0.000
                    OrNotHighLow      435.40      (3.6%)      557.47      (7.0%)   28.0% (  16% -   40%) 0.000
                    OrHighNotMed      275.75      (6.1%)      376.17     (12.5%)   36.4% (  16% -   58%) 0.000
                   OrNotHighHigh      183.01      (5.4%)      250.78     (12.3%)   37.0% (  18% -   57%) 0.000
                    OrNotHighMed      273.62      (4.1%)      377.47      (9.9%)   38.0% (  23% -   54%) 0.000
                    OrHighNotLow      290.01      (6.9%)      422.33     (14.0%)   45.6% (  23% -   71%) 0.000
                         LowTerm      610.48      (3.9%)      906.71      (9.5%)   48.5% (  33% -   64%) 0.000
                   OrHighNotHigh      293.17      (5.7%)      438.61     (13.0%)   49.6% (  29% -   72%) 0.000
                         MedTerm      309.52      (6.4%)      544.79     (16.0%)   76.0% (  50% -  105%) 0.000
                      TermDTSort       91.38      (3.3%)      201.06     (16.2%)  120.0% (  97% -  144%) 0.000
               HighTermMonthSort      982.63      (2.5%)     2855.09      (9.7%)  190.6% ( 174% -  207%) 0.000
           HighTermDayOfYearSort      153.03      (4.5%)      609.28     (18.0%)  298.1% ( 263% -  335%) 0.000
               HighTermTitleSort       36.27     (10.4%)      200.12     (18.3%)  451.8% ( 383% -  536%) 0.000

Depending on what kind of query you run, parallelism will either get you a huge boost from better resource utilisation or a huge slowdown from contention and/or redundant work.
I think the two should go through different code paths to (a) allow for more optimisations down the line and (b) make profiling easier (at least for me personally, there is a lot of value in knowing whether something did or didn't get forked or sliced :)).
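
As a rough illustration of the split being argued for, the sketch below runs a single slice (or any search without an executor) inline on the calling thread and only forks when there are several slices and an executor. It is a simplified model, not the PR's code: `SlicePathSketch`, `Result`, `countSlice`, and the "even doc ID is a hit" rule are all hypothetical, and slices are modelled as plain `int[]` arrays. Only the `searchMultipleSlices` name is borrowed from the review comment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical stand-in for illustration; not Lucene's IndexSearcher. */
final class SlicePathSketch {

  /** Hit count plus a flag recording which path ran, which helps when profiling. */
  static final class Result {
    final long hits;
    final boolean forked;

    Result(long hits, boolean forked) {
      this.hits = hits;
      this.forked = forked;
    }
  }

  static Result search(int[][] slices, ExecutorService executor) {
    if (executor == null || slices.length == 1) {
      // Fast path: no task objects, no futures, no hand-off to another thread.
      long hits = 0;
      for (int[] slice : slices) {
        hits += countSlice(slice);
      }
      return new Result(hits, false);
    }
    return searchMultipleSlices(slices, executor);
  }

  private static Result searchMultipleSlices(int[][] slices, ExecutorService executor) {
    final List<Future<Long>> futures = new ArrayList<>(slices.length);
    for (int[] slice : slices) {
      final Callable<Long> task = () -> countSlice(slice);
      futures.add(executor.submit(task));
    }
    long hits = 0;
    try {
      for (Future<Long> future : futures) {
        hits += future.get(); // blocks until that slice has finished
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    }
    return new Result(hits, true);
  }

  /** Toy per-slice work: treats every even doc ID as a hit. */
  private static long countSlice(int[] docs) {
    long hits = 0;
    for (int doc : docs) {
      if (doc % 2 == 0) {
        hits++;
      }
    }
    return hits;
  }
}
```

The `forked` flag is there only to make the chosen path observable; a shared code path would have to route even the one-slice case through the task and future machinery.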
