
Compatibility with segment replication #407

Closed
Tracked by #8211
dreamer-89 opened this issue Jun 29, 2023 · 15 comments
Labels
enhancement, untriaged, v2.9.0

Comments


dreamer-89 commented Jun 29, 2023

Summary

With the 2.9.0 release, many enhancements are going into the segment replication feature [1][2] (GA since 2.7.0), so we need to ensure the different plugins are compatible with the current state of the feature. Previously, we ran tests on plugin repos to verify this compatibility, but we want plugin owners to be aware of these changes so that any required updates can be made. With the 2.10.0 release, the remote store feature is going GA; it internally uses the SEGMENT replication strategy exclusively, i.e. it enforces SEGMENT replication for all indices. It is therefore important to validate that plugins are compatible with segment replication.

What changed

1. Refresh policy behavior

  1. RefreshPolicy.IMMEDIATE refreshes only the primary shards immediately, not the replica shards. Instead, after the refresh, the primary starts a round of segment replication to update the replica shard copies, leading to eventual consistency.
  2. RefreshPolicy.WAIT_UNTIL normally ensures the indexing operation is searchable across the cluster, i.e. a read-after-write (RAW) guarantee. With segment replication, this guarantee no longer holds, because replica shard updates are delayed by asynchronous background refreshes.
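To make the eventual-consistency behavior above concrete, here is a toy, self-contained model (all class and method names are hypothetical, not OpenSearch APIs): a refresh publishes segments on the primary only, and the replica becomes consistent only after a separate segment-copy round completes.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of SEGMENT replication visibility (names hypothetical).
class SegRepModel {
    private final List<String> primaryBuffer = new ArrayList<>();   // indexed but unrefreshed docs
    final List<String> primarySegments = new ArrayList<>();         // searchable on the primary
    final List<String> replicaSegments = new ArrayList<>();         // searchable on the replica

    void index(String doc) { primaryBuffer.add(doc); }

    // RefreshPolicy.IMMEDIATE analogue: refresh makes docs searchable on the primary only.
    void refreshPrimary() {
        primarySegments.addAll(primaryBuffer);
        primaryBuffer.clear();
    }

    // The replica catches up only when the async segment-copy round completes.
    void copySegmentsToReplica() {
        replicaSegments.clear();
        replicaSegments.addAll(primarySegments);
    }

    public static void main(String[] args) {
        SegRepModel shard = new SegRepModel();
        shard.index("doc1");
        shard.refreshPrimary();
        System.out.println(shard.primarySegments.contains("doc1")); // true: visible on primary
        System.out.println(shard.replicaSegments.contains("doc1")); // false: replica still lagging
        shard.copySegmentsToReplica();
        System.out.println(shard.replicaSegments.contains("doc1")); // true: eventually consistent
    }
}
```

The gap between the second and third print is exactly the window in which a read served by a replica misses the write.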

2. Refresh lag on replicas

With segment replication, there is an inherent delay before documents become searchable on replica shard copies, because replicas copy segment (data) files from the primary. Compared to document replication, the average amount of time during which replica shards are inconsistent with their primaries will therefore increase.

3. System/hidden indices support

With opensearch-project/OpenSearch#8200, system and hidden indices are now supported with the SEGMENT replication strategy. We need to ensure there are no bottlenecks that prevent using system/hidden indices with segment replication.

Next steps

Segment replication does not guarantee strong reads. Thus, if a plugin needs strong read guarantees, especially in light of the changed refresh policy behavior and the replica lag (points 1 and 2 above), its search requests must be updated to target primary shards only. With opensearch-project/OpenSearch#7375, core now supports primary-shard-only search. Please follow the documentation for examples and details.
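As a hedged illustration of primary-shard-only search, a request of the following shape targets primary copies (the `preference=_primary` value is what core #7375 adds; verify the exact parameter name against the documentation for your OpenSearch version):

```
GET /my-index/_search?preference=_primary
```

This trades replica read scaling for read-after-write consistency, so it should be reserved for requests that genuinely need strong reads.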

Open questions

In case of any questions or issues, please post them in the core issue.

Reference

[1] Design

[2] Documentation

@dreamer-89 (Member, Author)

Requesting owners to add the v2.9.0 label to this issue.

@gaiksaya added the v2.9.0 label Jul 3, 2023

owaiskazi19 commented Jul 5, 2023

-> First, check whether we use RefreshPolicy.IMMEDIATE or RefreshPolicy.WAIT_UNTIL anywhere in the job-scheduler tests. If yes, check that those tests pass with segment replication.
-> Next, check whether any of the existing tests fail with segment replication. If they fail, verify whether it is because a replica has not caught up to its primary; if so, add a wait condition until the replica has caught up.
-> Next, make sure we have tests in the job-scheduler repo that create system indices, and verify that system/hidden indices in job-scheduler work correctly with segment replication. Before verifying this step, make sure this PR is merged: opensearch-project/OpenSearch#8200 (will be merged soon).
To enable segment replication, all you need to do is set cluster.indices.replication.strategy: 'SEGMENT' in the opensearch.yml file. If you are building the core tarball locally, I suggest changing this value to ReplicationType.SEGMENT and this value to SEGMENT. This is a hack, but if you are building the core tarball locally it is simpler.
cc: @Rishikesh1159
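As a concrete sketch of the setting described above, this is the only opensearch.yml change needed to make SEGMENT the cluster-wide default (the quoted value follows the comment above; an unquoted SEGMENT should also work):

```yaml
# opensearch.yml: default all newly created indices to segment replication
cluster.indices.replication.strategy: 'SEGMENT'
```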


dbwiddis commented Jul 6, 2023

The only use of any RefreshPolicy in this repo is in the Sample Extension Plugin, in a sample REST handler.

That REST API is tested here, but the test doesn't rely on checking replicas:

```java
protected SampleJobParameter createWatcherJobWithClient(RestClient client, String jobId, SampleJobParameter jobParameter)
    throws IOException {
    Map<String, String> params = getJobParameterAsMap(jobId, jobParameter);
    Response response = makeRequest(client, "POST", SampleExtensionRestHandler.WATCH_INDEX_URI, params, null);
    Assert.assertEquals("Unable to create a watcher job", RestStatus.OK, RestStatus.fromCode(response.getStatusLine().getStatusCode()));
    Map<String, Object> responseJson = JsonXContent.jsonXContent.createParser(
        NamedXContentRegistry.EMPTY,
        LoggingDeprecationHandler.INSTANCE,
        response.getEntity().getContent()
    ).map();
    return getJobParameter(client, responseJson.get("_id").toString());
}
```

So I wouldn't expect any failures in the first two steps.


minalsha commented Jul 6, 2023

#407 (comment): "Before verifying this step make sure this PR: opensearch-project/OpenSearch#8200 is merged. (will be merged soon)." @Rishikesh1159, this PR (#8200) is not merged yet, and today is 7/6. Can you please get this PR merged today? cc @anasalkouz


dbwiddis commented Jul 6, 2023

Looks like many of the tests in LockServiceIT exercise updates to the system index when creating or deleting locks, in particular the multi-threaded test testMultiThreadCreateLock(), which would likely hit multiple nodes and could fail.

Given that these locks need to be replicated to be of any use, I expect this test would likely fail (at least intermittently), and the action needed on this issue is to change the lock-acquiring code in LockService to require RefreshPolicy.WAIT_UNTIL.
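The proposed WAIT_UNTIL behavior can be sketched with a toy, self-contained model (all names are hypothetical, not the actual LockService API): lock acquisition does not return until the lock document is visible on the replica copy, so a second node cannot miss an existing lock.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Toy model of WAIT_UNTIL-style lock acquisition (names hypothetical).
class WaitUntilLockModel {
    final ConcurrentMap<String, String> primary = new ConcurrentHashMap<>();
    final ConcurrentMap<String, String> replica = new ConcurrentHashMap<>();

    // Returns true if this owner won the lock. Like RefreshPolicy.WAIT_UNTIL,
    // it does not return until the lock doc is visible on every copy.
    boolean acquireLock(String lockId, String owner) {
        boolean won = primary.putIfAbsent(lockId, owner) == null;
        if (won) {
            replica.put(lockId, owner);  // replicate before returning to the caller
        }
        return won;
    }

    public static void main(String[] args) {
        WaitUntilLockModel locks = new WaitUntilLockModel();
        System.out.println(locks.acquireLock("job-1", "node-a")); // true: lock won
        System.out.println(locks.replica.containsKey("job-1"));   // true: visible on replica
        System.out.println(locks.acquireLock("job-1", "node-b")); // false: lock already held
    }
}
```

The cost is that every acquisition now waits on replication, which is the performance concern raised in the next comment.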


dbwiddis commented Jul 6, 2023

However, the above strategy could have a performance impact, particularly for plugins that use the lock service (I'm thinking Real Time HCAD, @kaituo):

However, there are perf implications here that need to be called out. SegRep will, on average, increase the amount of time that replica shards are inconsistent with their primaries due to the time required to copy out segments (time dependent on cluster config/hardware, etc.). System indices are generally small and should copy out quickly, so I wouldn't expect this to be of major concern. Are there cases where stronger read/write consistency is desired on the system index? If so, plugins would need to prioritize primaries for searches.

Originally posted by @mch2 in opensearch-project/OpenSearch#8182 (comment)


minalsha commented Jul 6, 2023

@saratvemulapalli / @joshpalis / @owaiskazi19 can you please see comment #407 (comment) and share your feedback on whether the approach is the right way to go? @kaituo, please share your thoughts as well, since this could have a performance impact.

@joshpalis (Member)

Given that these locks need to be replicated to be of any use, I expect this test would likely fail (at least intermittently), and the action needed on this issue is to change the lock-acquiring code in LockService to require RefreshPolicy.WAIT_UNTIL.

I agree with this approach, but I doubt that the performance impact would be significant, as the number of documents indexed would be relatively small. A lock (document) is indexed prior to the start of a job execution and is the sole lock used for that job (and all subsequent executions of the job).


Rishikesh1159 commented Jul 6, 2023

-> First we need to check if we use RefreshPolicy.IMMEDIATE or RefreshPolicy.WAIT_UNTIL anywhere in any of our job-scheduler tests? If Yes we need to check if they pass with segment replication. -> Next check if any of the existing tests fail with segment replication? If they fail, we need to verify if they because replica not caught up to primary? If yes we need to add some wait condition until replica has caught up to primary. -> Next make sure that we have some tests in job-scheduler repo which create system indices. We need to verify that if system/hidden indices in job scheduler work correctly with segment replication. Before verifying this step make sure this PR: opensearch-project/OpenSearch#8200 is merged. (will be merged soon). To enable segment replication all you need to do is pass cluster.indices.replication.strategy: 'SEGMENT' in opensearch.yml file. If you are building core tarball locally, I suggest to change this value to ReplicationType.SEGMENT and this value to SEGMENT. Although this is a hack but if you are building core tarball locally this will be simpler. cc: @Rishikesh1159

Sorry, a correction is needed to the above comment. To enable segment replication when you are building the OpenSearch core tarball locally and using it with a plugin, you need to change this in IndicesService to ReplicationType.SEGMENT; I incorrectly pointed you to a change in IndexMetadata, which is not needed. (Again, this is a hack to quickly build a tarball with segrep enabled. The correct way to enable segrep is still to set cluster.indices.replication.strategy: 'SEGMENT' in the opensearch.yml file.)


dbwiddis commented Jul 6, 2023

Hey @Rishikesh1159, thanks for the clarity.

I would like the Job Scheduler lock index (a system index) to always choose DOCUMENT, overriding any cluster settings, as discussed in opensearch-project/OpenSearch#8211 (comment).

If so I think this task boils down to:

  • run tests with the cluster set to SEGMENT and see if they fail (they probably will for the lock system index)
  • change the code for that system index to always set it to DOCUMENT
  • run tests again and see if they pass

I believe this meets the intent of this issue and also preserves the utility of a performance-sensitive "lock" index by making its replicas consistent as soon as possible.
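The second step above could be sketched as a per-index setting at index creation time. Both the setting name `index.replication.type` and the index name below are assumptions for illustration; verify them against your OpenSearch version and the actual lock index name used by Job Scheduler:

```
PUT /.opendistro-job-scheduler-lock
{
  "settings": {
    "index.replication.type": "DOCUMENT"
  }
}
```

In the plugin itself, this would be set in the index-creation settings rather than via a REST call, since system indices are not normally created by clients.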


saratvemulapalli commented Jul 7, 2023

Problem

Searching through the code for all IndexRequests [1], there are 3 different use cases:

  • LockService[2]
  • Downstream Plugins like AD, ISM etc [3]
  • JobDetailsService[4]

None of the Job Scheduler code base uses RefreshPolicy, as Dan already called out in #407 (comment).
When undefined, RefreshPolicy defaults to false [5], which essentially means the refresh happens at the default refresh interval, at some point after the response is sent back.

JobDetailsService is only used to populate the job details index once in the lifetime of an extension (a newly added feature) and does not really impact any downstream plugins.

LockService will be tricky, since it's used by downstream plugins to acquire a lock. Given the way these jobs are triggered, it is unlikely that a lock will be acquired within the default refresh interval, but we should stress test it. In any case, as @joshpalis said, the number of documents being ingested in production is tiny, and I believe we'll see no impact for JS.

@dreamer-89 @Rishikesh1159 let me know if we missed anything else.

Next Steps

  • As @dbwiddis suggested, stress test LockService with ITs; it should be OK for us to use SegRep, as searches are randomized and limited to number of workers = number of nodes in the cluster with JS installed.
  • Reach out to downstream plugins and make sure they add integration tests with SegRep for Job indices they configure.
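The stress test in the first bullet could be shaped roughly like this self-contained sketch (names hypothetical; a real IT would drive LockService against a SegRep-enabled cluster): many concurrent workers race for the same lock, and exactly one must win.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Toy multi-threaded lock stress test (names hypothetical).
class LockStressSketch {
    public static void main(String[] args) throws Exception {
        ConcurrentMap<String, String> lockIndex = new ConcurrentHashMap<>();
        int workers = 16;
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        CountDownLatch start = new CountDownLatch(1);
        AtomicInteger winners = new AtomicInteger();
        for (int i = 0; i < workers; i++) {
            final String owner = "worker-" + i;
            pool.submit(() -> {
                start.await();  // release all workers at once to maximize contention
                if (lockIndex.putIfAbsent("job-1", owner) == null) {
                    winners.incrementAndGet();
                }
                return null;
            });
        }
        start.countDown();
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println(winners.get()); // 1: exactly one worker acquired the lock
    }
}
```

A real test would replace the map with LockService.acquireLock() calls and assert the same invariant: the lock is granted exactly once, even when reads may be served by lagging replicas.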

Refs

[1] https://github.com/search?q=repo%3Aopensearch-project%2Fjob-scheduler%20IndexRequest&type=code
[2] https://github.com/opensearch-project/job-scheduler/blob/main/spi/src/main/java/org/opensearch/jobscheduler/spi/utils/LockService.java
[3] https://github.com/search?q=org%3Aopensearch-project%20org.opensearch%3Aopensearch-job-scheduler-spi&type=code
[4] https://github.com/opensearch-project/job-scheduler/blob/main/src/main/java/org/opensearch/jobscheduler/utils/JobDetailsService.java#L24
[5] https://opensearch.org/docs/latest/api-reference/document-apis/index-document/#url-parameters


dbwiddis commented Jul 7, 2023

LockService will be tricky, since it's used by downstream plugins to acquire a lock. Given the way these jobs are triggered, it is unlikely that a lock will be acquired within the default refresh interval, but we should stress test it.

Just checked with @mch2: because we use "Get by ID", we presently get the consistency we need, but as I understand from him, that may not yet provide the same guarantees under SEGREP. I believe he'll be posting a comment here shortly with more details.


dbwiddis commented Jul 9, 2023

Comment posted in linked core issue: opensearch-project/OpenSearch#8211 (comment)

My plan to address this issue is outlined there.

@dbwiddis (Member)

Added an integ test trying to acquire the same lock, and tried many times to get it to fail under SegRep, without success (well, without failure, but you know what I mean).

I still defensively added the DOCUMENT index config just in case.

So I think the action here is complete.

@dbwiddis (Member)

Completed in #417
