Jdk17 venice ghci #7

Closed
wants to merge 80 commits into from

Conversation

sushantmane
Owner

Summary, imperative, start upper case, don't end with a period

Resolves #XXX

How was this PR tested?

Does this PR introduce any user-facing changes?

  • No. You can skip the rest of this section.
  • Yes. Make sure to explain your proposed changes and call out the behavior change.

m-nagarajan and others added 30 commits June 26, 2023 14:08
…inkedin#512)

PushJobZstdConfig calculates the number of bytes to be collected per input file based on the configurable sample size (default 200MB). If that ends up being less than the size of the first value in a file, no samples get collected from that file, potentially leading to skipping zstd dictionary creation. Adding an extra check to collect at least 1 sample per input file (until reaching the max sample size).
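A minimal sketch of the "at least one sample per file" guard described above; the names and constants are illustrative, not the actual PushJobZstdConfig fields:

```java
// Sketch only: illustrates the "at least one sample per file" guard.
// Names and constants are hypothetical, not the actual PushJobZstdConfig code.
public final class SampleSizeCalculator {
  public static int samplesPerFile(long maxSampleSizeBytes, int inputFileCount, long firstValueSizeBytes) {
    long bytesPerFile = maxSampleSizeBytes / inputFileCount;
    long samples = bytesPerFile / Math.max(firstValueSizeBytes, 1);
    // Previously, samples could be 0 when a single value exceeded the per-file budget,
    // which could skip dictionary training entirely; now we always collect at least one sample.
    return (int) Math.max(samples, 1);
  }
}
```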
…edin#508)

In VPJ, we call the controller 4 times to get the StoreResponse, which is wasteful. The configurations are unlikely to change during a push job and should be cached to reduce the duplicate calls. This PR caches the response so we can reduce the duplicate calls by 2. We will need to do more refactoring to make the VPJ class simple and easy to understand.
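A hedged sketch of the caching idea: memoize the controller's StoreResponse for the duration of a push job. ControllerClient/StoreResponse are Venice types; the wrapper class itself is illustrative, not the actual VPJ code.

```java
// Hedged sketch: memoize the controller's StoreResponse for the duration of a push job.
public class CachedStoreResponseProvider {
  private final ControllerClient controllerClient;
  private final String storeName;
  private StoreResponse cachedStoreResponse;

  public CachedStoreResponseProvider(ControllerClient controllerClient, String storeName) {
    this.controllerClient = controllerClient;
    this.storeName = storeName;
  }

  public synchronized StoreResponse getStoreResponse() {
    if (cachedStoreResponse == null) {
      // Previously, each call site issued its own controller request (4 per job).
      cachedStoreResponse = controllerClient.getStore(storeName);
    }
    return cachedStoreResponse;
  }
}
```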
* [server] cleanup removed stores and unused versions during server bootstrap
If server RocksDB directories have resources which are bad and cannot be opened by RocksDB during restart, they are not loaded into the storage engine and none of the background resource cleaners kick in for those directories. This leads to massive disk space waste. With this PR, if the RocksDB open fails, the unused resources are cleaned up.
linkedin#514)

Currently, when creating a D2 client via D2ClientFactory, the code only waits for 5 seconds. This commit increases the timeout to 60 seconds.
…existing Pair usage in controller (linkedin#515)

This PR uses the existing POJO class ExecutionStatusWithDetails to replace Pair<ExecutionStatus, Optional> and Pair<ExecutionStatus, String> usage in controller push monitoring logic to improve readability.
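For illustration, the general shape of replacing a Pair with a named POJO; the actual ExecutionStatusWithDetails class in the controller may differ in fields and methods:

```java
// Illustrative only: a named POJO in place of Pair<ExecutionStatus, String>.
public class ExecutionStatusWithDetails {
  private final ExecutionStatus status;
  private final String details; // may be null when no extra detail is available

  public ExecutionStatusWithDetails(ExecutionStatus status, String details) {
    this.status = status;
    this.details = details;
  }

  public ExecutionStatus getStatus() {
    return status;
  }

  public String getDetails() {
    return details;
  }
}
```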
…etionUnderway (linkedin#518)

The commit removes the `isTopicDeletionUnderway` API from the `PubSubAdminAdapter` class.
The current default implementation, `ApacheKafkaAdminAdapter::isTopicDeletionUnderway`,
always returns `false`, making it useless. Removing this unnecessary API improves the clarity
and maintainability of the codebase.
…din#496)

* [Controller] Emergency source region config should be scoped per cluster to help with testing and can also be useful with complex deployments
* [vpj] minor logging improvements
…lix manager (linkedin#521)

Current behavior: After helix manager initialization, child controllers find the
system schema cluster's leader controller URL from the helix manager, build a
controller client using the URL, and register schemas via the controller client.

Problem: During deployment, when leader controller A stops, helix assigns
controller B as leader. When A's controller service starts again, helix wants
to hand leadership back to A, so it transitions B from leader to standby. Before
the transition completes, B is still the leader and all of A's requests are sent to B.
But since B clears its in-memory store map, it throws a "no store" exception for A's
requests, and A fails to start.

Fix: Before helix manager initialization, child controllers register system
schemas via a controller client built using URL or d2 configs. Since the
controller instance is still disabled in helix, helix won't hand leadership back
to A and won't transition B from leader to standby, so B will handle A's
requests successfully. Also, in case leadership transitions from B to another
controller C, increase the number of retries to cover the leader->standby period.
1. Fix a logging bug during ingestion isolation handover: when the SIT partition is shut down in time, the old logic still logs a wrong message saying that it did not stop in time, and the log is at ERROR level. This causes in-house deployment tooling to report ERROR logging issues. This PR fixes the logic to only log when it actually times out and reduces the log to WARN level.
2. Remove StoreVersionState.toString() logging in handover, as this is the underlying SpecificRecord's toString and will print out the binary data we store inside the SVS. This PR simply removes the logging, as we are not interested in the content of the SVS.
linkedin#520)

* [admin-tool] Add support for '--cluster' parameter in data recovery tools

This change adds support for taking '--cluster' (cluster name) as an optional parameter
to several data recovery tools. Because of this change, the '--stores' parameter is
now optional, and when both are present, '--stores' overrides '--cluster'.
…check (linkedin#517)

* [server][da-vinci] Standardize ingestion stats on ready to serve lag measure

The old measure is out of date with what we actually use for ready-to-serve. So this will remove false alarms and give more transparency in our metrics.
…e components in the test framework (linkedin#523)

Sometimes, to construct pubsub clients, we need to pass additional details about the pubsub brokers.
This PR will let us pass these details to the pubsub clients in the integration testing env.
…edin#516)

* [controller][test] Fix a race condition between kill job and version swap

Today there's a race between kill job and push completion or version swap. This is because we only write the kill message and
do nothing else, relying on the servers to receive the kill, report ERROR to mark a version killed, and prevent it from becoming
current/active. However, this is not always the case, especially when the push strategy only requires n-1 replicas to be
completed in every partition. This means if a kill is sent when a push is near completion or almost completed, the version can still
become ONLINE and the current version after the kill was sent and recognized by some SNs. This is an issue because we now have a
current version that also has a kill message in the participant store. No re-balance or re-bootstrap can succeed for
this store version. The current mitigation is to re-push or roll back to the previous version when this race happens.

A prerequisite rb (linkedin#269) that introduces the new VersionStatus.KILLED has been deployed in production, which allows us to continue with the
following fix.

This fix includes:

1. On kill, we take a store lock (since the current version swap will also be acquiring this lock) when checking if the version
   to-be-killed is still in a non-complete state. Update the version status on ZK to KILLED, if it is (see the sketch after this list).
2. On version swap, we check if the version to which we are swapping has been killed already by checking the VersionStatus.
3. An integration test with mocks to verify the correctness of the above behavior.
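A rough sketch of the kill-path check in (1); the method names, lock, and persistence helper are hypothetical stand-ins, not the actual controller implementation:

```java
// Sketch of the kill-path check under the store-level lock (illustrative names only).
void killVersionIfNotCompleted(Store store, int versionNumber) {
  synchronized (store) { // the same store-level lock the version-swap path acquires
    Version version = store.getVersion(versionNumber);
    if (version != null && version.getStatus() != VersionStatus.ONLINE) {
      // Mark it KILLED on ZK so a later version swap will refuse to make it current.
      version.setStatus(VersionStatus.KILLED);
      persistStoreToZk(store); // hypothetical helper
    }
  }
}
```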
* [davinci][ii] Remove zk usage from II for davinci stores

User libraries should not access zk. Additional work could be done to remove zk completely from the sub project, but given the way things are currently architected, that could be quite a lot of work, and we need to make some incremental progress here.

* fix test

* add localhost util

EOM

* Tweak test.  This isn't great though, the intent is to use two processes.

* Revert "Tweak test.  This isn't great though, the intent is to use two processes."

This reverts commit 6aa73d8.

* Final fix for testing

* Remove commented code

* sanity check

* More sanity check

* Fix port assignment in the integration test framework to make the cluster initializer not advertise a non-connected port.

* some clean up

---------

Co-authored-by: Zac Policzer <zpolicze@zpolicze-mn1.linkedin.biz>
…n#525)

* Ignore slow node logic during retry when a 5XX is encountered, so retry can still request nodes from the slow node set
The DIV state could grow unboundedly since it tracks some metadata
for each producer and never clears it out. This is problematic
for use cases where a hybrid store is RT-only, meaning it doesn't
rev up store-versions, and as the servers are bounced over time,
the leader replicas generate new GUIDs over time, which linger in
the DIV state forever. In extreme cases, this leads to OOM errors.

This change mitigates the unbounded growth by introducing a max
age, beyond which DIV metadata can be evicted. This is controlled
by a new config, which is disabled by default:

div.producer.state.max.age.ms (default: -1)

The clearing logic is inserted in the updateOffsetRecordForPartition
function of KafkaDataIntegrityValidator, which is itself called
prior to each sync (both as part of ingestion when a threshold is
reached, and during graceful shutdown). It is also called in the
setPartitionState function of the same class, which is called when
subscribing to a partition, including when the server starts up.

The age of a producer is determined by the most recent timestamp
within a partition (computed by OffsetRecord::getMaxMessageTimeInMs)
minus the most recent timestamp of that producer (which is the
Segment's lastRecordProducerTimestamp). These timestamps can be
subject to drift, but in practice it probably isn't going to drift
enough to be a concern. Importantly, because the math is entirely
within a single partition, one partition can be lagging
(e.g. as would be the case in a re-bootstrap scenario) while another
may be caught up, and each partition will independently clear its
own state without contaminating the others.
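A hedged sketch of the age-based eviction described above; the method and field names are illustrative and do not mirror the actual PartitionTracker implementation:

```java
// Sketch only: evict producer state older than div.producer.state.max.age.ms.
void clearExpiredState(long maxMessageTimeInMs, long maxAgeInMs, Map<GUID, Segment> segments) {
  if (maxAgeInMs < 0) {
    return; // feature disabled (div.producer.state.max.age.ms = -1)
  }
  segments.entrySet().removeIf(entry -> {
    long lastProducerTimestamp = entry.getValue().getLastRecordProducerTimestamp();
    // Age is computed entirely within one partition: the newest message time in the
    // partition minus this producer's most recent timestamp.
    return maxMessageTimeInMs - lastProducerTimestamp > maxAgeInMs;
  });
}
```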

This property of isolating the partitions' expired state clearing
is achieved by inverting the KafkaDataIntegrityValidator's internal
data structure. Previously, it contained a Map<GUID, ProducerTracker>
with each PT containing one Segment per tracked partition. Now, it
instead contains a SparseConcurrentList of PartitionTracker (which
is essentially the same as ProducerTracker, but renamed), each of
which contains a Map<GUID, Segment>. This is a more natural structure
that aligns with other parts of the code as well. It also results in
eliminating 2 map lookups from the DIV hot path (called 1/msg in
followers, and 2/msg in leaders).

Note that currently the state clearing happens only in the "main"
KafkaDataIntegrityValidator, which is used by drainers and persists
its state to disk. There is another KafkaDataIntegrityValidator
used for leader replicas only, which tracks the RT topic's DIV state,
but that one is transient and not persisted to disk. As such, it is
not expected to grow as large, and bouncing effectively clears it
(which is not the case for the one which gets persisted). Later on,
we could trigger state cleaning for this other validator as well.

Miscellaneous changes:

Removed the RedundantExceptionFilter::getRedundantExceptionFilter
with parameters, since that is broken by design... this and the
non-parameterized functions both returned a singleton object, so
the actual parameters of the singleton were driven by the timing
of which code path got called first, which makes no sense.

Moved ProducerTracker (renamed as PartitionTracker) and its unit
test from venice-common to dvc, since it is only used in the latter.

StoreIngestionTaskTest::testDataValidationCheckPointing no longer
has invocationCount = 3, as this is a pretty long test, and it
doesn't seem useful.

Introduced a ReentrantLockWithRef, a new utility class which
extends ReentrantLock and also carries a ref to another object.
This is used in PartitionTracker, where previously we would need
to query two maps, one to get the lock, and the other to get the
protected object. Since we were always operating on the same key
of both maps, we could instead group the elements into this shared
container, hosted in a single map.
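A rough sketch of the ReentrantLockWithRef idea described above: one container holding both the lock and the object it protects, so a single map lookup suffices. The actual utility class may differ in naming and details.

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch: a lock that also carries a reference to the object it guards.
public class ReentrantLockWithRef<T> extends ReentrantLock {
  private volatile T ref;

  public T getRef() {
    return ref;
  }

  public void setRef(T ref) {
    this.ref = ref;
  }
}
```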

Also removed several unnecessary objects from Segment. A server
heap dump was found containing >3M instances of this class, so
making it more efficient should be beneficial (i.e. even when DIV
state clearing is not enabled yet).

Checksum::getInstance does not return an Optional anymore.

Reduced string concatenation in several places.
…t server start time (linkedin#531)

When a server starts, if system schema initialization is enabled, it will call
controller to check and register new meta system store schemas. Therefore,
servers can be deployed before controllers when meta system store schema is
upgraded. Controller url (for open source users who do not want to onboard d2)
or d2 configs must be provided to enable system schema initialization at start
time.
…nice components (linkedin#538)

This commit adds support for passing non-mergeable broker details to the Venice components. This  
means that if there is a key that is present in the broker details for multiple brokers, only the value for   
the last broker will be kept. Previously, all values for the same key would be concatenated together.
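For illustration, "last broker wins" semantics amount to a plain map merge where later entries overwrite earlier ones; the merger class below is a hypothetical stand-in, not the actual test-framework code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: non-mergeable broker details resolve to "last broker wins".
public final class BrokerDetailMerger {
  public static Map<String, String> merge(List<Map<String, String>> perBrokerDetails) {
    Map<String, String> combined = new HashMap<>();
    for (Map<String, String> details : perBrokerDetails) {
      combined.putAll(details); // later brokers overwrite earlier values instead of concatenating
    }
    return combined;
  }
}
```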
…linkedin#535)

This PR aims to clean up deprecated value and RMD chunks to avoid storage engine chunking leaks. When we process a partial update request, it will also try to look up the old value and RMD manifests and will produce DELETE messages to the old value and RMD chunks to remove them.
Note:

As discussed offline, this is a short-term solution to clean up disk space, but it will not clean up Kafka version topic garbage chunks, as even after log compaction there will be one DELETE message left per deprecated chunk key. We think this is acceptable at this stage, since a DELETE message is considered small, and we can work on a complete long-term solution later.
Based on the current implementation, we cannot fully delete the RMD of a key; we can only insert an empty RMD into it. (If this is not acceptable, we will need to add new implementations to execute a real RocksDB delete on the RMD Column Family.)
We refactor the chunking get method to store the retrieved value's manifest if it is chunked. That means for AASIT, we will always delete old RMD chunks (as the RMD is always fetched for every operation) and we will make a best effort to delete the old value's chunks (in theory, we will always do it for UPDATE). For LFSIT, we will always delete value chunks for the UPDATE operation.
…ory` for integration test and venice components. (linkedin#493)

This change supports plugging in a PubSubClientFactory from server config and controller config. This will enable configuring a specific combination of PubSub clients. For example, we may have the admin factory and producer factory for one type of PubSub client while the consumer factory is of another type. This change will also benefit Ingestion Isolation and the DaVinci client with a specific PubSub client.

---------

Co-authored-by: Hao Xu <xhao@xhao-mn2.linkedin.biz>
…n#534)

Read-compute functionality was available to select users and required support from service operators to ensure safety and stability for all users. This change makes the read-compute interface available to all users by leveraging batch-get and local computations by default. This allows service operators to enable/disable remote computations without disrupting user applications.

The order of operations is as follows:
Upon start, the client sends a compute request and sets a new HTTP header VENICE_CLIENT_COMPUTE to indicate that client-side computation is supported.
The router checks the store config, and when remote computation is off, fails the request if client-computation is not available. Otherwise, the router converts the compute request into a batch-get.
The client checks the HTTP headers to resolve what kind of response it got, and if necessary performs the computations locally.
Subsequent computation requests are converted to batch-get on the client side to avoid extra work and sending extra data over the network.
The batch-get response always carries a header indicating the latest availability of remote computation so that the client can switch back and forth if needed (e.g., when the service operator makes changes to the store config).
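To make the header-driven fallback concrete, here is an illustrative decision helper; the response header name and shape are assumptions, not the actual thin-client implementation (only the VENICE_CLIENT_COMPUTE request header comes from the description above).

```java
import java.util.Map;

// Illustrative decision logic only; header name is a hypothetical stand-in.
public final class ClientComputeFallback {
  /** Returns true when the client should apply the compute operations locally. */
  public static boolean shouldComputeLocally(Map<String, String> responseHeaders) {
    // The batch-get response advertises whether remote compute is currently enabled,
    // so the client can switch between remote and local computation as store config changes.
    return "false".equals(responseHeaders.get("venice-remote-compute-enabled")); // hypothetical header
  }
}
```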

Summary of changes:
* Moved common computation logic from server and da-vinci-client to ComputeUtils to reduce code duplication.
* Merged com.linkedin.venice.utils.ComputeUtils and com.linkedin.venice.compute.ComputeOperationUtils into
com.linkedin.venice.compute.ComputeUtils.
* Moved AbstractAvroStoreClient::StoreClientStreamingCallback out to reduce client complexity and maintain independence between transport and client logic. The new decoder layer is responsible for deserializing records from a stream of bytes.
* Added storage_engine_read_compute_efficiency server metric to help monitor data compression due to read compute.
* Added multiget_fallback metric to both client and router to monitor client-computation activity.
…#533)

RocksDB offers a set of multiget APIs besides the single-get API, and
it claims that they should improve performance by parallelizing
the IO operations with Linux kernel 5.1 or later.
We also want to benchmark these APIs for the PT format since, in theory,
they should reduce the JNI cost because of fewer JNI calls.

Based on the benchmarking, the multiget API performs
worse than the single-get API (which is weird), and we would like to keep
these benchmarks and code so that we can evaluate the BT format in the future.

PT benchmark result (subset):
Benchmark                                                   (BATCH_SIZE)  Mode  Cnt      Score      Error  Units
RocksDBLookupApiBenchmark.measureMultiGetAPI                           1  avgt    5  18195.987 ± 1484.887  ms/op
RocksDBLookupApiBenchmark.measureMultiGetAPI                           2  avgt    5  16774.263 ±  212.736  ms/op
RocksDBLookupApiBenchmark.measureMultiGetAPI                          10  avgt    5  18206.964 ± 3331.436  ms/op
RocksDBLookupApiBenchmark.measureMultiGetAPIWithByteBuffer             1  avgt    5  27395.865 ±  688.723  ms/op
RocksDBLookupApiBenchmark.measureMultiGetAPIWithByteBuffer             2  avgt    5  23750.618 ± 4607.772  ms/op
RocksDBLookupApiBenchmark.measureMultiGetAPIWithByteBuffer            10  avgt    5  15060.939 ±  941.485  ms/op
RocksDBLookupApiBenchmark.measureSingleGetAPI                          1  avgt    5   8542.114 ±  378.653  ms/op
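For context, a minimal sketch of the two lookup styles being compared, using the public RocksDB Java API; the path and keys are placeholders and this is not the actual benchmark harness.

```java
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

import java.util.List;

public final class LookupApiComparison {
  public static void main(String[] args) throws RocksDBException {
    RocksDB.loadLibrary();
    try (RocksDB db = RocksDB.open("/tmp/benchmark-db")) {
      List<byte[]> keys = List.of("k1".getBytes(), "k2".getBytes());

      // Single-get: one JNI call per key.
      for (byte[] key : keys) {
        byte[] value = db.get(key);
      }

      // Multi-get: one JNI call for the whole batch; RocksDB may parallelize the IO internally.
      List<byte[]> values = db.multiGetAsList(keys);
    }
  }
}
```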
… operations (linkedin#537)

* [controller] Update superset schema ID after all value schema related operations

The existing implementation updates the superset schema id as soon as we add a new value or superset schema.
This is problematic because in the parent controller we perform schema addition in the following order:
1. Add new value schema (superset schema id is also updated).
2. Check if A/A is enabled and update RMD schema.
3. Check if WC is enabled and update derived schema.

This execution order allows a race to happen when the superset schema id is updated but the corresponding RMD schema
and derived schema are not available yet, resulting in failures in downstream consumers like the Venice storage
node during ingestion.

The fix is to perform the superset schema id update as the last step and separate it into its own update store admin message.
This was flagged when reviewing 6c8b672 and left as a followup.

The historical context is that there was a temporary phase of the
A/A migration where RT producers would be replicated to all regions
and so servers would consume duplicate events concurrently, even
some written by the same producer GUID. This led to race conditions
in the DIV code, and so this locking was introduced to guard
against it. The locking was retained as part of the aforementioned
commit, but is not actually necessary anymore, so we're cleaning
it up. This is expected to provide a small boost to ingestion
performance as well as to the memory efficiency of DIV, especially
in cases where there are lots of producer GUIDs involved.

Miscellaneous:

- Tweaked the VeniceSystemFactoryTest::testSerializationCastParams
  data provider so that it returns parameters with a deterministic
  toString output, which is necessary for the test-retry plugin to
  work well.
…child controller client (linkedin#539)

To resolve the KME upgrade deployment order, one action is to make child controllers
and servers register the KME from the code into the local system store at startup time.
However, KME is only backward compatible, and adding a schema that is not fully
compatible via the controller client is not supported. This commit supports adding a
value schema with an id and compat type via the child controller client.
…inkedin#542)

* [vpj] Reuse KafkaConsumer when building dict from an existing topic

Previously, `KafkaInputDictTrainer` would initiate a new KafkaConsumer
per partition, which results in a ton of KafkaConsumer-related logs printed
in the VPJ log, which would make VPJ log analysis less efficient.
This code change will reuse the same consumer when extracting data messages
from different partitions to reduce the total amount of logs produced
by KafkaInputDictTrainer.

* Addressed comment
* [controller] Cleanup Helix disabled partition map

During an ingestion error in the leader, we add a disabled partition map config in helix. But even if those resources are dropped later, the disabled map is not cleared, which might later cause the znode to blow up. This PR will clean up the disabled partition map (by calling enable, which will clear the node) when resources are dropped.


Co-authored-by: Sourav Maji <tester@linkedin.com>
FelixGV and others added 28 commits August 7, 2023 10:05
…kedin#558)

Prior to this commit, the ComputeRequestWrapper class was used
across client, router and server, but these components all had
different needs and the end result was a bit awkward, as it led
to the class needing to be constructed in a way that was not
fully initialized, only to have the rest of the init happen
later by mutating its internal state. This seemed error-prone.

Furthermore, the class attempted to serve as a way to evolve
the read compute wire protocol, and although we did evolve read
compute, the way we evolved it was already compatible anyway,
and if we had evolved it in less compatible ways, it's not
clear that the original structure of this class would have been
good enough to support such evolution.

In this commit, we are cleaning all of that up and simplifying
many aspects of the code in the process. Notably:

- ComputeRequestWrapper is now only used in the client, and it
  is fully initialized at construction-time, with all internal
  state final.

- The read compute protocol version passed in a header is still
  checked when parsing an incoming request, but then it's not
  carried into deeper code paths, where it was essentially
  ignored and useless.

- Created a SchemaAndToString container class to replace some
  usages of Pair<Schema, String>.

- Added a ComputeRequest specific record, which is the same as
  ComputeRequestV[1-4] except that the items inside the list of
  operations are strongly-typed, rather than objects. We do
  still need to keep (at least one of) the other schemas around
  since those are the ones the clients use on the wire, but we
  can cheaply convert between those and the new one, and so the
  server code need not do as much casting as before.

- In the router, the VeniceComputePath does not hang on to a
  ComputeRequestWrapper anymore, which it didn't need to. It
  merely skips over the bytes represented by the compute
  request, with as little allocation as possible. It also hangs
  on to the raw String headers it needs rather than convert
  them to int and then back to strings moments later. Made all
  state final.
Introduced pub-sub client exception classes for uniform exception handling within PubSubAdminAdapter    
implementations. By adopting these exceptions, TopicManager can now base its retry logic on pub-sub   
client exceptions rather than system-specific exceptions. This will improve error handling and reliability.     

The commit also includes the following improvements:  
- Comprehensive Javadoc documentation outlining the behavior of APIs in PubSubAdminAdapter and   PubSubProducerAdapter  
- Addition of extensive unit tests for the ApacheKafkaAdminAdapter implementation  
- Structural changes and code cleanups
… clients (linkedin#570)

Currently, if new clients would like to add support for different POJO formats in their interfaces, they need to override the "initSerializer" function to configure the serializers to serialize the POJO to Avro. The base function also configures the serializers for "compute" and "batchget" requests, so new clients would have to specify serializers for both of these explicitly.

This commit reduces the requirements for new clients to only specifying the key serializers, and they get "batchget" support by default.
…allback when store is not A/A or partial update enabled (linkedin#571)

Resolves excessive logging found in the internal certification flow. This newly added log line would be logged for stores that do not have A/A or partial update enabled. This PR fixes the issue by only logging it when the TR cache is used.
…t construction (linkedin#575)

Currently, we extract Kafka client properties from VeniceProperties based on prefixes.
However, this approach sometimes leads to passing irrelevant properties to clients,
as these properties may contain settings intended for other client types. When
irrelevant properties are passed to Kafka clients, they log a message which ends up
polluting the logs and can cause confusion.

Fixed this issue by adding an additional property check to ensure their validity for
the specific client being constructed.
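One possible form of such a validity check, as a sketch: Kafka exposes the set of config keys each client type recognizes, so unknown keys can be dropped before constructing the client. The helper class below is hypothetical and may not match the actual Venice utility.

```java
import org.apache.kafka.clients.producer.ProducerConfig;

import java.util.Properties;
import java.util.Set;

public final class KafkaPropertyFilter {
  /** Keep only properties that the Kafka producer actually recognizes (sketch). */
  public static Properties filterForProducer(Properties extracted) {
    Set<String> validKeys = ProducerConfig.configNames();
    Properties filtered = new Properties();
    for (String key : extracted.stringPropertyNames()) {
      if (validKeys.contains(key)) {
        filtered.setProperty(key, extracted.getProperty(key));
      }
      // Unknown keys are dropped instead of being passed to the client, which avoids
      // the "was supplied but isn't a known config" log noise mentioned above.
    }
    return filtered;
  }
}
```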
…inkedin#573)

To start a router on a specific port, passing `LISTENER_PORT` configuration via   
`extraProperties` could accidentally cause it to be utilized by the VeniceServer   
which could lead to port conflicts. To resolve this, introduced  a separate config   
for starting a router on a designated port. This fix ensures that the port    
assignment remains distinct and does not interfere with other components.  

Additional improvements:
- Removed unused classes related to Zk client.  
- Removed a log statement that was polluting the logs.
For stores without a pushstatus system store, controller logs get flooded with full exception stacktraces. Also, in the server, most of the time the RMD SST file writers are empty, so do not print any warn log for them.
Co-authored-by: Sourav Maji <tester@linkedin.com>
…iter configs (linkedin#576)

This commit allows the users of the online producer to specify arbitrary configs for the underlying VeniceWriter.
* [controller][test] Fix issues in data recovery API

This rb targets several issues in the data recovery API:

1. Today, during data recovery, only when the version number from the source equals the current version in the dest colo
do we bump a store's version to LargestUsedVersionNumber+1 in parent and use the increased version for data recovery.
So, if the version number from the source is less than the current version in the dest, we simply use it in the dest.
As a result, data recovery cannot complete successfully, because the backup version cleanup thread can kick in, consider it
an expired backup version, and delete it.

As a fix, we should extend the current condition and increase the store version for such cases as well.

On the other hand, if the version number from the source colo is larger than current version in dest colo, we can use it for
the dest colo without increasing it. This is fine because it is larger than the current, and resources for this version will
be deleted during prepareDataRecovery. So, even though the version number is used (created and then deleted) in dest colo
before, it can still be reused.

2. Users can sometimes put the same colo name for src and dest, which is considered an erroneous operation.
This rb adds a guardrail to protect against such cases and fail early.

Integration tests are also improved to cover the above two issues and verify the correctness of this fix.
…adata update bug and add schema evolution support (linkedin#560)

* [venice-client][venice-admin-tool] Fixed a bug where we only update FC metadata on new store version

1. The problematic behavior was that a metadata update would only be applied in FC if the fetched metadata
belongs to a new store version. This behavior is unexpected because the routing info can change for
the same store version (e.g. a replica is rebalanced).

The same behavior is found for the HAR info, and it is also fixed to be updated every time. Using an atomic
reference to an object that bundles the map and list to avoid inconsistent access when updating HAR info.

2. Added new Admin Tool command to get the request based metadata payload from a given server.

3. Added full schema evolution support for MetadataResponseRecord. Backwards compatibility is ensured by adding a
new MetadataResponseRecordCompatibilityTest. Forward compatibility is supported using
RouterBackedSchemaReader and schema system store.

This means we need to have a few routers around in the cluster to perform new schema discovery when we evolve the MetadataResponseRecord.

4. Fixed a bug in server where we used version number instead of the entire resource name (store + version number) to get routing table for request based metadata.
…din#582)

This change reverts a bunch of changes that were made in PR linkedin#472.

There was a deadlock in production when users used the getKeySchema method on the client while the initial client warm-up failed and an async warm-up was executing.
Many classes were located in venice-client-common, yet were only used
within DVC. This commit moves these classes to DVC. Ideally, they
should eventually end up in server, but since DVC itself is tangled
with server code, that is difficult to refactor in one shot. At least
for now venice-client-common will be less bloated, and we'll reduce
the likelihood of future tangling to continue.

Miscellaneous:

- Renamed several ID to Id in variables and method names, at the
  request of Spotbugs.

- Removed the --stacktrace argument from the CI, as it makes it harder
  to scroll to the root cause of a build failure, and typically does
  not contain any useful information (but rather, only Gradle internal
  details).

- Also enabled MergeGenericRecordTest::testUpdate which was previously
  disabled. Had to fix it up a bit so it could run.

- Introduced a deserializer cache in VeniceRmdTTLFilter to get rid
  of some code which was instantiating MapOrderPreservingDeserializer
  instances on the hot path. Map order preservation is only necessary
  within the server when we will deserialize and then reserialize the
  payloads, but that was not the case in the TTL filter.

Module-level test branch coverage changes:

- venice-client-common: 37.76% to 44.33%
- da-vinci-client: 48.69% to 51.7%
…er test (linkedin#581)

* [da-vinci] Added one more config to simplify the DaVinci memory limiter test

Some DaVinci customers want to validate the DaVinci memory limiter feature
without affecting prod in a canary instance, and currently it is not possible
since the DaVinci memory limiter applies globally.
To simplify the test, this code change introduces a new config:
ingestion.memory.limit.store.list : default empty

When it is specified, only the configured stores will enforce the memory
limiter, and if it is not specified (default), all the stores in the
same DaVinci instance will enforce the memory limiter.

This way, the customer can specify a test store to test the DaVinci
memory limiter feature, and when the bootstrapping of the test store fails
because of memory exhaustion, only the push job to the test store will fail.
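A hedged example of how the new config could be scoped to a single canary store; only "ingestion.memory.limit.store.list" comes from this change's description, and the store name is a placeholder.

```java
import java.util.Properties;

// Illustrative only: limit memory-limiter enforcement to one canary store.
public final class MemoryLimiterCanaryConfig {
  public static Properties build() {
    Properties props = new Properties();
    props.setProperty("ingestion.memory.limit.store.list", "test_store"); // limiter enforced only for this store
    return props;
  }
}
```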

* fixed test failures

* Fixed test failures

* addressed comment
During a target colo push for stores with a DaVinci client, or during data recovery, Venice uses data recovery client endpoints to enable the push to the rest of the colos, but a DaVinci client running in follower mode does not have the configs needed for data recovery, which leads to DaVinci host ingestion failure.
This PR skips the data recovery code path in DaVinci clients.

---------

Co-authored-by: Sourav Maji <tester@linkedin.com>
…inkedin#532)

* FC batch get uses instanceHealthMonitor to check for healthy instances before sending out the post request, but doesn't update the health monitor if an instance ends up being healthy/unhealthy: this change reuses the methods which the single get uses, and it is counted per route request, not per key.
* FC (both single/batch gets) doesn't increment metrics when the instance is not reachable: currently, in such scenarios, a future is completed based on routingLeakedRequestCleanupThresholdMS to update the health data, but it doesn't complete the value future and the stats future. Made changes such that we end up completing these futures, and added tests.
* When both the original and retry requests fail and the stats future is completed, only the unhealthy count and latency metric are incremented, not providing any insight into the reasons (e.g. noAvailableReplica) or other details like whether retries happened for the request. Modified that flow to increment a few more metrics for such failure cases.
* cleaned up metric verification for some tests
* [vpj] Send pushjob details during post-target pushjob
The push job details are generated and sent through the controller for various offline tracking of push job health. During the post canary colo job, it missed sending out push job details to the controller, thus it would appear in the push job detail store that the other colos did not finish the push job. This PR fixes that issue.

---------

Co-authored-by: Sourav Maji <tester@linkedin.com>
Replaced several lookups by field name to use field position instead in
write compute code paths. There are probably more such opportunities
left. This is just one pass for cases where the safety of the usage is
more obvious.

Also moved CollectionTimestampBuilder to tests since it was meant from
the start to be a test-only class.
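For illustration, the generic Avro API difference behind this change (not the actual write-compute code): resolve the field position once and read by index, instead of doing a name-based lookup per record.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;

public final class FieldPositionLookup {
  public static Object readField(GenericRecord record, Schema.Field field) {
    // record.get(field.pos()) is an index-based access, while record.get("name")
    // resolves the name to a position on every call.
    return record.get(field.pos());
  }
}
```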
…er tool (linkedin#590)

The CLI tools to read ("QueryTool") and write ("ProducerTool") data to Venice do not handle complex types currently. This commit adds the support to read and write data where the store has complex data types in the schema.
… transition (linkedin#578)

Previously, in O/B/O mode, there was an explicit delay for the OFFLINE to DROPPED transition;
we missed this delay when building L/F mode. This PR adds the delay back, and the delay only applies
to current version resources.

Also adds a stat to track the number of threads being blocked on the OFFLINE to DROPPED transition.
We see some hosts contained in the /CONFIG/PARTICIPANT path but not in /INSTANCES of a cluster. If we query the list of instances using ZKHelixAdmin::getInstancesInCluster and call getDisabledPartitionsMap, it throws a HelixException. This PR skips such exceptions to continue cleaning the helix resources. It also bumps the Helix dependency to 1.1.0, which resolves the exception log in the zk node deletion code path.
---------

Co-authored-by: Sourav Maji <tester@linkedin.com>
Previously, the admin tool constructed ApacheKafkaConsumerAdapter without relying on PubSubConsumerFactory#create(), so the bootstrap servers URL property could not be correctly translated into a Kafka property. This change uses the factory to create the adapter class to leverage the config translation functionality, unifying the admin tool's factory usage and naming.

Co-authored-by: Hao Xu <xhao@xhao-mn2.linkedin.biz>
…ock dependency incompatible issues. (linkedin#595)

A previous refactoring removed this constructor, which caused a downstream dependency issue. To unblock this, we add the constructor back and mark it deprecated, as we want downstream users to use the other constructor to pass a specific PubSubClientFactory.

Co-authored-by: Hao Xu <xhao@xhao-mn2.linkedin.biz>
…inkedin#565)

- Adapts Netty Channel pipeline and corresponding handler classes to be compatible with gRPC requests
- Can collect server side statistics and enforce read quota for gRPC requests
- Introduce mutual TLS support on gRPC enabled servers and fast client, when an SSLFactory is present
- Update unit tests and integration tests
* [server][admin-tool] Improve Server Ingestion State Dump

This code change adds a new command to dump ingestion state;
Venice Server will also skip the ACL check for the admin request endpoint
since there is no PII data.
In the meantime, this code change will skip the compression dict in the
response, since it is not readable, and string-format the `Fixed` type.

* Addressed unit test coverage issue

* Fixed a comment

* Fixed unit test coverage failure
…icDumper (linkedin#599)

Recently, we added support to emit DELETE messages to clean up leaked chunks (in PR linkedin#535), but we didn't handle this in the topic dumper tool.
@sushantmane sushantmane deleted the jdk17-venice-ghci branch October 21, 2023 21:37