-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OpenFeature integration #131
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: see-quick <maros.orsak159@gmail.com>
Signed-off-by: see-quick <maros.orsak159@gmail.com>
Signed-off-by: see-quick <maros.orsak159@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some comments. But apart from them:
- I think the abiility to configure the feature gates per-cluster si much more important than the ability to configure the feautre gates dynamically.
- I don't understand the focus on Flagd. My assumption was that OpenFeature provides us a generic API for feature flagging that works with many existing systems and allows users to bring their own implementations as well. This PR reads as if it is not the case - should we in that case wait for some actual demand for Flagd first? I think it is fine to use Flagd as an example of how it might work with the other systems. But the proposal seems to hardcode it in.
- I would focus less on including long code segments and more on the actual impact show in shorter examples and/or explained with words.
- The backward compatibility is not clear to me.
084-open-feature-integration.md
Outdated
For the current users (i.e., having classic ENV variable STRIMZI_FEATURE_GATES), we will use [Enviroment Variable Provider](https://github.com/open-feature/java-sdk-contrib/tree/main/providers/env-var) | ||
already implemented by OpenFeature folks. Some of the common characteristics of such provider are: | ||
1. Default method currently used. | ||
2. Feature gates are set via environment variables. | ||
3. Requires rolling updates for changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You link to an existing provider. But how does it provide the backwards compatibility for how we configure the feature gates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simply by using just STRIMZI_FEATURE_GATES
env var in all operators as we do it currently (but such ENV would be moved to FeatureGates class).
084-open-feature-integration.md
Outdated
On the other side for the more complex users with need of flagging system, we would support of [FlagD](https://github.com/open-feature/flagd), | ||
which is pretty known because [OpenFeature is CNCF Incubating project](https://www.cncf.io/projects/openfeature/). | ||
Common characteristics of this provider are: | ||
1. New method proposed for more complex users, which needs better management over multiple features (of Strimzi) | ||
2. Integrates with FlagD for dynamic feature flagging | ||
3. Allows changes without the need for rolling updates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the way it would work is that:
- We would have our own default provider for how we configure the feature gates
- Users can use any of the provider (and while we might bundle some, they might also bundle their own)
But you make it sound here like we need to explicitly pick that we are gonna support Flagd?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me, it was the easiest option to do it at the start. But I agree it would be better to let for instance in
private static final String FLAGD_ENABLED_ENV_VAR = "FLAGD_ENABLED"; // Environment variable to toggle FlagD
instead of FLAGD_ENABLED
here use FlagProvider
name and in our code we could add multiple providers based on that ENV-VAR. Here is the list, which is supported by OpenFeature
[1]. So there are plenty of them.
[1] - https://mvnrepository.com/artifact/dev.openfeature.contrib.providers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eventually, we can have possibly:
/**
* Maps the value of OPEN_FEATURE_PROVIDER_NAME to the corresponding provider.
*
* @return The corresponding FeatureProvider instance based on the environment variable.
*/
private FeatureProvider getProviderFromEnv() throws InvalidOptions {
String providerName = System.getenv(OPEN_FEATURE_PROVIDER_NAME_ENV);
// Default to EnvVarProvider if the environment variable is not set
if (providerName == null || providerName.trim().isEmpty()) {
return new EnvVarProvider();
}
// Create a mapping between the environment variable and providers
Map<String, FeatureProvider> providerMap = new HashMap<>();
providerMap.put("flagd", new FlagdProvider());
providerMap.put("env-var", new EnvVarProvider());
providerMap.put("flagsmith", new FlagsmithProvider());
providerMap.put("configcat", new ConfigCatProvider());
providerMap.put("statsig", new StatsigProvider());
providerMap.put("unleash", new UnleashProvider());
providerMap.put("jsonlogic", new JsonlogicProvider());
providerMap.put("flipt", new FliptProvider());
providerMap.put("go-feature-flag", new GoFeatureFlagProvider());
// Return the corresponding provider or default to EnvVarProvider
return providerMap.getOrDefault(providerName.trim().toLowerCase(), new EnvVarProvider());
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and then everything would be handled by the OpenFeatureAPI Client for every provider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so adding new provider will require code change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but currently OpenFeature
has only these providers, which I have listed. So I think it would be just a one-timer change. If in the future a new flagging system comes into the picture (of course then we would need to add it :))
Okay, I will think about how we can achieve it and update the proposal.
You are correct OpenFeature provides a generic API for feature flagging I was thinking that we would start with just FlagD support. But looking at this again it's better to let the user decide what provider he wants to use...So I will update the proposal accordingly to be more generic and not focused only on FlagD. |
Signed-off-by: see-quick <maros.orsak159@gmail.com>
…re gates configuration Signed-off-by: see-quick <maros.orsak159@gmail.com>
Signed-off-by: see-quick <maros.orsak159@gmail.com>
Signed-off-by: see-quick <maros.orsak159@gmail.com>
Signed-off-by: see-quick <maros.orsak159@gmail.com>
Signed-off-by: see-quick <maros.orsak159@gmail.com>
Signed-off-by: see-quick <maros.orsak159@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @see-quick, I left a few comments. Right now although I can see some value I'm not completely convinced the advantages outweigh the disadvantages.
| | | ||
| <----------- API Calls -----------> | | ||
|
||
Where in each component (i.e., ClusterOperator, UserOperator and TopicOperator), we will fetch feature flags dynamically from the OpenFeature API, which is managed by feature flagging server. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens in the communication to the server fails? I guess we're unable to proceed with the reconciliation. Thus the availability of the Strimzi operators becomes coupled to the availability of this feature flagging server. I don't think that's a problem for the proposal, but calling out explicitly for users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens in the communication to the server fails? I guess we're unable to proceed with the reconciliation. Thus the availability of the Strimzi operators becomes coupled to the availability of this feature flagging server
Well this can be handled I think because we would end up in a scenario where:
2024-09-27 09:10:18 ERROR GrpcConnector:149 - failed to connect to event stream, exhausted retries
client (strimzi) wants to reconnect because he is getting a GrpcConnector
error. We can catch that and use just the default value of FEATURE_GATE
; therefore, we no longer depend on feature flagging server availability. And if such error would disappear (feature flagging system is up) we continue with a normal state.
084-open-feature-integration.md
Outdated
- if: | ||
- or: | ||
- and: | ||
- "==": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this and
operator only has a single operand, which is weird. Is this a mistake/bad example, or is it imposed by the CRD validation schema for the FeatureFlag
kind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is a mistake/bad example. The correct should be (locally I have fixed in my example files but didn't update proposal thanks):
targeting:
if:
- or:
- "==":
- var: clusterName
- "kafka-cluster-a"
- "==":
- var: clusterName
- "kafka-cluster-b"
- "on"
- "off"
|
||
### Potential Challenges | ||
|
||
- **Complexity:** Increased complexity in configuration management. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the big one for me. If there were lots of users asking for the functionality that open feature provides then it's much easier to justify the complexity. But AFAIK no one is asking for this. That means the benefits are really internal at this point (the testing benefit).
|
||
- **Flexibility:** Users can toggle features without redeploying or restarting services. | ||
- **Faster Iterations/Testing:** Features can be tested and rolled out quickly, speeding up development cycles. | ||
- **Centralized Management:** feature flagging system integration allows centralized control of feature flags, simplifying management across multiple components. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've not looked closely at OpenFeature, but externalising the logic used for turning feature gates on and off could be a double edged sword. In particular, if I'm editing a FeatureFlag
how do I know what will be affected? It could be 'nothing', or it could be 'everything' (it's not even limited at just being Strimzi things AFAICS). Does Open Feature help in figuring out the impact, or is it up to the user to piece it together?
I also worry a bit about discoverability: When feature flags are part of the operator deployment it's pretty obvious. When they're in some other resource then you need to know about the existence of FeatureFlag
(as a thing), as well as discovering which instances of it (could there be multiple ones?) the operator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've not looked closely at OpenFeature, but externalising the logic used for turning feature gates on and off could be a double edged sword. In particular, if I'm editing a FeatureFlag how do I know what will be affected? It could be 'nothing', or it could be 'everything' (it's not even limited at just being Strimzi things AFAICS). Does Open Feature help in figuring out the impact, or is it up to the user to piece it together?
It up to the user to piece it together. little offtopic: Maybe one option would have some examples in DOCs but I am not sure about their maintainability in terms that there are plenty of providers (flagd is just one of them). Another point is that OpenFeature Operator supports just FlagD
. So other providers would need to run in some form of Deployment and they all have different configurations options.
FeatureFlag (as a thing), as well as discovering which instances of it (could there be multiple ones?) the operator
Yeah you can deploy multiple of them AFAIK. Firstly you have to define FeatureFlagSource
e.g.,
apiVersion: core.openfeature.dev/v1beta1
kind: FeatureFlagSource
metadata:
name: strimzi-feature-gates-source
namespace: myproject
labels:
app: open-feature-demo
spec:
sources:
- source: strimzi-feature-gates-flags
provider: kubernetes
And in source you have to reference
your created FeatureFlag CR (i.e., strimzi-feature-gates-flags
). But again this is just for FlagD
and in case of other provider you would need to configure it differently (and without use of Operator cause their operator for now supports just FlagD as feature flagging server).
Signed-off-by: see-quick <maros.orsak159@gmail.com>
This proposal introduces the potential integration of the Strimzi Kafka operator with the feature flagging system
OpenFeature
FlagD. Motivation comes from [1].[1] - strimzi/strimzi-kafka-operator#7520