Feedback from the community meeting:
Introduction
I have created a sharded output plugin for fluent-bit. The initial goal was to load balance output among multiple Kafka outputs. Instead of creating a new Kafka output or making something Kafka-specific, I decided to implement this in a more generic way that allows any output to be sharded. This requires a very small change to the fluent-bit core; everything else can be implemented as a plugin.
I would appreciate feedback on the design and implementation of this, with the goal of merging at least the small core change into fluent-bit itself. I would like to merge the sharded output as well eventually, after finishing the features we want from it, but it could also be maintained in a fork if the functionality is not wanted upstream.
Configuration
To shard output among different outputs, first you define your output shards as normal outputs:
Then you define a sharded output:
[OUTPUT]
    name sharded
    match *
    prefix shard.
The aliases for the shards must start with the prefix as configured for the sharded output. The shards should not match anything the sharded output also matches, as this will cause double output. In the example above, the sharded output matches everything, so the shards should match nothing.
How it works
At startup (in cb_pre_run), the sharded output plugin iterates over all defined outputs, and if the alias matches its prefix, it adds the output to its internal list of shards. At the end of this, it sets the "current" shard to the first shard in the list.
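The startup scan described above can be sketched roughly as follows. This is a minimal, standalone illustration and not the plugin's actual code: `struct output`, `struct sharded_ctx`, `collect_shards` and `MAX_SHARDS` are hypothetical stand-ins for the real fluent-bit structures, and only the alias prefix check is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SHARDS 16   /* hypothetical fixed capacity, for the sketch only */

/* Hypothetical stand-in for an output instance; only the alias matters here. */
struct output {
    const char *alias;  /* may be NULL when no alias is configured */
};

/* Hypothetical plugin context: collected shards plus the selection cursor. */
struct sharded_ctx {
    const char *prefix;
    struct output *shards[MAX_SHARDS];
    size_t shard_count;
    size_t current;
};

/* Collect every output whose alias starts with ctx->prefix.
 * Returns the number of shards found. */
static size_t collect_shards(struct sharded_ctx *ctx,
                             struct output *outputs, size_t n_outputs)
{
    size_t prefix_len;
    size_t i;

    assert(ctx != NULL && ctx->prefix != NULL);
    prefix_len = strlen(ctx->prefix);
    ctx->shard_count = 0;

    for (i = 0; i < n_outputs && ctx->shard_count < MAX_SHARDS; i++) {
        const char *alias = outputs[i].alias;

        if (alias != NULL && strncmp(alias, ctx->prefix, prefix_len) == 0) {
            ctx->shards[ctx->shard_count++] = &outputs[i];
        }
    }

    /* The "current" shard starts out as the first one in the list. */
    ctx->current = 0;
    return ctx->shard_count;
}
```

With the prefix `shard.`, aliases such as `shard.1` and `shard.2` are collected, while an output without an alias, or with the alias `shard` (no trailing dot), is skipped.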
When the fluent-bit state machine creates a new task, it finds matching outputs. I made a small change to the state machine: if the output selector selects a plugin marked with the FLB_PLUGIN_OUTPUT_INDIRECT flag, it calls that output's cb_get_output function to get the real output. I did not name this flag FLB_PLUGIN_OUTPUT_SHARDED because, in theory, other plugins might also want to determine outputs programmatically. The sharded output returns its currently selected shard and then chooses a new one. Currently that choice is a simple round robin, but see below.
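As a rough illustration of the indirection and the round-robin choice, here is a standalone sketch. It does not reproduce the real fluent-bit types: `PLUGIN_OUTPUT_INDIRECT` is a made-up value standing in for FLB_PLUGIN_OUTPUT_INDIRECT, and `struct output`, `get_output_fn`, `sharded_get_output` and `resolve_output` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag value; the real value of FLB_PLUGIN_OUTPUT_INDIRECT
 * is not stated in this post. */
#define PLUGIN_OUTPUT_INDIRECT 0x1

struct output;

/* Hypothetical callback type modelled on the cb_get_output described above. */
typedef struct output *(*get_output_fn)(struct output *self);

struct output {
    const char *name;
    int flags;
    get_output_fn cb_get_output;  /* only set on indirect outputs */
    struct output **shards;       /* only used by the sharded output */
    size_t shard_count;
    size_t current;
};

/* Return the currently selected shard, then advance round robin. */
static struct output *sharded_get_output(struct output *self)
{
    struct output *selected;

    assert(self->shard_count > 0);
    selected = self->shards[self->current];
    self->current = (self->current + 1) % self->shard_count;
    return selected;
}

/* What task creation might do once the output selector has picked `matched`:
 * indirect outputs are asked for the real output, others are used directly. */
static struct output *resolve_output(struct output *matched)
{
    if ((matched->flags & PLUGIN_OUTPUT_INDIRECT) && matched->cb_get_output) {
        return matched->cb_get_output(matched);
    }
    return matched;
}
```

With two shards, successive calls to `resolve_output` on the sharded output yield the first shard, the second, then the first again, while an ordinary output resolves to itself.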
The rest of the fluent-bit state machine does not notice the switcheroo that just took place, and the chunk of output gets flushed to the selected shard as normal.
Future shard selection algorithm improvements
The final algorithm I want to implement will have the following attributes:
Feedback mechanism
For that last point, the plugin will need to be able to find out if specific output instances are healthy. In the case of an output using the kafka plugin, it could check that plugin's blocked flag, but there doesn't seem to be a more generic way to get such information. The flb_output_instance struct does have a flush_list attribute that could be used to detect backpressure, but that does not work in multi-threaded mode. Any advice on how to tackle this would be welcome.
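One possible shape for a health-aware selection, assuming some per-shard health probe exists, is sketched below. Everything here is hypothetical: `struct shard`, its `blocked` field, `health_fn`, `not_blocked` and `pick_healthy` are illustration-only names, and the sketch ignores the multi-threaded case, where reading another instance's state would need proper synchronization.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shard record; `blocked` stands in for whatever health
 * signal an output could expose. */
struct shard {
    const char *alias;
    bool blocked;
};

/* Hypothetical health probe, so the signal can differ per plugin type. */
typedef bool (*health_fn)(const struct shard *s);

static bool not_blocked(const struct shard *s)
{
    return !s->blocked;
}

/* Round robin starting at *cursor, skipping shards the probe reports as
 * unhealthy. If every shard is unhealthy, fall back to plain round robin
 * so that a shard is always returned. */
static struct shard *pick_healthy(struct shard *shards, size_t count,
                                  size_t *cursor, health_fn is_healthy)
{
    size_t i;
    size_t idx;

    assert(count > 0 && *cursor < count);

    for (i = 0; i < count; i++) {
        idx = (*cursor + i) % count;
        if (is_healthy(&shards[idx])) {
            *cursor = (idx + 1) % count;
            return &shards[idx];
        }
    }

    idx = *cursor;
    *cursor = (idx + 1) % count;
    return &shards[idx];
}
```

Passing the probe as a function pointer keeps the selection loop generic: a Kafka shard could report its blocked state, while other output types could supply a different check or always report healthy.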