Connection Timeout Issue #6637
-
Hi all, any suggestions, please?
-
We are facing the same issue with Fluent Bit 2.0.5:
[2023/01/12 06:08:36] [error] [upstream] connection #174 to tcp://x.x.x.x:24224 timed out after 10 seconds (connection timeout)
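To separate a plain reachability problem from Fluent Bit itself getting stuck, a manual TCP check against the same upstream can help. The following is only a minimal sketch in Python, not anything Fluent Bit ships: the function name `check_tcp` is hypothetical, and the host, port, and 10-second timeout are placeholders taken from the error message above.

```python
import socket
import time
from typing import Tuple


def check_tcp(host: str, port: int, timeout: float = 10.0) -> Tuple[bool, float]:
    """Attempt one TCP connect; return (succeeded, elapsed_seconds)."""
    start = time.monotonic()
    try:
        # create_connection raises OSError (timeouts included) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        return False, time.monotonic() - start


if __name__ == "__main__":
    # Placeholder address: substitute the upstream from the error message.
    ok, elapsed = check_tcp("127.0.0.1", 24224, timeout=10.0)
    print(f"connected={ok} elapsed={elapsed:.2f}s")
```

If this connects quickly from the same node while Fluent Bit keeps logging timeouts, that would point away from basic network reachability, though it does not prove where the fault lies.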
-
We are still facing the same issue after upgrading to v2.0.8.
-
We are seeing a large number of connection timeouts, with no recovery, on the fluent-bit 1.9.10 build.
If we run a check manually, it works.
It looks like it is stuck: records are coming in, but nothing is being delivered to the Splunk output...
After deleting the pod that has these issues, it seems to work again (same fluent-bit config), so there must be an issue with fluent-bit getting stuck.
We thought these were resolved via this issue: #4505, but it seems that there are still problems overall. Can we confirm that the fix is in v2.1.3? Looking at the commits in
-
I encountered this problem again, this time in version 3.0.7, where the logs are sent to Splunk.
Does anyone have any idea what the cause could be? It looks like the connection is broken for only one Splunk instance; other Splunk instances with the same version and configuration work properly. A reset doesn't help.
-
Hi All,
I am using Fluent Bit version 2.0.5 running as a DaemonSet in Kubernetes clusters. My organisation runs around 100 Kubernetes clusters at the edge. I am having an issue with Fluent Bit forwarding logs to New Relic: in a few of the clusters, the Fluent Bit pod gets stuck on a connection timeout and no logs are delivered after that point.
When we checked, we were still receiving the Fluent Bit uptime metrics, which are scraped by Prometheus. Since the Fluent Bit image is distroless, we can't even set a liveness probe to restart the pod when we stop getting the metric.
I can understand that a connection timeout could be a network issue, but the pod staying stuck on the timeout without restarting or dropping the connection is a problem. I have also checked PR #3192, which says the connection timeout issue is fixed in v1.7.0 and onwards, but the issue still persists.
Please suggest a resolution.
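Since the uptime metric keeps arriving while delivery is stuck, one possible way to spot the condition from outside the pod is to compare an output counter between two Prometheus scrapes. The sketch below is only an illustration under assumptions: the metric name `fluentbit_output_proc_records_total` is assumed to be what the Fluent Bit Prometheus endpoint exposes in your version, the helper names are hypothetical, and fetching the scrape text is left out.

```python
from typing import Dict, List

# Assumed metric name; verify it against your own scrape output.
METRIC = "fluentbit_output_proc_records_total"


def parse_counters(text: str, metric: str = METRIC) -> Dict[str, float]:
    """Map each 'metric{labels}' series in Prometheus text to its value."""
    counters: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        name = line.split("{", 1)[0].split(" ", 1)[0]
        if name != metric:
            continue
        if "}" in line:
            series, _, rest = line.rpartition("}")
            series += "}"
        else:
            series, _, rest = line.partition(" ")
        fields = rest.split()  # value, then an optional timestamp
        if fields:
            counters[series] = float(fields[0])
    return counters


def stalled_outputs(before: str, after: str) -> List[str]:
    """Return series whose counter did not change between two scrapes."""
    old = parse_counters(before)
    new = parse_counters(after)
    return sorted(s for s, v in new.items() if s in old and v == old[s])
```

A series is reported only when its value is exactly unchanged, so a counter reset after a pod restart is not treated as a stall. An output that is legitimately idle would also be reported, so the scrape interval would need to be long enough for records to be expected.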