-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for active timeout #37834
Conversation
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
|
Pinging @elastic/security-external-integrations (Team:Security-External Integrations) |
❕ Build Aborted
Expand to view the summary
Build stats
Test stats 🧪
Test errorsExpand to view the tests failures
|
💔 Build Failed
Expand to view the summary
Build stats
Pipeline error
❕ Flaky test reportNo test was executed to be analysed. 🤖 GitHub commentsExpand to view the GitHub comments
To re-run your PR in the CI, just comment with:
|
@elastic/sec-linux-platform could I get some reviews on this PR. TIA. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few nitpicks and a potential unit test addition.
Overall, it seems fine to me, not familiar with the code though 🤷
Co-authored-by: Nicholas Berlin <56366649+nicholasberlin@users.noreply.github.com>
// The Flow was terminated because it was considered to be idle. | ||
IdleTimeout | ||
// The Flow was terminated for reporting purposes while it was still active. | ||
ActiveTimeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we implement the logic to add a reason to events when the end of flow is detected (i.e. TCP FIN)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a reason, but not a 100% sure if this is what you had in mind. Please take a look and let me know. Thanks
@@ -213,8 +264,12 @@ func makeWorker(processor *flowsProcessor, tick time.Duration, timeout, period i | |||
if nPeriod <= 0 { | |||
nPeriod = period | |||
} | |||
handleActiveTimeout := enableActiveFlowTimeout && nActiveTimeout == 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is nActiveTimeout
and why might it be 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tick will be gcd of period, timeout and activetimeout
so if period = 10, timeout = 30 and activetimeout = 60
tick = 10
Initial nPeriod = 1
Initial ntimeout = 3
Initial nActiveTimeout = 6
At each tick we decrement these values and then when any of these values reach 0 we either report flow, check for timeout or check for active timeout.
Now if say activetimeout is not set, or period is set to -1
then tick = timeout
nperiod = -1
nActivetimeout = -1
we will never decrement nperiod or nActivetimeout during the ticks, they will never reach 0 so we will never check for period or active timeout.
I added unit tests around this. Maybe that helps a little bit
packetbeat/flows/worker.go
Outdated
} | ||
} | ||
} | ||
|
||
fw.spool.flush() | ||
} | ||
|
||
func (fw *flowsProcessor) report(w *worker, ts time.Time, flow *biFlow, isOver bool, intNames, uintNames, floatNames []string) { | ||
event := createEvent(fw.watcher, ts, flow, isOver, intNames, uintNames, floatNames) | ||
func shouldEndFlow(flow *biFlow, fw *flowsProcessor, ts time.Time, activeFlowTimeout bool) (flowEndReason, bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Im a little confused here - what if there are two different timeouts; one for inactive and one for active. how do we account for those two different values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is handled at a different place .That is handled in makeWorker function based on timeout, period, activeTimeout
. However these is a bug I need to change the if condition
if checkTimeout {
to
if checkTimeout || handleActiveTimeout {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ended up actually changing the check altogether
💔 Build Failed
Failed CI StepsHistory
|
💚 Build Succeeded
History
|
💚 Build Succeeded
History
|
💚 Build Succeeded
History
|
💚 Build Succeeded
History
|
💚 Build Succeeded
History
|
One issue we discussed about this change is that because Packetbeat sends bidirectional flow records the accuracy of source and destination are more important compared to a unidirectional flow exporter. If the active flow timeout purges the flow table entry then when the next packet is observed the source and destination might flip because we lost the state about which side initiated the flow. Packetbeat should keep this state. IIUC the goal with adding the active flow timeout is to generate mid-flow events whose packet/byte counts are not cumulative over the life of the flow but are only accumulated since the previous flow event. My initial though was to reset the packet/byte counters but keep the entry in the flow table. But that would change the behavior the the |
@andrewkroh thanks. I think the delta in the intermediate would be helpful for us; however, what would the final record look like? If that is cumalitive, then it would throw off our calculation. Unless we had a very very long idle timeout? Would that help? |
We should probably make the final event also use delta reporting if we create a new delta option. Does that make sense? |
yes, perfect |
closing this in favour of #38223 |
Proposed commit message
Add support for Active timeout. This would kill the flow when timeout is hit irrespective when the last traffic was seen on the flow.
Checklist
CHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.Related issues