Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MLT occasionally sends additional TriggerDecisions to the DFO after it completes the 'disable_triggers' transition #250

Closed
bieryAtFnal opened this issue Nov 24, 2023 · 5 comments
Labels
miscellaneous deliverable A change that is/will be part of a release but is not substantial enough to be a daq-deliverable

Comments

@bieryAtFnal
Copy link
Contributor

bieryAtFnal commented Nov 24, 2023

This should not happen. The meaning of the 'disable_triggers' command/transition is to stop sending TriggerDecisions.

I suspect that this is caused by the MLT sending out 'pending' TDs at 'stop' time, and my sense is that it should instead do that at 'disable_triggers' time. But, I'll leave the full diagnosis and fix to the experts.

Here are TRACE messages that provide some insight into what is happening...

Below, in reverse time order, are TRACE messages that show the MLT pausing triggers (lines 13552-13554); the DFO completing the “stop” transition (line 13549); the MLT flushing pending TDs, sending a TD to the DFO, and completing the “stop” transition (lines 13451-13456), and the DFO complaining about a TD with the wrong run number at the start of the next run (line 13412).

Recall that run control commands are broadcast to all DAQ applications without any deliberate ordering. So, it is not surprising that the DFO could complete “stop” before the MLT does. That is exactly why the “disable_triggers” command/transition was added (to ensure that the flow of TriggerDecisions are stopped before the DFO is told to stop the run).

13412 11-23 20:17:46.058799 672746 673582 78 DataFlowOrchestrator ERR receive_trigger_decision: DataFlowOrchestrator encountered run number mismatch: recvd (101) != 102 from MLT for trigger_number 462
13418 11-23 20:17:46.058168 672746 673061 120 DataFlowOrchestrator LOG add_callback_impl: Registering callback.
13419 11-23 20:17:46.058004 672746 673061 120 DataFlowOrchestrator LOG add_callback_impl: Registering callback.
13451 11-23 20:17:43.217897 672967 673077 32 ModuleLevelTrigger NFO do_stop: End of run 101
13453 11-23 20:17:43.217178 672967 673077 32 ModuleLevelTrigger LOG do_stop: LivetimeCounter - total deadtime+paused: 2396
13454 11-23 20:17:43.216914 672967 673335 44 ModuleLevelTrigger LOG send_trigger_decisions: Run 101: Received 494 TCs. Sent 462 TDs consisting of 462 TCs. 29 TDs (29 TCs) were created during pause, and 0 TDs (0 TCs) were inhibited. 0 TDs (0 TCs) were dropped.
13455 11-23 20:17:43.211346 672967 673077 32 ModuleLevelTrigger LOG call_tc_decision: Sending a decision with triggernumber 462 timestamp 79554163409484103 number of links 20 based on TC of type 5
13456 11-23 20:17:43.211339 672967 673077 32 ModuleLevelTrigger D03 call_tc_decision: Override?: 1
13457 11-23 20:17:43.211326 672967 673077 32 ModuleLevelTrigger D03 create_decision: HSI passthrough: 0, TC detid: 15, TC type: 5, TC cont number: 1, DECISION trigger type: 1, DECISION timestamp: 79554163409484103, request window begin: 79554163409484103, request window end: 79554163409484135
13458 11-23 20:17:43.211286 672967 673077 32 ModuleLevelTrigger D03 create_decision: earliest TC index: 0
13459 11-23 20:17:43.210549 672967 673077 32 ModuleLevelTrigger D03 flush_td_vectors: Flushing TDs. Size: 1
13516 11-23 20:17:42.295068 672967 673335 44 ModuleLevelTrigger D03 call_tc_decision: Override?: 0
13517 11-23 20:17:42.295065 672967 673335 44 ModuleLevelTrigger D03 create_decision: HSI passthrough: 0, TC detid: 15, TC type: 5, TC cont number: 1, DECISION trigger type: 1, DECISION timestamp: 79554163405562183, request window begin: 79554163405562183, request window end: 79554163405562215
13518 11-23 20:17:42.295064 672967 673335 44 ModuleLevelTrigger D03 create_decision: earliest TC index: 0
13519 11-23 20:17:42.254728 672967 673335 44 ModuleLevelTrigger D03 call_tc_decision: Override?: 0
13520 11-23 20:17:42.254723 672967 673335 44 ModuleLevelTrigger D03 create_decision: HSI passthrough: 0, TC detid: 15, TC type: 5, TC cont number: 1, DECISION trigger type: 1, DECISION timestamp: 79554163402746183, request window begin: 79554163402746183, request window end: 79554163402746215
13521 11-23 20:17:42.254705 672967 673335 44 ModuleLevelTrigger D03 create_decision: earliest TC index: 0
13537 11-23 20:17:42.204360 672967 673335 44 ModuleLevelTrigger D03 call_tc_decision: Override?: 0
13538 11-23 20:17:42.204357 672967 673335 44 ModuleLevelTrigger D03 create_decision: HSI passthrough: 0, TC detid: 15, TC type: 5, TC cont number: 1, DECISION trigger type: 1, DECISION timestamp: 79554163399930183, request window begin: 79554163399930183, request window end: 79554163399930215
13539 11-23 20:17:42.204342 672967 673335 44 ModuleLevelTrigger D03 create_decision: earliest TC index: 0
13549 11-23 20:17:42.194916 672746 673061 120 DataFlowOrchestrator LOG do_stop: dfo successfully stopped
13552 11-23 20:17:42.071710 672967 673077 99 ModuleLevelTrigger D03 do_pause: TS End: 1700792262071710
13553 11-23 20:17:42.071678 672967 673077 99 ModuleLevelTrigger NFO do_pause: Trigger is paused
13554 11-23 20:17:42.071589 672967 673077 99 ModuleLevelTrigger LOG do_pause: ******* Triggers PAUSED! in run 101 *********

@bieryAtFnal bieryAtFnal changed the title MLT occasionally sends additional TriggerDecisions to the DFO after it completes the 'pause-triggers' transition MLT occasionally sends additional TriggerDecisions to the DFO after it completes the 'disable_triggers' transition Nov 26, 2023
@bieryAtFnal
Copy link
Contributor Author

Apologies, I mistakenly called the transition in question 'pause-triggers' when I should have written 'disable_triggers'. I have edited existing text.

@bieryAtFnal bieryAtFnal added the miscellaneous deliverable A change that is/will be part of a release but is not substantial enough to be a daq-deliverable label Dec 26, 2023
@bieryAtFnal
Copy link
Contributor Author

For reference, I want to mention that Pull Request 249 was originally filed to address this Issue. However, the candidate code changes contained in that PR did not fix the problem. That PR has been closed, but I'm referencing it here because there may be comments in it that are helpful in understanding the underlying problem, should we ever need to go back and remember all of the details.

@bieryAtFnal
Copy link
Contributor Author

Another comment that I would like to add to this Issue is that we have created an automated integration test that quite reliably demonstrates the problem. This integtest makes use of the configuration suggestions that Michal posted in PR 249, and it is available on the kbiery/integtest_for_the_disable_trigger_transition branch in the trigger repo.

@MRiganSUSX
Copy link
Contributor

Fixed with above mentioned PR (#262).

@bieryAtFnal
Copy link
Contributor Author

Closing this Issue now that it has been addressed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
miscellaneous deliverable A change that is/will be part of a release but is not substantial enough to be a daq-deliverable
Projects
None yet
Development

No branches or pull requests

2 participants