-
Notifications
You must be signed in to change notification settings - Fork 319
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
target/riscv: fix halt reason for targets that do not support hit bit on triggers #1056
Conversation
Related to #1029 |
@JanMatCodasip , @MarekVCodasip could you take a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch.
It looks good. I just suggest some further changes/improvements.
/* FIXME: this function assumes that we have only one trigger that can | ||
* have hit bit set. Debug spec allows hit bit to bit set if a trigger has | ||
* matched but did not fire. Such targets will receive erroneous results. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In these cases. I think we should clear all hit bits. Since we cannot know which trigger triggered the target halt first and is target dependent. Can you please add it to the FIXME? Or if you really like, you can implement it right away. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For what it's worth. hit bits are cleared by OpenOCD already. I've checked this visually. by inspecting the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see that the trigger hit bit is cleared on line 1615: https://github.com/riscv-collab/riscv-openocd/pull/1056/files#diff-b4aa16f9e42cb8f0934baa7c8e0ec9c70a369bef98b99b26ae2e896c8aa95d6aR1615
However, the loop is exited when the first hit is found, so subsequent hit bits will not be cleared.
We believe the algorithm should continue clearing the "hit" bits of all triggers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see that the trigger hit bit is cleared on line 1615
Actually, I was talking about a different case. When we resume the target, we clear all hit bits anyway, since we temporally disable triggers when single-step is performed. If needed I can provide the exact lines. So technically we are fine for now.
We believe the algorithm should continue clearing the "hit" bits of all triggers.
That being said, I agree that it would not hurt to continue the loop in this case. I'm just not sure if this should be part of a subsequent or this commit - will take a look once again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for explanation. I missed the fact that the temporary disabling of watchpoints on resume or step clears the "hit" bits as a by-product.
I would still prefer the clearing of all hit bits to occur in riscv_trigger_detect_hit_bits()
, because:
- It is very unclear to the code reader that the clearing of "hit" occurs at a different place.
- We could make debug prints when more watchpoints have been hit - for easier troubleshooting.
- The disabling of triggers before step/resume is somewhat questionable because the debugger (GDB, LLDB) is attempting the same operation, so we are often duplicate in OpenOCD what the debugger is already attempting. For that reason, the disabling of triggers on step/resume may one day change - in which case we would lose the "hit" bit clearing (perhaps without even noticing).
In any case, this can be done in a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that clearing of hit bits likely should happen in a dedicated function.
I like that proposal. Agreed.
JanMatCodasip: The disabling of triggers before step/resume is somewhat questionable (...)
aap-sc: It's not that questionable, actually. It is a matter of usability/convenience. (...)
I agree that for convenience of TCL users, clearing of the breakpoints is strongly preferred. Side question: Would you know what other targets in OpenOCD do on TCL resume
or step
commands? Do they disable the breakpoints for the user or not? I wonder if the targets behave consistently in this regard. I haven't researched this yet.
What I was concerned (and considered questionable) is that riscv_openocd_step_impl() conditionally disables the current breakpoint (based on handle_breakpoints flag) but unconditionally disables all triggers - that doesn't seem consistent, and looks suspicious at the very least. For that reason, future changes in this part of the code are likely. And so we should not rely on disable_triggers() to clear the hit bits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(No need to block the merge request due to this open thread with broader discussion.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you know what other targets in OpenOCD do on TCL resume or step commands? Do they disable the breakpoints for the user or not?
AFAIK most of the targets do that. Last time I checked (at the time I've created #741). These were: arm, aarch64, mips, stm - all the major ones.
I don't see that other targets do any special processing with respect to trigger-like objects (maybe they don't need it? more on that below). Some targets have a code that handles masking of ISR, though.
You can search it like this:
git grep handle_breakpoints | grep if | awk -F: '{ print $1 }' | sort | uniq
What I was concerned (and considered questionable) is that riscv_openocd_step_impl() conditionally disables the current breakpoint (based on handle_breakpoints flag) but unconditionally disables all triggers - that doesn't seem consistent, and looks suspicious at the very least.
The reason is quite simple, I guess. For triggers you can't figure out which instruction is affected . I guess this is the reason. Whether or not other target require such behavior for trigger-like entities is a separate question, since it heavily depends on how single-step is implemented by hw/architecture (like target architecture can explicitly state that watch points are not fired during hw-assisted single step if a certain bit is set). It should be noted that the code that disabled triggers for RISC-V openocd was there for a while and I always took it for granted.
To summarize:
- I think we can conclude that the removal of SW breakpoint at the current address during step/resume is reasonable. Other targets do that too.
- As for other types of triggers... Now that you've mentioned this I kind of share the concern. I think we should at least investigate the behavior for other targets. And I think it would not hurt to ask in OpenOCD mailing list regarding the expected semantics of 'step/resume' for the case of watchpoints. I'll create a separate OS issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the reply and I agree with your conclusions/summary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ticket for further investigations: #1079
eff47de
to
02597d8
Compare
@MarekVCodasip could you take a look once again? |
/* FIXME: this function assumes that we have only one trigger that can | ||
* have hit bit set. Debug spec allows hit bit to bit set if a trigger has | ||
* matched but did not fire. Such targets will receive erroneous results. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see that the trigger hit bit is cleared on line 1615: https://github.com/riscv-collab/riscv-openocd/pull/1056/files#diff-b4aa16f9e42cb8f0934baa7c8e0ec9c70a369bef98b99b26ae2e896c8aa95d6aR1615
However, the loop is exited when the first hit is found, so subsequent hit bits will not be cleared.
We believe the algorithm should continue clearing the "hit" bits of all triggers.
@JanMatCodasip for whatever reason my replies are not displayed properly, so I'll re-post them here. Regarding to your latest comments:
Actually, I was talking about a different case. When we resume the target, we clear all hit bits anyway, since we temporally disable triggers when single-step is performed. If needed I can provide the exact lines. So technically we are fine for now.
That being said, I agree that it would not hurt to continue the loop in this case. I'm just not sure if this should be part of a subsequent or this commit - will take a look once again.
Well, yeah. Probably there are cases when it is useful. Probably, my opinion is way too strong :).
Yes, there are cases when it is terrible (see below). So I want to have an option to avoid these memory reads to reduce invasiveness. For example if we debug ONLY with OpenOCD without involvement of GDB we don't really need this functionality.
The thing is that it may not not work reliably in all cases. For example: I see design that is capable to hang if there is something wrong with the system bus. For example, the underlying device does not send a response after memory reference. We must not forget that people rarely use debugging with OpenOCD when everything is fine. One uses a debugger when something is (terribly) wrong. Reading memory may be considered way to intrusive for some scenarios. |
… on triggers Before this patch the following behavior is observed on targets that do not support hit bit: ``` bp 0x80000004 4 hw resume 0x80000000 riscv.cpu halted due to watchpoint ``` This happens because the current implementation relies on the presence of hit bit way too much. While working on this patch few defects in hit bit-based trigger detection were discovered, added appropriate TODOs.
02597d8
to
2c00a08
Compare
@MarekVCodasip there is a pending change request from your side... @JanMatCodasip any chance you have capability to double-check if we should wait for Marek's reply or we can land the patch as is? |
@aap-sc I have just checked with Marek and he does not have any further comments. Please feel free to go on with the merge. Thanks. |
Before this patch the following behavior is observed on targets that do not support hit bit:
This happens because the current implementation relies on the presence of hit bit way too much. While working on this patch few defects in hit bit-based trigger detection were discovered, added appropriate TODOs.