Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

target/riscv: fix halt reason for targets that do not support hit bit on triggers #1056

Merged
merged 1 commit into from
Jun 4, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
101 changes: 88 additions & 13 deletions src/target/riscv/riscv.c
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,8 @@

#define DBUS 0x11

#define RISCV_TRIGGER_HIT_NOT_FOUND ((int64_t)-1)

static uint8_t ir_dtmcontrol[4] = {DTMCONTROL};
struct scan_field select_dtmcontrol = {
.in_value = NULL,
Expand Down Expand Up @@ -1550,17 +1552,24 @@ int riscv_remove_watchpoint(struct target *target,
/**
* Look at the trigger hit bits to find out which trigger is the reason we're
* halted. Sets *unique_id to the unique ID of that trigger. If *unique_id is
* ~0, no match was found.
* RISCV_TRIGGER_HIT_NOT_FOUND, no match was found.
*/
static int riscv_hit_trigger_hit_bit(struct target *target, uint32_t *unique_id)

static int riscv_trigger_detect_hit_bits(struct target *target, int64_t *unique_id)
{
/* FIXME: this function assumes that we have only one trigger that can
* have hit bit set. Debug spec allows hit bit to bit set if a trigger has
* matched but did not fire. Such targets will receive erroneous results.
*/
Comment on lines +1560 to +1563
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In these cases. I think we should clear all hit bits. Since we cannot know which trigger triggered the target halt first and is target dependent. Can you please add it to the FIXME? Or if you really like, you can implement it right away. Thanks.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what it's worth. hit bits are cleared by OpenOCD already. I've checked this visually. by inspecting the code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see that the trigger hit bit is cleared on line 1615: https://github.com/riscv-collab/riscv-openocd/pull/1056/files#diff-b4aa16f9e42cb8f0934baa7c8e0ec9c70a369bef98b99b26ae2e896c8aa95d6aR1615

However, the loop is exited when the first hit is found, so subsequent hit bits will not be cleared.

We believe the algorithm should continue clearing the "hit" bits of all triggers.

Copy link
Collaborator Author

@aap-sc aap-sc May 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see that the trigger hit bit is cleared on line 1615

Actually, I was talking about a different case. When we resume the target, we clear all hit bits anyway, since we temporally disable triggers when single-step is performed. If needed I can provide the exact lines. So technically we are fine for now.

We believe the algorithm should continue clearing the "hit" bits of all triggers.

That being said, I agree that it would not hurt to continue the loop in this case. I'm just not sure if this should be part of a subsequent or this commit - will take a look once again.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for explanation. I missed the fact that the temporary disabling of watchpoints on resume or step clears the "hit" bits as a by-product.

I would still prefer the clearing of all hit bits to occur in riscv_trigger_detect_hit_bits(), because:

  • It is very unclear to the code reader that the clearing of "hit" occurs at a different place.
  • We could make debug prints when more watchpoints have been hit - for easier troubleshooting.
  • The disabling of triggers before step/resume is somewhat questionable because the debugger (GDB, LLDB) is attempting the same operation, so we are often duplicate in OpenOCD what the debugger is already attempting. For that reason, the disabling of triggers on step/resume may one day change - in which case we would lose the "hit" bit clearing (perhaps without even noticing).

In any case, this can be done in a separate PR.

Copy link
Collaborator

@JanMatCodasip JanMatCodasip Jun 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that clearing of hit bits likely should happen in a dedicated function.

I like that proposal. Agreed.

JanMatCodasip: The disabling of triggers before step/resume is somewhat questionable (...)

aap-sc: It's not that questionable, actually. It is a matter of usability/convenience. (...)

I agree that for convenience of TCL users, clearing of the breakpoints is strongly preferred. Side question: Would you know what other targets in OpenOCD do on TCL resume or step commands? Do they disable the breakpoints for the user or not? I wonder if the targets behave consistently in this regard. I haven't researched this yet.

What I was concerned (and considered questionable) is that riscv_openocd_step_impl() conditionally disables the current breakpoint (based on handle_breakpoints flag) but unconditionally disables all triggers - that doesn't seem consistent, and looks suspicious at the very least. For that reason, future changes in this part of the code are likely. And so we should not rely on disable_triggers() to clear the hit bits.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(No need to block the merge request due to this open thread with broader discussion.)

Copy link
Collaborator Author

@aap-sc aap-sc Jun 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you know what other targets in OpenOCD do on TCL resume or step commands? Do they disable the breakpoints for the user or not?

AFAIK most of the targets do that. Last time I checked (at the time I've created #741). These were: arm, aarch64, mips, stm - all the major ones.

I don't see that other targets do any special processing with respect to trigger-like objects (maybe they don't need it? more on that below). Some targets have a code that handles masking of ISR, though.

You can search it like this:

git grep handle_breakpoints | grep if | awk -F: '{ print $1 }' | sort | uniq

What I was concerned (and considered questionable) is that riscv_openocd_step_impl() conditionally disables the current breakpoint (based on handle_breakpoints flag) but unconditionally disables all triggers - that doesn't seem consistent, and looks suspicious at the very least.

The reason is quite simple, I guess. For triggers you can't figure out which instruction is affected . I guess this is the reason. Whether or not other target require such behavior for trigger-like entities is a separate question, since it heavily depends on how single-step is implemented by hw/architecture (like target architecture can explicitly state that watch points are not fired during hw-assisted single step if a certain bit is set). It should be noted that the code that disabled triggers for RISC-V openocd was there for a while and I always took it for granted.

To summarize:

  1. I think we can conclude that the removal of SW breakpoint at the current address during step/resume is reasonable. Other targets do that too.
  2. As for other types of triggers... Now that you've mentioned this I kind of share the concern. I think we should at least investigate the behavior for other targets. And I think it would not hurt to ask in OpenOCD mailing list regarding the expected semantics of 'step/resume' for the case of watchpoints. I'll create a separate OS issue.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the reply and I agree with your conclusions/summary.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ticket for further investigations: #1079


// FIXME: Add hit bits support detection and caching
RISCV_INFO(r);

riscv_reg_t tselect;
if (riscv_get_register(target, &tselect, GDB_REGNO_TSELECT) != ERROR_OK)
return ERROR_FAIL;

*unique_id = ~0;
*unique_id = RISCV_TRIGGER_HIT_NOT_FOUND;
for (unsigned int i = 0; i < r->trigger_count; i++) {
if (r->trigger_unique_id[i] == -1)
continue;
Expand Down Expand Up @@ -1594,15 +1603,15 @@ static int riscv_hit_trigger_hit_bit(struct target *target, uint32_t *unique_id)
hit_mask = CSR_ETRIGGER_HIT(riscv_xlen(target));
break;
default:
LOG_TARGET_DEBUG(target, "Trigger %d has unknown type %d", i, type);
LOG_TARGET_DEBUG(target, "Trigger %u has unknown type %d", i, type);
continue;
}

/* Note: If we ever use chained triggers, then this logic needs
* to be changed to ignore triggers that are not the last one in
* the chain. */
/* FIXME: this logic needs to be changed to ignore triggers that are not
* the last one in the chain. */
if (tdata1 & hit_mask) {
LOG_TARGET_DEBUG(target, "Trigger %d (unique_id=%d) has hit bit set.", i, r->trigger_unique_id[i]);
LOG_TARGET_DEBUG(target, "Trigger %u (unique_id=%" PRIi64 ") has hit bit set.",
i, r->trigger_unique_id[i]);
if (riscv_set_register(target, GDB_REGNO_TDATA1, tdata1 & ~hit_mask) != ERROR_OK)
return ERROR_FAIL;

Expand Down Expand Up @@ -2268,6 +2277,33 @@ int riscv_flush_registers(struct target *target)
return ERROR_OK;
}

static enum target_debug_reason
derive_debug_reason_without_hitbit(const struct target *target, riscv_reg_t dpc)
{
/* TODO: if we detect that etrigger/itrigger/icount is set, we should
* just report DBG_REASON_UNKNOWN, since we can't disctiguish these
* triggers from BP/WP or from other triggers of such type. However,
* currently this renders existing testsuite as failing. We need to
* fix the testsuite first
*/
// TODO: the code below does not handle context-aware trigger types
for (const struct breakpoint *bp = target->breakpoints; bp; bp = bp->next) {
// TODO: investigate if we need to handle bp length
if (bp->type == BKPT_HARD && bp->is_set && bp->address == dpc) {
// FIXME: bp->linked_brp is uninitialized
if (bp->asid) {
LOG_TARGET_ERROR(target,
"can't derive debug reason for context-aware breakpoint: "
"unique_id = %" PRIu32 ", address = %" TARGET_PRIxADDR
", asid = %" PRIx32 ", linked = %d",
bp->unique_id, bp->address, bp->asid, bp->linked_brp);
return DBG_REASON_UNDEFINED;
}
return DBG_REASON_BREAKPOINT;
}
}
return DBG_REASON_WATCHPOINT;
}
JanMatCodasip marked this conversation as resolved.
Show resolved Hide resolved
/**
* Set OpenOCD's generic debug reason from the RISC-V halt reason.
*/
Expand All @@ -2280,13 +2316,52 @@ static int set_debug_reason(struct target *target, enum riscv_halt_reason halt_r
target->debug_reason = DBG_REASON_BREAKPOINT;
break;
case RISCV_HALT_TRIGGER:
if (riscv_hit_trigger_hit_bit(target, &r->trigger_hit) != ERROR_OK)
target->debug_reason = DBG_REASON_UNDEFINED;
if (riscv_trigger_detect_hit_bits(target, &r->trigger_hit) != ERROR_OK)
return ERROR_FAIL;
target->debug_reason = DBG_REASON_WATCHPOINT;
/* Check if we hit a hardware breakpoint. */
for (struct breakpoint *bp = target->breakpoints; bp; bp = bp->next) {
if (bp->unique_id == r->trigger_hit)
// FIXME: handle multiple hit bits
if (r->trigger_hit != RISCV_TRIGGER_HIT_NOT_FOUND) {
/* We scan for breakpoints first. If no breakpoints are found we still
JanMatCodasip marked this conversation as resolved.
Show resolved Hide resolved
* assume that debug reason is DBG_REASON_BREAKPOINT, unless
* there is a watchpoint match - This is to take
* ETrigger/ITrigger/ICount into account
*/
LOG_TARGET_DEBUG(target,
"Active hit bit is detected, trying to find trigger owner.");
for (struct breakpoint *bp = target->breakpoints; bp; bp = bp->next) {
if (bp->unique_id == r->trigger_hit) {
target->debug_reason = DBG_REASON_BREAKPOINT;
LOG_TARGET_DEBUG(target,
"Breakpoint with unique_id = %" PRIu32 " owns the trigger.",
bp->unique_id);
}
}
if (target->debug_reason == DBG_REASON_UNDEFINED) {
// by default we report all triggers as breakpoints
target->debug_reason = DBG_REASON_BREAKPOINT;
for (struct watchpoint *wp = target->watchpoints; wp; wp = wp->next) {
if (wp->unique_id == r->trigger_hit) {
target->debug_reason = DBG_REASON_WATCHPOINT;
LOG_TARGET_DEBUG(target,
"Watchpoint with unique_id = %" PRIu32 " owns the trigger.",
wp->unique_id);
}
}
}
} else {
LOG_TARGET_DEBUG(target,
"No trigger hit found, deriving debug reason without it.");
riscv_reg_t dpc;
JanMatCodasip marked this conversation as resolved.
Show resolved Hide resolved
if (riscv_get_register(target, &dpc, GDB_REGNO_DPC) != ERROR_OK)
return ERROR_FAIL;
/* Here we don't have the hit bit set (likely, HW does not support it).
* We are trying to guess the state. But here comes the problem:
* if we have etrigger/itrigger/icount raised - we can't really
* distinguish it from the breakpoint or watchpoint. There is not
* much we can do here, except for checking current PC against pending
* breakpoints and hope for the best)
*/
target->debug_reason = derive_debug_reason_without_hitbit(target, dpc);
}
break;
case RISCV_HALT_INTERRUPT:
Expand Down
4 changes: 2 additions & 2 deletions src/target/riscv/riscv.h
Original file line number Diff line number Diff line change
Expand Up @@ -159,11 +159,11 @@ struct riscv_info {
* >= 0: unique_id of the breakpoint/watchpoint that is using it.
* Note that in RTOS mode the triggers are the same across all harts the
* target controls, while otherwise only a single hart is controlled. */
int trigger_unique_id[RISCV_MAX_HWBPS];
int64_t trigger_unique_id[RISCV_MAX_HWBPS];

/* The unique id of the trigger that caused the most recent halt. If the
* most recent halt was not caused by a trigger, then this is -1. */
uint32_t trigger_hit;
int64_t trigger_hit;

/* The number of entries in the program buffer. */
int progbuf_size;
Expand Down