Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

target/riscv: check abstractcs.busy #1023

Merged
merged 3 commits into from
Apr 14, 2024

Conversation

en-sc
Copy link
Collaborator

@en-sc en-sc commented Feb 29, 2024

According to the RISC-V Debug Spec (1.0.0-rc1)[3.7 Abstract Commands]:

While an abstract command is executing (busy in abstractcs is high), a
debugger must not change hartsel, and must not write 1 to haltreq,
resumereq, ackhavereset, setresethaltreq, or clrresethaltreq.

This patch ensures OpenOCD does not violate this rule.

The patch is separated into three commits.

  1. target/riscv: cache abstractcs.busy in dm013_info_t -- provides facilities to track the state of abstractcs.busy
  2. target/riscv: introduce examine_dm() function -- ensures the rule is followed during examination.
  3. target/riscv: check abstractcs.busy -- takes care of other cases.

Copy link
Collaborator

@MarekVCodasip MarekVCodasip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch.

As for the code, I have minor comments about naming.

As for the whole concept of this change:

  • In function examine_dm() we believe with @JanMatCodasip that the check of busy is warranted.
  • In the other functions it looks to us that assert(!ac_busy) would be more appropriate. Because the preceding command should always clean up after itself and leave abstractcs.busy == 0. Let us know what do you think and what was the rationale behind these checks to many places in the code. We don't mind keeping these as is, but still believe asserts are more appropriate in most places.

Thank you.

src/target/riscv/riscv-013.c Outdated Show resolved Hide resolved
src/target/riscv/riscv-013.c Outdated Show resolved Hide resolved
@JanMatCodasip
Copy link
Collaborator

JanMatCodasip commented Mar 27, 2024

In the other functions it looks to us that assert(!ac_busy) would be more appropriate.

Or, possibly LOG_ERROR("BUG: Unexpected abstractcs.busy ..."); could be used if asserts seem too harsh.

Still, it seems to me that OpenOCD code itself should ensure that DM is left in a clean state (including abstractcs.busy==0) when any command completes. That means should be possible to start a new command without checking if busy==0. (An exception from this rule could be malfunctioning hardware in which abstractcs.busy remains stuck at 1 for some reason, and I am afraid we cannot do much about that in OpenOCD in general.)

If there is a command that can leave the DM in an unclean state, then that place should be fixed. Safety checks at the beginning of other commands seem like a good idea, but please preferably in the form of assertions or LOG_ERROR("BUG: ..."), making it very clear to the reader that these are safety checks that guard against cases that ideally should never happen.

In any case, thank you for the time invested into the patch. For example the refactor of examine is beneficial in its own right 👍

@en-sc
Copy link
Collaborator Author

en-sc commented Mar 28, 2024

  • In the other functions it looks to us that assert(!ac_busy) would be more appropriate. Because the preceding command should always clean up after itself and leave abstractcs.busy == 0.

Currently, in general, working with abstract commands is something along the lines of:

  • Issue a command (there are quite a few ways to do this).
  • Call wait_for_idle().

However, the loop in wait_for_idle() is time-bound, so it can not be guaranteed that after the call to wait_for_idle() abstractcs.busy is clear.

So, let's consider the following OpenOCD session:

  • User writes memory -> the write takes a long time -> wait_for_idle() times out -> an ERROR_TIMEOUT_REACHED is returned.
  • However, the OpenOCD does not exit if a command returns an error.
  • OpenOCD polls targets -> hartsel is changed while abstractcs.busy is high -> the rule is broken by the debugger, not by the HW.

I see a couple of options how to fix this:

  1. Remove the time bound on the loop in wait_for_idle() -- This is bad, since there will be now way to stop OpenOCD in such loop (apart from SIGKILL).
  2. Insert a check at the start of all high-level interface functions -- I have decided against it, since there are quite a of these functions and there is no guarantee that this will be sufficient (one lower level function may set abstractcs.busy and a fallback may not check that abstractcs.busy is clear).
  3. Insert the checks directly in front of prohibited operations -- This approach was selected.

@JanMatCodasip
Copy link
Collaborator

JanMatCodasip commented Apr 9, 2024

@en-sc:

Currently, in general, working with abstract commands is something along the lines of ...

However, the loop in wait_for_idle() is time-bound, so ...

...

Thank you for the explanation. It really helped me to understand the idea behind the proposed changes.

I see a couple of options how to fix this:

1. ...

2. ...

3. Insert the checks directly in front of prohibited operations -- This approach was selected.

For completeness, I can imagine two more options:

  • Leave the current state (the user has been warned about strange behavior of the HW and can restart the session);
  • Un-examine the targets and DM when busy timeout is encountered, preventing further operations until restarted and re-examined.

But anyway, I believe your chosen solution (point 3 above) is fine and I am in favor of it. I have posted several comments with the aim to make the intention behind the changes more clear to the reader of the code. Please let me know if you are OK with the suggestions. I don't expect to have any more review comments than what I have posted so far.

Thank you.

src/target/riscv/riscv-013.c Outdated Show resolved Hide resolved
src/target/riscv/riscv-013.c Outdated Show resolved Hide resolved
Comment on lines 911 to 919
if (!dm->abtractcs_busy)
LOG_TARGET_ERROR(target, "BUG: dm->abtractcs_busy was not set in time!");
dm->abtractcs_busy = true;

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks to me that this safety check can be moved to the top of the function. I believe it is the caller's responsibility to set dm->abstractcs_busy = true, so we can move this check to the top.

Is that correct understanding? If so, I'd prefer to move this block.

Maybe the error message can also be updated to make it even clearer (if you like):

Suggested change
if (!dm->abtractcs_busy)
LOG_TARGET_ERROR(target, "BUG: dm->abtractcs_busy was not set in time!");
dm->abtractcs_busy = true;
if (!dm->abtractcs_busy)
LOG_TARGET_ERROR(target, "BUG: dm->abtractcs_busy was not set at the start of abstract command");
dm->abtractcs_busy = true;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to leave it as-is.

It is convenient to use wait_for_idle() as a default error handler in case there was a call to a subroutine which may execute an abstract command but may fail before executing the command.

I am not ready to pledge that that there are no such uses currently.

IMHO, it is desirable for wait_for_idle() to be a NOP (apart from one read of abstractcs) in case there was no abstract command execution to begin with.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, understood, thank you for explanation.

Still, could we still change the wording a bit to make it clearer what has happened? For example as shown below:

	if (!dm->abstract_cmd_maybe_busy)
		LOG_TARGET_ERROR(target, "BUG: dm->abstract_cmd_maybe_busy had not been set when starting an abstract command");
	dm->abstract_cmd_maybe_busy = true;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

src/target/riscv/riscv-013.c Outdated Show resolved Hide resolved
src/target/riscv/riscv-013.c Outdated Show resolved Hide resolved
Comment on lines 2001 to 2023
/* Note, that this does not mean the DM was just reset. */
dm->was_reset = true;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment was a bit hard to grasp to me.

Instead, I'd recommend to move the was_reset = true to line 1974 where the reset actually takes place, and the comment can be removed.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to leave it as-is.
On the line you mentioned reset of the DM is initiated.
However, we can not be sure the DM successfully came out of the reset.
It can be established only after dmcontrol.dmactive was observed in a value read from the DM, which is exactly what happens here.

Copy link
Collaborator

@JanMatCodasip JanMatCodasip Apr 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, understood.

In that case, how about updating the comment as follows, to make it clearer what the was_reset = true actually means:

Suggested change
/* Note, that this does not mean the DM was just reset. */
dm->was_reset = true;
/* The DM has been reset and has successfully come out of the reset (dmactive=1):
* - either the reset has been performed by this function (above);
* - or the reset had already happened in an earlier call of examine_dm() when
* examining a different hart.
*/
dm->was_reset = true;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed.

src/target/riscv/riscv-013.c Show resolved Hide resolved
src/target/riscv/riscv-013.c Show resolved Hide resolved
@en-sc en-sc force-pushed the en-sc/check-ac-busy branch 2 times, most recently from 9ea7085 to 8f6965b Compare April 9, 2024 20:42
@en-sc en-sc requested a review from JanMatCodasip April 9, 2024 20:46
JanMatCodasip
JanMatCodasip previously approved these changes Apr 10, 2024
Copy link
Collaborator

@JanMatCodasip JanMatCodasip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me, thank you, approved.

There are only two last threads open that deal with less important items (log prints and comments).

According to the RISC-V Debug Spec (1.0.0-rc1)[3.7 Abstract Commands]:
> While an abstract command is executing (busy in abstractcs is high), a
debugger must not change hartsel, and must not write 1 to haltreq,
resumereq, ackhavereset, setresethaltreq, or clrresethaltreq.

Tracking `abstractcs.busy` allows to enforce this rule.

Change-Id: If5975b48cf9fd379033268145c79103c36fb8134
Signed-off-by: Evgeniy Naydanov <evgeniy.naydanov@syntacore.com>
JanMatCodasip
JanMatCodasip previously approved these changes Apr 11, 2024
src/target/riscv/riscv-013.c Outdated Show resolved Hide resolved
This allows to examine each DM ones (e.g. enumerating harts assigned to
the DM). Additionaly, it is guaranteed that the DM is reset before the
examination.

Change-Id: I2333d06ff1152bf51c647d59baa55cb402054cb9
Signed-off-by: Evgeniy Naydanov <evgeniy.naydanov@syntacore.com>
According to the RISC-V Debug Spec (1.0.0-rc1)[3.7 Abstract Commands]:
> While an abstract command is executing (busy in abstractcs is high), a
debugger must not change hartsel, and must not write 1 to haltreq,
resumereq, ackhavereset, setresethaltreq, or clrresethaltreq.

The patch ensures the rule is followed.

Change-Id: Id7d363d9fdeb365181b7058e0ceb0be0df39654f
Signed-off-by: Evgeniy Naydanov <evgeniy.naydanov@syntacore.com>
@en-sc en-sc merged commit 740cdc7 into riscv-collab:riscv Apr 14, 2024
4 checks passed
@en-sc en-sc deleted the en-sc/check-ac-busy branch April 14, 2024 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants