Debug Module did not become active #1041

Closed
erhankur opened this issue Apr 11, 2024 · 14 comments

@erhankur
Contributor

We have a new RISC-V chip with a dual-core CPU (ESP32-P4).

The first core (LP) is in sleep mode by default, so it is out of scope for now. The second core (HP) has 2 harts.

The two cores are connected in a daisy-chain scheme: one DTM controls the LP core and the other DTM controls the HP core.

What is the expected way of configuring 2 harts in the same DM as SMP targets? I tried the config below, but it gives an error.

transport select jtag

set _CHIPNAME_0 lp
set _CHIPNAME_1 hp

jtag newtap $_CHIPNAME_0 cpu -irlen 5 -expected-id 0x0012c25
jtag newtap $_CHIPNAME_1 cpu -irlen 5 -expected-id 0x0012c25

set _TAPNAME_0 $_CHIPNAME_0.cpu
set _TAPNAME_1 $_CHIPNAME_1.cpu

set _TARGETNAME_0 $_CHIPNAME_0.cpu
set _TARGETNAME_1 $_CHIPNAME_1.cpu0
set _TARGETNAME_2 $_CHIPNAME_1.cpu1

target create $_TARGETNAME_1 riscv -chain-position $_TAPNAME_1 -coreid 0 -rtos FreeRTOS
target create $_TARGETNAME_2 riscv -chain-position $_TAPNAME_1 -coreid 1 -rtos FreeRTOS
target smp $_TARGETNAME_1 $_TARGETNAME_2
Info : Listening on port 6666 for tcl connections
Info : Listening on port 4444 for telnet connections
Info : clock speed 10000 kHz
Info : JTAG tap: lp.cpu tap/device found: 0x00012c25 (mfg: 0x612 (Espressif Systems), part: 0x0012, ver: 0x0)
Info : JTAG tap: hp.cpu tap/device found: 0x00012c25 (mfg: 0x612 (Espressif Systems), part: 0x0012, ver: 0x0)
Info : [hp.cpu0] dbgbase=0x0 coreid:0
Info : [hp.cpu0] dtmcontrol=0x70a1
Info : [hp.cpu0] Searching for DM with DMI base address (dbgbase) = 0x0
Info : [hp.cpu0] datacount=1 progbufsize=2
Info : [hp.cpu0] Detected 2 harts.
Info : [hp.cpu0]  XLEN=32, misa=0x40901125
Info : [hp.cpu0] Core 0 made part of halt group 1.
Info : [hp.cpu0] Examined RISC-V core; found 2 harts
Info : [hp.cpu0]  XLEN=32, misa=0x40901125
Info : [hp.cpu0] Examination succeed
Info : [hp.cpu1] dbgbase=0x0 coreid:1
Info : [hp.cpu1] dtmcontrol=0x70a1
Info : [hp.cpu1] Searching for DM with DMI base address (dbgbase) = 0x0
Error: [hp.cpu1] Debug Module did not become active. dmcontrol=0x0
Error: [hp.cpu1] Examination failed
Warn : target hp.cpu1 examination failed

In our fork, we have FreeRTOS SMP support and are mostly aligned with this repo.

log.txt

@TommyMurphyTM1234
Collaborator

TommyMurphyTM1234 commented Apr 11, 2024

Does the stuff from Espressif not help?

Edit: ah - sorry - you ARE Espressif! I didn't realise... :-D

@erhankur
Contributor Author

We have initial support for the chip, but I haven't been able to configure SMP properly yet. Before diving in, I'd like to make sure the riscv target already includes what we need.

@en-sc
Collaborator

en-sc commented Apr 11, 2024

@erhankur, I'm a bit confused by the log. It seems that in the version you use, logging from dump_field() is disabled on -d3.
(https://github.com/espressif/openocd-esp32/blob/master/src/target/riscv/riscv-013.c#L381-L383)
Please enable the logging in dump_field() and attach a new log (-d4 will be fine) -- without the info about the scans it is quite hard to be sure the issue is what I think it is.

For now, it seems this is the same issue as #1024 (TL;DR):

  • The first hart was successfully examined:
Info : 179 14 riscv-013.c:1929 examine(): [hp.cpu0] Debug Module @0x1211206a0
Debug: 180 14 riscv-013.c:495 dmi_scan(): [hp.cpu0] reset_delays_wait done
Debug: 181 14 riscv-013.c:466 increase_dmi_busy_delay(): [hp.cpu0] dtmcs_idle=7, dmi_busy_delay=1, ac_busy_delay=0
Debug: 182 14 riscv-013.c:453 dtmcontrol_scan(): DTMCS: 0x10000 -> 0x7ca1
Debug: 183 15 riscv-013.c:466 increase_dmi_busy_delay(): [hp.cpu0] dtmcs_idle=7, dmi_busy_delay=2, ac_busy_delay=0
Debug: 184 15 riscv-013.c:453 dtmcontrol_scan(): DTMCS: 0x10000 -> 0x7ca1
Info : 185 15 riscv-013.c:1934 examine(): [hp.cpu0] dm->was_reset
Debug: 186 15 riscv-013.c:5174 riscv013_invalidate_cached_progbuf(): [hp.cpu0] Invalidating progbuf cache
Info : 187 15 riscv-013.c:1958 examine(): [hp.cpu0] Debug Module dmcontrol=0x4010001

While examination of the second hart failed:

Info : 266 24 riscv-013.c:1929 examine(): [hp.cpu1] Debug Module @0x1211206a0
Debug: 267 24 riscv-013.c:5174 riscv013_invalidate_cached_progbuf(): [hp.cpu1] Invalidating progbuf cache
Debug: 268 24 riscv-013.c:495 dmi_scan(): [hp.cpu1] reset_delays_wait done
Debug: 269 24 riscv-013.c:466 increase_dmi_busy_delay(): [hp.cpu1] dtmcs_idle=7, dmi_busy_delay=1, ac_busy_delay=0
Debug: 270 24 riscv-013.c:453 dtmcontrol_scan(): DTMCS: 0x10000 -> 0x7ca1
Debug: 271 24 riscv-013.c:466 increase_dmi_busy_delay(): [hp.cpu1] dtmcs_idle=7, dmi_busy_delay=2, ac_busy_delay=0
Debug: 272 24 riscv-013.c:453 dtmcontrol_scan(): DTMCS: 0x10000 -> 0x7ca1
Error: 273 24 riscv-013.c:1954 examine(): [hp.cpu1] Debug Module did not become active. dmcontrol=0x0

It is hard to verify that the issue is indeed the same without knowing on which exact scan "reset_delays_wait done" occurred.

Also, to verify that the issue is with the write to dtmcs.dmireset, you can try removing the call to riscv reset_delays as a temporary workaround.
If you use riscv-tests/debug, the call is made here:
https://github.com/riscv-software-src/riscv-tests/blob/fd01e46314636666b5bed1f08465f734751a4cda/debug/testlib.py#L837
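
For background on why removing that call can help: riscv reset_delays N makes OpenOCD forget its learned Run-Test/Idle delays after N scans, deliberately re-triggering the busy -> dtmcs.dmireset recovery path (the very path suspected of being mishandled here). A simplified model of the mechanism, using illustrative names rather than the actual riscv-013.c internals:

#include <stdio.h>

/* Simplified model of OpenOCD's "riscv reset_delays N" test hook.
 * Names and structure are illustrative assumptions, not the actual
 * riscv-013.c internals. */
struct delays_model {
    int reset_delays_wait;  /* scans until delays are reset; -1 = off */
    int dmi_busy_delay;     /* learned Run-Test/Idle cycles per scan  */
};

static void before_each_scan(struct delays_model *d)
{
    if (d->reset_delays_wait >= 0 && d->reset_delays_wait-- == 0) {
        /* Forget the learned delay: the next DMI access will likely
         * come back busy, and recovery writes dtmcs.dmireset -- the
         * exact sequence the suspected HW bug trips over. */
        d->dmi_busy_delay = 0;
    }
}

int main(void)
{
    struct delays_model d = { .reset_delays_wait = 3, .dmi_busy_delay = 7 };
    for (int scan = 0; scan < 6; scan++) {
        before_each_scan(&d);
        printf("scan %d: dmi_busy_delay=%d\n", scan, d.dmi_busy_delay);
    }
    return 0;
}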

@erhankur
Contributor Author

@en-sc, thanks for the answer. Here is the -d4 log:

logd4.txt

@en-sc
Collaborator

en-sc commented Apr 11, 2024

Edit: there is an error in this analysis. Please see #1041 (comment)

Thanks!

This is not the same as #1024.

Seems like an OpenOCD bug:

Debug: 11161 48 riscv-013.c:453 dtmcontrol_scan(): DTMCS: 0x10000 -> 0x7ca1
Debug: 11225 48 riscv-013.c:395 dump_field(): 44b - 00000000 @10 -> + 00000000 @10; 2i
Debug: 11226 48 riscv-013.c:409 dump_field(): read: dmcontrol=0 {}
Debug: 11325 49 riscv-013.c:395 dump_field(): 44b w 00000001 @10 -> + 00000000 @10; 2i
Debug: 11326 49 riscv-013.c:403 dump_field(): write: dmcontrol=1 {dmactive=active}
Debug: 11327 49 riscv-013.c:409 dump_field(): read: dmcontrol=0 {}
...
Debug: 11391 49 riscv-013.c:395 dump_field(): 44b - 00000000 @10 -> + 00000001 @10; 2i
Debug: 11392 49 riscv-013.c:409 dump_field(): read: dmcontrol=1 {dmactive=active}
Error: 11393 49 riscv-013.c:1954 examine(): [hp.cpu1] Debug Module did not become active. dmcontrol=0x0

We successfully read dmcontrol.dmactive=active, yet report that the Debug Module did not become active.

I have refactored the relevant code recently. Can you try back-porting the commits from #1023 and checking whether the issue is still present?

@erhankur
Contributor Author

I cherry-picked 3 commits:


34d6fe367 (HEAD -> en-sc/check-ac-busy) target/riscv: check `abstractcs.busy`
8319eee9e target/riscv: introduce `examine_dm()` function
67b2a5a95 target/riscv: cache `abstractcs.busy` in `dm013_info_t`

The result is different now.

Info : Listening on port 6666 for tcl connections
Info : Listening on port 4444 for telnet connections
Info : clock speed 1000 kHz
Info : JTAG tap: lp.cpu tap/device found: 0x00012c25 (mfg: 0x612 (Espressif Systems), part: 0x0012, ver: 0x0)
Info : JTAG tap: hp.cpu tap/device found: 0x00012c25 (mfg: 0x612 (Espressif Systems), part: 0x0012, ver: 0x0)
Info : [hp.cpu0] datacount=1 progbufsize=2
Info : [hp.cpu0] Core 0 made part of halt group 1.
Info : [hp.cpu0] Examined RISC-V core; found 1 harts
Info : [hp.cpu0]  XLEN=32, misa=0x40901125
Info : [hp.cpu0] Examination succeed
Error: OpenOCD only supports Debug Module version 2 (0.13) and 3 (1.0), not 0 (dmstatus=0x0). This error might be caused by a JTAG signal issue. Try reducing the JTAG clock speed.
Error: [hp.cpu1] Examination failed
Warn : target hp.cpu1 examination failed

logd4-2.txt

@en-sc
Collaborator

en-sc commented Apr 11, 2024

@erhankur, sorry. I've made an error in the previous analysis.

I'm almost certain this is the same HW issue as #1024.

What was wrong in the first analysis:

Going step by step through the log snippet:

  • Write all ones to hartsello & hartselhi to determine hartsellen:
Debug: 10618 46 riscv-013.c:395 dump_field(): 44b w 07ffffc1 @10 -> + 00000000 @32; 0i
Debug: 10619 46 riscv-013.c:403 dump_field(): write: dmcontrol=0x7ffffc1 {dmactive=active hartsello=0x3ff hasel=multiple hartselhi=0x3ff}
Debug: 10620 46 riscv-013.c:409 dump_field(): read:
Debug: 10676 47 riscv-013.c:395 dump_field(): 44b - 00000000 @10 -> b 07ffffc1 @10; 0i
  • Returns busy. Increase delays.
Debug: 10677 47 riscv-013.c:466 increase_dmi_busy_delay(): [hp.cpu1] dtmcs_idle=7, dmi_busy_delay=1, ac_busy_delay=0
Debug: 10804 47 riscv-013.c:453 dtmcontrol_scan(): DTMCS: 0x10000 -> 0x7ca1
Debug: 10868 47 riscv-013.c:395 dump_field(): 44b - 00000000 @10 -> + 00000000 @10; 1i
Debug: 10869 47 riscv-013.c:409 dump_field(): read: dmcontrol=0 {}
  • Returns success. Now read back:
Debug: 10968 48 riscv-013.c:395 dump_field(): 44b r 00000000 @10 -> + 00000000 @10; 1i
Debug: 10969 48 riscv-013.c:409 dump_field(): read: dmcontrol=0 {}
Debug: 11033 48 riscv-013.c:395 dump_field(): 44b - 00000000 @10 -> b 00000000 @10; 1i
  • Returns busy. Increase delays.
Debug: 11034 48 riscv-013.c:466 increase_dmi_busy_delay(): [hp.cpu1] dtmcs_idle=7, dmi_busy_delay=2, ac_busy_delay=0
Debug: 11161 48 riscv-013.c:453 dtmcontrol_scan(): DTMCS: 0x10000 -> 0x7ca1
Debug: 11225 48 riscv-013.c:395 dump_field(): 44b - 00000000 @10 -> + 00000000 @10; 2i
Debug: 11226 48 riscv-013.c:409 dump_field(): read: dmcontrol=0 {}
  • Returns success. dmcontrol is zero -- HW issue. Seems like the read was aborted by dtmcs.dmireset.
  • OpenOCD writes dmactive to dmcontrol here, which is pointless. An error could have been reported earlier, and after the refactor it is reported earlier in such cases.
Debug: 11325 49 riscv-013.c:395 dump_field(): 44b w 00000001 @10 -> + 00000000 @10; 2i
Debug: 11326 49 riscv-013.c:403 dump_field(): write: dmcontrol=1 {dmactive=active}
Debug: 11327 49 riscv-013.c:409 dump_field(): read: dmcontrol=0 {}
Debug: 11391 49 riscv-013.c:395 dump_field(): 44b - 00000000 @10 -> + 00000001 @10; 2i
Debug: 11392 49 riscv-013.c:409 dump_field(): read: dmcontrol=1 {dmactive=active}
Error: 11393 49 riscv-013.c:1954 examine(): [hp.cpu1] Debug Module did not become active. dmcontrol=0x0
  • And here is the point where I made the error in the previous analysis. The current operation is a write, so the value returned by the scan is meaningless. But the decoding logs read: dmcontrol=1 {dmactive=active}. Here read does not mean the value was read from the DM; it means this is the value OpenOCD got back from the DMI scan (and that value is garbage when dmi.op on the previous scan was a write). This is exactly how I made the mistake in Debug Module did not become active #1041 (comment). See the sketch below.
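
To make the decoding pitfall concrete: on a JTAG DTM, each DMI scan shifts in a new request and simultaneously shifts out the response that completes the previous request. A minimal runnable sketch of this pipelining (a conceptual model with illustrative names, not the riscv-013.c implementation):

#include <stdint.h>
#include <stdio.h>

/* Conceptual simulation of JTAG DMI pipelining: the value shifted OUT
 * of the DR completes the PREVIOUS operation, not the request being
 * shifted IN. Illustrative only. */
enum dmi_op { DMI_NOP = 0, DMI_READ = 1, DMI_WRITE = 2 };

static uint32_t dm_regs[0x40];              /* fake DM register file */
static enum dmi_op pending_op = DMI_NOP;
static uint32_t pending_addr, pending_data;

/* One scan: shift in (op, addr, data); shift out the result of the
 * previously pending operation. */
static uint32_t dmi_scan(enum dmi_op op, uint32_t addr, uint32_t data)
{
    uint32_t shifted_out = 0;
    if (pending_op == DMI_READ)
        shifted_out = dm_regs[pending_addr];
    else if (pending_op == DMI_WRITE)
        dm_regs[pending_addr] = pending_data;  /* nothing to read back */
    pending_op = op;
    pending_addr = addr;
    pending_data = data;
    return shifted_out;
}

int main(void)
{
    /* Scan 1: request WRITE dmcontrol (0x10) = 1 (dmactive). */
    dmi_scan(DMI_WRITE, 0x10, 0x00000001);
    /* Scan 2: NOP, only to collect the write's completion. The value
     * shifted out is NOT a DM read -- logging it as
     * "read: dmcontrol=1 {dmactive=active}" is what misled me above. */
    printf("after write, shifted-out value = 0x%x (garbage)\n",
           dmi_scan(DMI_NOP, 0, 0));
    /* A real read needs its own READ scan plus one more scan. */
    dmi_scan(DMI_READ, 0x10, 0);
    printf("dmcontrol = 0x%x\n", dmi_scan(DMI_NOP, 0, 0));
    return 0;
}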

@en-sc
Collaborator

en-sc commented Apr 11, 2024

That a HW issue is at fault is also confirmed by the second log, after the cherry-pick.

Here, examination of the DM is skipped the second time, since OpenOCD has already learned hartsellen and so on by examining the DM while examining the first hart.

  • So now we select the hart (write dmcontrol.hartsello to 1):
Debug: 10114 44 riscv-013.c:400 dump_field(): 44b w 00010001 @10 -> + 00000000 @32; 0i
Debug: 10115 44 riscv-013.c:408 dump_field(): write: dmcontrol=0x10001 {dmactive=active hartsello=1}
Debug: 10116 44 riscv-013.c:414 dump_field(): read:
Debug: 10172 45 riscv-013.c:400 dump_field(): 44b - 00000000 @10 -> b 00010001 @10; 0i
  • Returned busy. Increase delays.
Debug: 10173 45 riscv-013.c:471 increase_dmi_busy_delay(): [hp.cpu1] dtmcs_idle=7, dmi_busy_delay=1, ac_busy_delay=0
Debug: 10300 45 riscv-013.c:458 dtmcontrol_scan(): DTMCS: 0x10000 -> 0x7ca1
Debug: 10364 45 riscv-013.c:400 dump_field(): 44b - 00000000 @10 -> + 00000000 @10; 1i
Debug: 10365 45 riscv-013.c:414 dump_field(): read: dmcontrol=0 {}
  • Returned success. Now the progbuf cache is invalidated:
Debug: 10366 45 riscv-013.c:5321 riscv013_invalidate_cached_progbuf(): [hp.cpu1] Invalidating progbuf cache
  • And read dmstatus:
Debug: 10465 45 riscv-013.c:400 dump_field(): 44b r 00000000 @11 -> + 00000000 @10; 1i
Debug: 10466 45 riscv-013.c:414 dump_field(): read: dmcontrol=0 {}
Debug: 10530 46 riscv-013.c:400 dump_field(): 44b - 00000000 @11 -> b 00000000 @11; 1i
  • Returned busy. Increase delays.
Debug: 10531 46 riscv-013.c:471 increase_dmi_busy_delay(): [hp.cpu1] dtmcs_idle=7, dmi_busy_delay=2, ac_busy_delay=0
Debug: 10658 46 riscv-013.c:458 dtmcontrol_scan(): DTMCS: 0x10000 -> 0x7ca1
Debug: 10722 46 riscv-013.c:400 dump_field(): 44b - 00000000 @11 -> + 00000000 @11; 2i
Debug: 10723 46 riscv-013.c:414 dump_field(): read: dmstatus=0 {}
  • The same result for a different register. Now, after dtmcs.dmireset was written, the read was aborted (or at least this is how I see it), and dmstatus is 0 (before the patches it was dmcontrol). So an error is returned:
Error: 10724 46 riscv-013.c:809 dmstatus_read_timeout(): OpenOCD only supports Debug Module version 2 (0.13) and 3 (1.0), not 0 (dmstatus=0x0). This error might be caused by a JTAG signal issue. Try reducing the JTAG clock speed.
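
For reference on the error itself: it comes from decoding dmstatus.version, which the RISC-V Debug Specification puts in dmstatus[3:0] (0 = no DM present, 1 = spec 0.11, 2 = spec 0.13, 3 = spec 1.0). A minimal sketch of such a check, using an illustrative function name rather than the actual dmstatus_read_timeout() code:

#include <stdint.h>
#include <stdio.h>

/* dmstatus.version lives in bits [3:0] per the RISC-V Debug Spec:
 * 0 = no DM present, 1 = spec 0.11, 2 = spec 0.13, 3 = spec 1.0.
 * Illustrative check only, not the actual riscv-013.c code. */
static int check_dm_version(uint32_t dmstatus)
{
    unsigned version = dmstatus & 0xf;
    if (version == 2 || version == 3)
        return 0;
    fprintf(stderr, "OpenOCD only supports Debug Module version 2 (0.13) "
            "and 3 (1.0), not %u (dmstatus=0x%x)\n", version, dmstatus);
    return -1;
}

int main(void)
{
    /* The aborted scan returned all zeroes, so dmstatus decodes as
     * version 0 and the examination fails with the error above. */
    check_dm_version(0x0);
    return 0;
}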

@erhankur
Contributor Author

@en-sc, thanks a lot for the analysis. I will confirm the bug with the HW team.

@TommyMurphyTM1234
Collaborator

  • The same result for a different register. Now, after dtmcs.dmireset was written, the read was aborted (or at least this is how I see it), and dmstatus is 0 (before the patches it was dmcontrol). So an error is returned:

Just curious - do the two (or more?) RISC-V implementations that seem to have this issue share any common heritage, such as using the same upstream HDL implementation, by any chance?

@erhankur
Contributor Author

The bug was confirmed with the HW team. I am closing the issue. Thanks for helping.

@TommyMurphyTM1234
Collaborator

The bug was confirmed with the HW team. I am closing the issue. Thanks for helping.

Thanks for the update @erhankur.

Are you at liberty to answer the question below, or to outline the origin of your RISC-V hardware implementation - e.g. whether it's based on some public/open-source HDL implementation? It might be useful to know whether this problem, which has been reported separately several times, arises from some specific RISC-V implementation.

Just curious - do the two (or more?) RISC-V implementations that seem to have this issue have any sort of common heritage such as using the same upstream HDL implementation by any chance?

@erhankur
Contributor Author

@TommyMurphyTM1234 I am sorry, I am not authorized to disclose this information.

@TommyMurphyTM1234
Collaborator

@TommyMurphyTM1234 I am sorry, I am not authorized to disclose this information.

Ok, thanks.
