This repository has been archived by the owner on Oct 1, 2023. It is now read-only.

CAN motor controllers don't move and report timeout/stale CAN messages #402

Closed
calcmogul opened this issue Mar 15, 2022 · 6 comments
Labels
bug (Something isn't working), component: generation (Generation of robot projects)

Comments

@calcmogul
Member

calcmogul commented Mar 15, 2022

We've been getting widespread reports of CAN motor controllers not moving after either SysId binary (C++ with an RT thread) is deployed. It's gotten so bad that people just aren't using the tool, and most teams use CAN motor controllers.

We've confirmed they're using the new roboRIO image, and the console prints messages about CAN message receive timeouts and the messages being stale. The robots under test generally only had about 4 motors installed, so CAN bus overload is unlikely. We've gotten a lot more reports with Falcons than with NEOs, but that could be due to relative market share or to the proportion of each group of teams that try the tool and tell us they had issues.

3512 has gotten stale messages in non-SysId robot code too (C++ with an RT thread) with Spark Maxes, even after increasing all the non-essential CAN bus status frame periods to 500 ms. CAN bus utilization was also low (I wasn't told exactly how low). I don't know if it was ever addressed.
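For a rough sense of why low utilization is plausible at those periods, here is a back-of-the-envelope sketch. It is not a measurement. The assumptions are mine, not from the reports: a 1 Mbit/s bus, about 131 bits per frame (29-bit ID, 8 data bytes, 3-bit interframe space, bit stuffing ignored), and a hypothetical 3 periodic status frames per controller.

```python
# Rough CAN bus utilization estimate from periodic status frame periods.
# All constants are assumptions for illustration, not measured values.
CAN_BITRATE_BPS = 1_000_000  # assumed bus bitrate: 1 Mbit/s
BITS_PER_FRAME = 131  # assumed: extended ID, 8 data bytes, no bit stuffing


def bus_utilization(frame_periods_ms,
                    bitrate_bps=CAN_BITRATE_BPS,
                    bits_per_frame=BITS_PER_FRAME):
    """Return the fraction of bus bandwidth used by the periodic frames.

    frame_periods_ms holds one entry per periodic frame on the bus,
    giving that frame's period in milliseconds.
    """
    bits_per_second = sum(bits_per_frame * 1000.0 / period
                          for period in frame_periods_ms)
    return bits_per_second / bitrate_bps


# Hypothetical setup: 9 controllers, 3 status frames each, all at 500 ms.
periods = [500] * (9 * 3)
print(f"Estimated utilization: {bus_utilization(periods):.2%}")
```

Under these assumptions the periodic status traffic is well under 1% of the bus, so stale messages at those periods would point away from simple bandwidth saturation. Control frames and faster essential status frames are not counted here.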

At this point, I think the most likely cause is the vendordeps not playing nicely together since the SysId binary links them all in.

WPILib has no physical hardware, so we have no way to troubleshoot this. One way someone with hardware could try to narrow down the issue is to remove vendordeps from sysid-projects one at a time, rebuilding and redeploying after each removal, until the problem goes away.
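The elimination procedure can be sketched as a loop. This is only an illustration of the bookkeeping. `reproduces` is a hypothetical callback standing in for the manual rebuild, redeploy, and test step, and the vendordep names are placeholders.

```python
def remove_until_fixed(vendordeps, reproduces):
    """Drop vendordeps one at a time until the problem stops reproducing.

    vendordeps: list of vendordep names currently linked in.
    reproduces: hypothetical callback; given the list of vendordeps still
        linked, returns True if the problem still occurs after a
        rebuild and redeploy.
    Returns (remaining, removed). If removed is non-empty and the problem
    stopped, the last entry of removed is the one whose removal made the
    problem go away.
    """
    remaining = list(vendordeps)
    removed = []
    while reproduces(remaining):
        if not remaining:
            # Problem persists with no vendordeps linked at all.
            break
        removed.append(remaining.pop())
    return remaining, removed


# Simulated run: pretend the problem occurs whenever "vendor_b" is linked.
deps = ["vendor_a", "vendor_b", "vendor_c"]
remaining, removed = remove_until_fixed(deps, lambda linked: "vendor_b" in linked)
print("still linked:", remaining)
print("removed, in order:", removed)
```

A single pass like this only identifies one vendordep whose removal helps. If the cause is an interaction between two vendordeps, adding the others back and repeating would be needed to confirm it.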

@calcmogul calcmogul added the bug label Mar 15, 2022
@Piphi5
Collaborator

Piphi5 commented Mar 15, 2022

I have access to Spark Maxes and NEOs (for a flywheel test) but I haven't been able to reproduce the issue for that hardware with SysId. Is the trend primarily an issue with drivetrain setups?

@calcmogul calcmogul changed the title CAN motor controllers report timeout/stale CAN messages CAN motor controllers don't move and report timeout/stale CAN messages Mar 15, 2022
@calcmogul
Member Author

calcmogul commented Mar 15, 2022

Josh (2363) on Discord says they experienced the issue with a 2 NEO shooter. They have 8 NEOs for drivetrain, 2 NEOs for shooter, 1 NEO 550 for hood, 1 NEO 550 for feeder, and 1 Talon SRX for intake.

@calcmogul
Member Author

calcmogul commented Mar 15, 2022

3512 has 9 NEOs, and they also run TimedRobot as real-time at 5 ms, like SysId does. The issue didn't occur in 2020/2021, when they had 13 motors (status frame periods were increased when possible), but it started occurring in 2022.

@modelmat
Contributor

modelmat commented Mar 24, 2022

Have confirmation from team 5032 here that a version of the code (.exe) with only the CTRE vendordep works.

Source for that exe is https://github.com/modelmat/sysid/tree/ctre-only

N.B. a branch with modified status frame periods is at https://github.com/modelmat/sysid/tree/status-frames (exe here), but I was waiting on a response from 5032 about whether they had time to quickly test whether it works (EDIT: They didn't). I haven't been able to get into the workshop myself yet like I thought I would. There is also a https://github.com/modelmat/sysid/tree/no-rt branch, which caused the code to crash (edit 1/4/22: sysid crashed when deploying tests. This seems to be a problem specific to my team; all the dev builds seem to crash sysid for my team. So it might be worth someone else trying).

@trevnels

6502 may have seen similar behavior during this past season on Spark MAXes. If I remember correctly, we saw CAN timeouts after deploying sysid, but motors would move when enabled, just in an erratic fashion (as if they were quickly starting and stopping). Fully power cycling the robot after deploying sysid usually resolved it.

Back then, my hunch was that there was something going wrong during the transition between previously deployed code and sysid which was fixed by a reboot, though I'm not familiar enough with sysid's internals to know if this theory makes sense.

@calcmogul calcmogul added the component: generation label Sep 20, 2023
@calcmogul
Member Author

OBE (overtaken by events) by #518. This only ever occurred with the SysId embedded binaries, not user robot programs.

@calcmogul calcmogul closed this as not planned Sep 21, 2023
4 participants