-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RSDK-4862 - Add odometer sensor process #307
RSDK-4862 - Add odometer sensor process #307
Conversation
@@ -0,0 +1,303 @@ | |||
package sensorprocess |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved lidar related tests from sensorprocess_test.go
out into this file to make them easier to read.
@@ -0,0 +1,344 @@ | |||
package sensorprocess |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved movement sensor related tests from sensorprocess_test.go
out into this file to make them easier to read.
"github.com/viamrobotics/viam-cartographer/sensors/inject" | ||
) | ||
|
||
func TestStartLidar(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TestStartLidar
was copy/pasted from sensorprocess_test.go
}) | ||
} | ||
|
||
func TestAddLidarReadingInOnline(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TestAddLidarReadingInOnline
was copy/pasted from sensorprocess_test.go
, except for the helper functions that are called: I renamed them.
} | ||
|
||
t.Run("returns error when lidar GetData returns error, doesn't try to add lidar data", func(t *testing.T) { | ||
invalidAddLidarReadingInOnlineTestHelper(context.Background(), t, cf, config, s.LidarWithErroringFunctions, 10) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
invalidAddLidarReadingInOnlineTestHelper
used to be invalidOnlineModeLidarTestHelper
}) | ||
} | ||
|
||
func TestTryAddMovementSensorReadingOnce(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TestTryAddMovementSensorReadingOnce
is copy/pasted from sensorprocess_test.go
but was previously named TestTryAddIMUReading
. It was altered and augmented.
}) | ||
} | ||
|
||
func TestTryAddIMUReading(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TestTryAddIMUReading
was added.
}) | ||
} | ||
|
||
func TestTryAddOdometerReading(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TestTryAddOdometerReading
was added.
@@ -107,36 +108,36 @@ func (ms *MovementSensor) TimedMovementSensorReading(ctx context.Context) (Timed | |||
|
|||
timeoutCtx, cancel := context.WithTimeout(ctx, timedMovementSensorReadingTimeout) | |||
defer cancel() | |||
if ms.imuSupported { | |||
imuLoop: | |||
if ms.odometerSupported { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding odometer first, since we want to prioritize it over the imu
@@ -24,7 +24,7 @@ const ( | |||
) | |||
|
|||
var ( | |||
defaultTime = time.Time{} | |||
undefinedTime = time.Time{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed this since I kept getting confused with defaultTime
, assuming it was set to a specific time, when really it's just used to denote that it was not set to any meaningful value yet.
dd277d4
to
e25e0a0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made a first (if detailed) run through of the PR. Looks good in terms of logic and testing. Some nits/opinions/optimizations as well as several questions to be answered.
I also keep coming back to the array sort done in offline mode, I feel like there should be a better wy to do this but I haven't come up with a better solution yet
sensors/test_deps.go
Outdated
func getFinishedReplayOdometer() *inject.MovementSensor { | ||
odometer := &inject.MovementSensor{} | ||
odometer.PositionFunc = func(ctx context.Context, extra map[string]interface{}) (*geo.Point, float64, error) { | ||
return geo.NewPoint(0, 0), 0.0, replay.ErrEndOfDataset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe geo.NewPoint(0,0)
be nil for an erroring function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer math.Nan
for things that actually mean something physical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you would want geo.NewPoint(math.Nan, math.Nan)
? I suggested nil because the return of Position is a geo.Point
pointer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: I also just checked RDK and movement sensors that dont implement Position (such as imuwit) returns geo.NewPoint(0,0). This isn't my favorite pattern to support as if an error is returned I'd like the result to be inaccessible and not take up any memory (so nil)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^ No, not all; our preferred pattern with an error is math.Nan
. We just haven't gotten around to changing them yet. So you're looking at a bad example. Also the ones that truly don't have out ErrUnimplemented returned, so you can't actually get info out of that client call at all.
The one that does implement Position and has an error associated with it is GPS, where if it is erroring, we return math.NaN.
Also the problem here is if you return it as nil, and someone else uses the geoPoint library that we're wrapping on that nil pointer - like position.MidPointTo
you get panics all over the place. I'd rather give NaNs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The difference here is - the witimu can never return position - it doesn't have that data or a chance at returning that data at all, so we use the movement sensor error unimplemented and a PositionReporting: false Property to make sure that doesn't show up in the sdks or the rc card. So you just get an error unimplemented if you try to get any of that information for business level logic out of the wit imu.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed the return values to be return geo.NewPoint(math.NaN(), math.NaN()), math.NaN(), replay.ErrEndOfDataset
. Does that look good?
If yes, would this also mean we'd want to return structs populated with NaN
values instead of empty structs? That is, instead of this:
return r3.Vector{}, replay.ErrEndOfDataset
we'd want to do this?
return r3.Vector{math.NaN, math.NaN, math.NaN}, replay.ErrEndOfDataset
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^ Since those aren't returned pointers like the *geo.Point return from Position: I'm not as concerned about the r3 library accessing them after they're returned. So it is up to you in this pr.
Is cartographer seeing zero values and taking them as meaningful? That's a question for another day, maybe it has a safeguard against a constant stream of zero data, some algorithms do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBR (To be researched) about carto's handling of constant streams of zeros
@@ -120,19 +123,55 @@ func onlineModeLidarTestHelper( | |||
} | |||
} | |||
|
|||
func onlineModeIMUTestHelper( | |||
func invalidAddLidarReadingInOnlineTestHelper( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[q] what makes this function invalid? The addLidarReadingFunc returns a nil err every time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a test helper, which means that it is being called with "invalid" lidar sensors that will cause addLidarReadingInOnline
to fail. addLidarReadingFunc
is just there to count the number of times it is being called so we can test at the end that it has not been called at all.
Note: The function was previously named invalidOnlineModeLidarTestHelper
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to remove the invalid if a good lidar COULD be passed into the function and result in valid information. That reduces the confusion for:
invalidAddLidarReadingInOnlineTestHelper(invalidLiDAR) -> invalid result
invalidAddLidarReadingInOnlineTestHelper(validLiDAR) -> valid result <-------
to
addLidarReadingInOnlineTestHelper(invalidLiDAR) -> invalid result
addLidarReadingInOnlineTestHelper(validLiDAR) -> valid result
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It has to be "invalid" because it actually ensures that things fail. There's another, similar function, where we check that it's valid. Won't change anything now since refactoring this test is not part of the scope for this ticket, but will keep it in mind when I work on integration tests. (note: this test already existed the same way, I just renamed the name function to be more correct)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments would be good in the places I note, also I'd like the case selector to be a tad bit neater, and why do we need that closed
bool?
Edit: I'm not looking at the tests at all, but I trust they're doing good things! Lots of them!
viam_cartographer.go
Outdated
subAlgo SubAlgo | ||
|
||
muClose sync.Mutex | ||
closed bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this doing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you have to tell the C++ layer that the carto facade is closed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it is used in the endpoints to indicate that they're called after the Close
function has already been initiated, e.g.:
cartoSvc.muClose.Lock()
cartoSvcClosed := cartoSvc.closed
cartoSvc.muClose.Unlock()
if cartoSvcClosed {
cartoSvc.logger.Warn("Position called after closed")
return nil, "", ErrClosed
}
muClose
locks while Close
is running and thus blocks all endpoints from attempting to get data from the C++ layer as it's being shut down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note Rand, This came up in an eng talk by Tess six months back. Someone told me that the mod manager should be handling this now, but when I talked to Benji, he had no idea what I was talking about and didn't know if it was being handled so we decided to keep this closed check in, until we know for certain.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a slakc thread about this, it's not mod manager, it's the resource graph that handles removing the resource nodes from the graph (modules included). Its not something other modules do, since if it's closed the client only returns an error for a resource that's not there anymore.
Like if you try to import the resource from the robot after it's closed and not built again, the FromRobot
, FromDep
call in your code just fails. If it doesn't, that's a netcode bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the only case to keep this is if cloud slam is directly importing cartographer methods outside of the resource graph.
For example, gpsnmea
has public functions that are imported directly from within rdk by gpstrk
(this is something we want to fix but never have bandwidth to), so we have a closed bool there to account for one structure being imported directly outside of the resource graph logic.
If nothing is doing so with this module then it's unnecessary, but I want to be exhaustive about the potential gotchas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kat, have you seen any of these errors or logs surface at all? Is the C++ layer taking a long time to shut down and thus we are in danger of calling an endpoint directly before the resource graph removes this resource node?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have neither paid attention nor noticed it. Can we create a separate ticket to look into this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then let's not add code that might not do anything at all and wasn't hurting the implementation, but does seem to be important for anyone reading the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was referring to your mutexes, okay to leave removing the closed checks to another pr.
e25e0a0
to
6419e70
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! see one or two last comments. @randhid can decide if we want to remove the closed
checks in each endpoints in this PR or a future one.
Co-authored-by: jeremyrhyde <34897732+jeremyrhyde@users.noreply.github.com>
Ticket: https://viam.atlassian.net/browse/RSDK-4862
Tested on:
Cloud slam - live mode:
NOTE: Unbeknownst to me, the IMU was providing only zero values. The map quality in the results below is therefore not representative of what the actual performance likely looks like. But the maps do still serve as a proof that adding IMU and odometer data is successful.