Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RSDK-4862 - Add odometer sensor process #307

Merged
merged 21 commits into from
Jan 8, 2024

Conversation

kkufieta
Copy link
Contributor

@kkufieta kkufieta commented Dec 11, 2023

Ticket: https://viam.atlassian.net/browse/RSDK-4862

Tested on:

  • Intel Mac w/ lidar
  • Cloud slam (live mode) results show an overall improvement in mapping quality based on this PR. They also show a similar map quality for both IMU and IMU + odometer mapping, though the latter leads to higher confidence sooner than just using an IMU. I could not test mapping using just an odometer because cloud slam won't start unless the data manager is capturing IMU data.

Cloud slam - live mode:
NOTE: Unbeknownst to me, the IMU was providing only zero values. The map quality in the results below is therefore not representative of what the actual performance likely looks like. But the maps do still serve as a proof that adding IMU and odometer data is successful.

main branch - w/ imu this PR - w/ imu this PR - w/ imu & odometer
final result Screenshot 2023-12-14 at 12 40 51 PM Screenshot 2023-12-14 at 12 41 22 PM Screenshot 2023-12-14 at 12 41 39 PM
1st position Screenshot 2023-12-14 at 11 46 46 AM Screenshot 2023-12-14 at 11 56 15 AM Screenshot 2023-12-14 at 11 28 54 AM
2nd position Screenshot 2023-12-14 at 11 47 41 AM Screenshot 2023-12-14 at 11 56 53 AM Screenshot 2023-12-14 at 11 30 30 AM
3rd position Screenshot 2023-12-14 at 11 48 29 AM Screenshot 2023-12-14 at 11 57 54 AM Screenshot 2023-12-14 at 11 31 45 AM
4th position Screenshot 2023-12-14 at 11 49 08 AM Screenshot 2023-12-14 at 11 58 44 AM Screenshot 2023-12-14 at 11 33 02 AM

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Dec 11, 2023
@kkufieta kkufieta added the appimage Build AppImage of PR label Dec 12, 2023
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Dec 19, 2023
@@ -0,0 +1,303 @@
package sensorprocess
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved lidar related tests from sensorprocess_test.go out into this file to make them easier to read.

@@ -0,0 +1,344 @@
package sensorprocess
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved movement sensor related tests from sensorprocess_test.go out into this file to make them easier to read.

"github.com/viamrobotics/viam-cartographer/sensors/inject"
)

func TestStartLidar(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestStartLidar was copy/pasted from sensorprocess_test.go

})
}

func TestAddLidarReadingInOnline(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestAddLidarReadingInOnline was copy/pasted from sensorprocess_test.go, except for the helper functions that are called: I renamed them.

}

t.Run("returns error when lidar GetData returns error, doesn't try to add lidar data", func(t *testing.T) {
invalidAddLidarReadingInOnlineTestHelper(context.Background(), t, cf, config, s.LidarWithErroringFunctions, 10)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

invalidAddLidarReadingInOnlineTestHelper used to be invalidOnlineModeLidarTestHelper

})
}

func TestTryAddMovementSensorReadingOnce(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestTryAddMovementSensorReadingOnce is copy/pasted from sensorprocess_test.go but was previously named TestTryAddIMUReading. It was altered and augmented.

})
}

func TestTryAddIMUReading(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestTryAddIMUReading was added.

})
}

func TestTryAddOdometerReading(t *testing.T) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TestTryAddOdometerReading was added.

@@ -107,36 +108,36 @@ func (ms *MovementSensor) TimedMovementSensorReading(ctx context.Context) (Timed

timeoutCtx, cancel := context.WithTimeout(ctx, timedMovementSensorReadingTimeout)
defer cancel()
if ms.imuSupported {
imuLoop:
if ms.odometerSupported {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding odometer first, since we want to prioritize it over the imu

@@ -24,7 +24,7 @@ const (
)

var (
defaultTime = time.Time{}
undefinedTime = time.Time{}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed this since I kept getting confused with defaultTime, assuming it was set to a specific time, when really it's just used to denote that it was not set to any meaningful value yet.

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Dec 28, 2023
Copy link
Contributor

@jeremyrhyde jeremyrhyde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a first (if detailed) run through of the PR. Looks good in terms of logic and testing. Some nits/opinions/optimizations as well as several questions to be answered.

I also keep coming back to the array sort done in offline mode, I feel like there should be a better wy to do this but I haven't come up with a better solution yet

viam_cartographer.go Outdated Show resolved Hide resolved
sensors/test_deps.go Show resolved Hide resolved
sensors/test_deps.go Show resolved Hide resolved
sensors/test_deps.go Outdated Show resolved Hide resolved
func getFinishedReplayOdometer() *inject.MovementSensor {
odometer := &inject.MovementSensor{}
odometer.PositionFunc = func(ctx context.Context, extra map[string]interface{}) (*geo.Point, float64, error) {
return geo.NewPoint(0, 0), 0.0, replay.ErrEndOfDataset
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe geo.NewPoint(0,0) be nil for an erroring function?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer math.Nan for things that actually mean something physical.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you would want geo.NewPoint(math.Nan, math.Nan)? I suggested nil because the return of Position is a geo.Point pointer

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: I also just checked RDK and movement sensors that dont implement Position (such as imuwit) returns geo.NewPoint(0,0). This isn't my favorite pattern to support as if an error is returned I'd like the result to be inaccessible and not take up any memory (so nil)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ No, not all; our preferred pattern with an error is math.Nan. We just haven't gotten around to changing them yet. So you're looking at a bad example. Also the ones that truly don't have out ErrUnimplemented returned, so you can't actually get info out of that client call at all.

The one that does implement Position and has an error associated with it is GPS, where if it is erroring, we return math.NaN.

Also the problem here is if you return it as nil, and someone else uses the geoPoint library that we're wrapping on that nil pointer - like position.MidPointTo you get panics all over the place. I'd rather give NaNs.

Copy link

@randhid randhid Dec 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference here is - the witimu can never return position - it doesn't have that data or a chance at returning that data at all, so we use the movement sensor error unimplemented and a PositionReporting: false Property to make sure that doesn't show up in the sdks or the rc card. So you just get an error unimplemented if you try to get any of that information for business level logic out of the wit imu.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed the return values to be return geo.NewPoint(math.NaN(), math.NaN()), math.NaN(), replay.ErrEndOfDataset. Does that look good?

If yes, would this also mean we'd want to return structs populated with NaN values instead of empty structs? That is, instead of this:

return r3.Vector{}, replay.ErrEndOfDataset

we'd want to do this?

return r3.Vector{math.NaN, math.NaN, math.NaN}, replay.ErrEndOfDataset

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ Since those aren't returned pointers like the *geo.Point return from Position: I'm not as concerned about the r3 library accessing them after they're returned. So it is up to you in this pr.

Is cartographer seeing zero values and taking them as meaningful? That's a question for another day, maybe it has a safeguard against a constant stream of zero data, some algorithms do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBR (To be researched) about carto's handling of constant streams of zeros

sensorprocess/sensorprocess_test.go Show resolved Hide resolved
@@ -120,19 +123,55 @@ func onlineModeLidarTestHelper(
}
}

func onlineModeIMUTestHelper(
func invalidAddLidarReadingInOnlineTestHelper(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[q] what makes this function invalid? The addLidarReadingFunc returns a nil err every time

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a test helper, which means that it is being called with "invalid" lidar sensors that will cause addLidarReadingInOnline to fail. addLidarReadingFunc is just there to count the number of times it is being called so we can test at the end that it has not been called at all.

Note: The function was previously named invalidOnlineModeLidarTestHelper

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to remove the invalid if a good lidar COULD be passed into the function and result in valid information. That reduces the confusion for:

  • invalidAddLidarReadingInOnlineTestHelper(invalidLiDAR) -> invalid result
  • invalidAddLidarReadingInOnlineTestHelper(validLiDAR) -> valid result <-------

to

  • addLidarReadingInOnlineTestHelper(invalidLiDAR) -> invalid result
  • addLidarReadingInOnlineTestHelper(validLiDAR) -> valid result

Copy link
Contributor Author

@kkufieta kkufieta Jan 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It has to be "invalid" because it actually ensures that things fail. There's another, similar function, where we check that it's valid. Won't change anything now since refactoring this test is not part of the scope for this ticket, but will keep it in mind when I work on integration tests. (note: this test already existed the same way, I just renamed the name function to be more correct)

sensorprocess/testhelper.go Outdated Show resolved Hide resolved
sensorprocess/testhelper.go Outdated Show resolved Hide resolved
sensorprocess/movementsensorprocess.go Show resolved Hide resolved
Copy link

@randhid randhid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments would be good in the places I note, also I'd like the case selector to be a tad bit neater, and why do we need that closed bool?

Edit: I'm not looking at the tests at all, but I trust they're doing good things! Lots of them!

sensorprocess/movementsensorprocess.go Show resolved Hide resolved
sensorprocess/sensorprocess.go Show resolved Hide resolved
sensorprocess/sensorprocess.go Show resolved Hide resolved
subAlgo SubAlgo

muClose sync.Mutex
closed bool
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this doing?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you have to tell the C++ layer that the carto facade is closed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it is used in the endpoints to indicate that they're called after the Close function has already been initiated, e.g.:

	cartoSvc.muClose.Lock()
	cartoSvcClosed := cartoSvc.closed
	cartoSvc.muClose.Unlock()
	if cartoSvcClosed {
		cartoSvc.logger.Warn("Position called after closed")
		return nil, "", ErrClosed
	}

muClose locks while Close is running and thus blocks all endpoints from attempting to get data from the C++ layer as it's being shut down.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note Rand, This came up in an eng talk by Tess six months back. Someone told me that the mod manager should be handling this now, but when I talked to Benji, he had no idea what I was talking about and didn't know if it was being handled so we decided to keep this closed check in, until we know for certain.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a slakc thread about this, it's not mod manager, it's the resource graph that handles removing the resource nodes from the graph (modules included). Its not something other modules do, since if it's closed the client only returns an error for a resource that's not there anymore.

Like if you try to import the resource from the robot after it's closed and not built again, the FromRobot, FromDep call in your code just fails. If it doesn't, that's a netcode bug.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the only case to keep this is if cloud slam is directly importing cartographer methods outside of the resource graph.

For example, gpsnmea has public functions that are imported directly from within rdk by gpstrk (this is something we want to fix but never have bandwidth to), so we have a closed bool there to account for one structure being imported directly outside of the resource graph logic.

If nothing is doing so with this module then it's unnecessary, but I want to be exhaustive about the potential gotchas.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Kat, have you seen any of these errors or logs surface at all? Is the C++ layer taking a long time to shut down and thus we are in danger of calling an endpoint directly before the resource graph removes this resource node?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have neither paid attention nor noticed it. Can we create a separate ticket to look into this?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then let's not add code that might not do anything at all and wasn't hurting the implementation, but does seem to be important for anyone reading the code.

Copy link

@randhid randhid Jan 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was referring to your mutexes, okay to leave removing the closed checks to another pr.

viam_cartographer.go Show resolved Hide resolved
viam_cartographer.go Show resolved Hide resolved
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Dec 30, 2023
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Jan 3, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Jan 3, 2024
config/config.go Outdated Show resolved Hide resolved
Copy link
Contributor

@jeremyrhyde jeremyrhyde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! see one or two last comments. @randhid can decide if we want to remove the closed checks in each endpoints in this PR or a future one.

Co-authored-by: jeremyrhyde <34897732+jeremyrhyde@users.noreply.github.com>
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Jan 5, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Jan 5, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Jan 8, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Jan 8, 2024
@kkufieta kkufieta merged commit 2f92320 into viam-modules:main Jan 8, 2024
7 checks passed
@kkufieta kkufieta deleted the kk/rsdk-4862-sensorprocess-2 branch January 8, 2024 15:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
appimage Build AppImage of PR safe to test This pull request is marked safe to test from a trusted zone
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants