Replies: 3 comments
-
I think that makes sense. It's somewhat related to […]. I guess in your suggested API the predictor would be more abstract and not tied to a particular […]. It seems like one potential concern here could be that the interface isn't very pytorchic - there wouldn't be a clear mapping to the […].
-
I also think that this is a reasonable idea. @Turakar if you want to take a stab at this, please feel free to put up a PR draft!
-
I am currently writing my MA thesis, and as such I am only taking on projects that contribute to its completion at the moment. After that, I am open to taking a look at this if nobody else claims it before me.
-
Recently, I wrote a custom `ExactGP` implementation for some special sparsity structure I exploited. It uses an alternative approach to the caching of the `ExactPredictionStrategy` that GPyTorch currently uses: for prediction, you first create a new predictor object, which can then be used like a normal module:
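A minimal sketch of what such a `Predictor` could look like (illustrative names, not the actual implementation; it assumes the usual `mean_module`/`covar_module` layout of an `ExactGP`, a `GaussianLikelihood`, and a GPyTorch version whose lazy kernel matrices expose `to_dense()`). It precomputes the Cholesky factor and representer weights once, so repeated predictions only reuse the cache:

```python
import torch
import gpytorch


class Predictor(torch.nn.Module):
    """Caches everything needed for exact GP prediction at construction time.

    Hypothetical sketch, not GPyTorch API: assumes `model.mean_module` and
    `model.covar_module` exist and `likelihood` is a GaussianLikelihood.
    """

    def __init__(self, model, likelihood):
        super().__init__()
        self.model = model
        self.likelihood = likelihood
        (train_x,) = model.train_inputs
        train_y = model.train_targets
        with torch.no_grad():
            mean = model.mean_module(train_x)
            covar = model.covar_module(train_x).to_dense()
            noise = likelihood.noise
            eye = torch.eye(train_x.size(0), dtype=covar.dtype, device=covar.device)
            # Factor K(X, X) + sigma^2 I once; this is the posterior cache.
            self.chol = torch.linalg.cholesky(covar + noise * eye)
            # Representer weights alpha = (K + sigma^2 I)^{-1} (y - m(X)).
            self.alpha = torch.cholesky_solve((train_y - mean).unsqueeze(-1), self.chol)
        self.train_x = train_x

    def forward(self, x):
        # Latent posterior at x, computed purely from the cached quantities;
        # apply `self.likelihood` to the result for observed-value predictions.
        with torch.no_grad():
            k_star = self.model.covar_module(x, self.train_x).to_dense()
            mean = self.model.mean_module(x) + (k_star @ self.alpha).squeeze(-1)
            v = torch.cholesky_solve(k_star.transpose(-1, -2), self.chol)
            covar = self.model.covar_module(x).to_dense() - k_star @ v
        return gpytorch.distributions.MultivariateNormal(mean, covar)


# Intended usage: build the cache once, explicitly, then predict cheaply.
# predictor = Predictor(model, likelihood)
# posterior = predictor(test_x)
```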
Here, `predictor` caches all the necessary values and keeps an internal reference to `model`, which makes GPU memory usage more predictable (related to #1619, #1787, #1859). This could also easily be extended to support #567. This way, one can decide when to create the posterior cache, what it should contain, and how long to keep it.

I personally like this API more, but this might be a matter of preference. I do, of course, understand that just calling `model(x)` for predictions might be more intuitive, but IMHO Gaussian processes are not as easy as that. Keeping track of posterior caches is very important for GPU memory usage and latency during prediction.

This is not necessarily a breaking change - it should be possible to support both APIs simultaneously via a cached `predictor` object in `model`, created when `model()` is called in eval mode. A sketch of how this could look follows.
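Here is a hypothetical sketch of the non-breaking option (`MyGP` is an illustrative model, and `Predictor` refers to the sketch above; stashing the cache in `__dict__` is just one way to sidestep a recursive submodule registration):

```python
class MyGP(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )

    def __call__(self, *args, **kwargs):
        if not self.training:
            # Lazily build and reuse a cached predictor, so the familiar
            # `model(x)` call keeps working in eval mode.
            if self.__dict__.get("_predictor") is None:
                # Stored in __dict__ directly: registering the predictor as a
                # submodule would create a module cycle (the predictor holds
                # the model), which e.g. state_dict() cannot handle.
                self.__dict__["_predictor"] = Predictor(self, self.likelihood)
            return self.__dict__["_predictor"](*args, **kwargs)
        self.__dict__["_predictor"] = None  # training invalidates the cache
        return super().__call__(*args, **kwargs)
```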
One drawback is the fact that `model` must always be called in eval mode from `predictor`, either via a shallow copy, by keeping `model` in eval mode permanently, or by temporarily switching modes on every call. Although a little bit hacky, I would prefer the shallow copy (i.e. something like `predictor.model.load_state_dict(model.state_dict())`).
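A sketch of that preferred option (illustrative; note that a literal `copy.copy` would share submodules with the original, so calling `eval()` on the copy would leak back into the trained model, which is why this sketch uses a deep copy plus a state-dict sync):

```python
import copy

# The predictor keeps its own model instance, permanently in eval mode,
# and syncs the learned state from the trained model when needed, instead
# of toggling train()/eval() on the caller's model for every prediction.
eval_model = copy.deepcopy(model)
eval_model.load_state_dict(model.state_dict())  # re-sync after further training
eval_model.eval()  # stays in eval mode; the trained model is left untouched
```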
In the end, this design decision is likely a matter of taste.