Replies: 3 comments
-
I think that makes sense. It's somewhat related to […]. I guess in your suggested API the predictor would be more abstract and not tied to a particular […]. It seems like one potential concern here could be that the interface isn't very pytorchic - there wouldn't be a clear mapping to the […].
-
I also think that this is a reasonable idea. @Turakar if you want to take a stab at this, please feel free to put up a PR draft!
-
I am currently writing my MA thesis, and as such I am only taking on projects that contribute to its completion at the moment. After that, I am open to taking a look at this if nobody else claims it before me.
-
Recently, I wrote a custom `ExactGP` implementation for some special sparsity structure I exploited. It uses an alternative approach to the caching of the `ExactPredictionStrategy` that GPyTorch currently uses: for prediction, you first create a new predictor object, which can then be used like a normal module:
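A minimal sketch of what such a `Predictor` could look like (illustrative names, not the actual implementation; it assumes the usual `mean_module`/`covar_module` layout of an `ExactGP`, a `GaussianLikelihood`, and a GPyTorch version whose lazy kernel matrices expose `to_dense()`). It precomputes the Cholesky factor and representer weights once, so repeated predictions only reuse the cache:

```python
import torch
import gpytorch


class Predictor(torch.nn.Module):
    """Caches everything needed for exact GP prediction at construction time.

    Hypothetical sketch, not GPyTorch API: assumes `model.mean_module` and
    `model.covar_module` exist and `likelihood` is a GaussianLikelihood.
    """

    def __init__(self, model, likelihood):
        super().__init__()
        self.model = model
        self.likelihood = likelihood
        (train_x,) = model.train_inputs
        train_y = model.train_targets
        with torch.no_grad():
            mean = model.mean_module(train_x)
            covar = model.covar_module(train_x).to_dense()
            noise = likelihood.noise
            eye = torch.eye(train_x.size(0), dtype=covar.dtype, device=covar.device)
            # Factor K(X, X) + sigma^2 I once; this is the posterior cache.
            self.chol = torch.linalg.cholesky(covar + noise * eye)
            # Representer weights alpha = (K + sigma^2 I)^{-1} (y - m(X)).
            self.alpha = torch.cholesky_solve((train_y - mean).unsqueeze(-1), self.chol)
        self.train_x = train_x

    def forward(self, x):
        # Latent posterior at x, computed purely from the cached quantities;
        # apply `self.likelihood` to the result for observed-value predictions.
        with torch.no_grad():
            k_star = self.model.covar_module(x, self.train_x).to_dense()
            mean = self.model.mean_module(x) + (k_star @ self.alpha).squeeze(-1)
            v = torch.cholesky_solve(k_star.transpose(-1, -2), self.chol)
            covar = self.model.covar_module(x).to_dense() - k_star @ v
        return gpytorch.distributions.MultivariateNormal(mean, covar)


# Intended usage: build the cache once, explicitly, then predict cheaply.
# predictor = Predictor(model, likelihood)
# posterior = predictor(test_x)
```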
Here, `predictor` caches all the necessary values and keeps an internal reference to `model`, which makes GPU memory usage more predictable (related to #1619, #1787, #1859). This could also easily be extended to support #567. This way, one can decide when to create the posterior cache, what it should contain, and how long to keep it.

I personally like this API more, but this might be a matter of preference. I do, of course, understand that just calling `model(x)` for predictions might be more intuitive, but IMHO Gaussian processes are not as easy as that. Keeping track of posterior caches is very important for GPU memory usage and latency during prediction.

This is not necessarily a breaking change - it should be possible to support both APIs simultaneously via a cached `predictor` object in `model`, created when `model()` is called in eval mode. A sketch of how this could look follows.
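Here is a hypothetical sketch of the non-breaking option (`MyGP` is an illustrative model, and `Predictor` refers to the sketch above; stashing the cache in `__dict__` is just one way to sidestep a recursive submodule registration):

```python
class MyGP(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMean()
        self.covar_module = gpytorch.kernels.ScaleKernel(gpytorch.kernels.RBFKernel())

    def forward(self, x):
        return gpytorch.distributions.MultivariateNormal(
            self.mean_module(x), self.covar_module(x)
        )

    def __call__(self, *args, **kwargs):
        if not self.training:
            # Lazily build and reuse a cached predictor, so the familiar
            # `model(x)` call keeps working in eval mode.
            if self.__dict__.get("_predictor") is None:
                # Stored in __dict__ directly: registering the predictor as a
                # submodule would create a module cycle (the predictor holds
                # the model), which e.g. state_dict() cannot handle.
                self.__dict__["_predictor"] = Predictor(self, self.likelihood)
            return self.__dict__["_predictor"](*args, **kwargs)
        self.__dict__["_predictor"] = None  # training invalidates the cache
        return super().__call__(*args, **kwargs)
```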
One drawback is the fact that `model` must always be called in eval mode from `predictor`, either via a shallow copy, by keeping `model` in eval mode permanently, or by temporarily switching modes on every call. Although a little bit hacky, I would prefer the shallow copy (i.e. something like `predictor.model.load_state_dict(model.state_dict())`).
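A sketch of that preferred option (illustrative; note that a literal `copy.copy` would share submodules with the original, so calling `eval()` on the copy would leak back into the trained model, which is why this sketch uses a deep copy plus a state-dict sync):

```python
import copy

# The predictor keeps its own model instance, permanently in eval mode,
# and syncs the learned state from the trained model when needed, instead
# of toggling train()/eval() on the caller's model for every prediction.
eval_model = copy.deepcopy(model)
eval_model.load_state_dict(model.state_dict())  # re-sync after further training
eval_model.eval()  # stays in eval mode; the trained model is left untouched
```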
In the end, this design decision is likely a matter of taste.