From 48771469e8dabb5913019a5225bb1e44f142d61e Mon Sep 17 00:00:00 2001
From: enjeeneer
Date: Tue, 3 Dec 2024 20:55:12 +0000
Subject: [PATCH] spelling

---
 projects/zero-shot-rl/index.html | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/projects/zero-shot-rl/index.html b/projects/zero-shot-rl/index.html
index 8d7c24b..6c52bea 100644
--- a/projects/zero-shot-rl/index.html
+++ b/projects/zero-shot-rl/index.html
@@ -64,7 +64,7 @@
 [Paper] [Code] [Poster] [Slides] Summary Zero-shot reinforcement learning (RL) methods learn general policies that can, in principle, solve any unseen task in an environment. Recently, methods leveraging successor features and successor measures have emerged as viable zero-shot RL candidates, returning near-optimal policies for many unseen tasks. However, to enable this, they have assumed access to unrealistically large and heterogeneous datasets of transitions for pre-training.">
-
+
@@ -219,7 +219,7 @@

 Intuition
 
 Figure 2: Conservative zero-shot RL methods. VC-FB (right) suppresses the predicted values for OOD state-action pairs.
 
 Results
 
-We demonstrate our methods improve performance w.r.t standard zero-shot RL / GCRL baselines on low quality datasets from ExORL (Figure 4) and D4RL (Figure 5) .
+We demonstrate our methods improve performance w.r.t. standard zero-shot RL / GCRL baselines on low-quality datasets from ExORL (Figure 3) and D4RL (Figure 4), and do not hinder performance on large and diverse datasets (Figure 5).