diff --git a/CONTRIBUTORS.md b/CONTRIBUTORS.md index 723744e2cc..fffb738550 100644 --- a/CONTRIBUTORS.md +++ b/CONTRIBUTORS.md @@ -66,6 +66,7 @@ Guidelines for modifications: * Michael Gussert * Michael Noseworthy * Muhong Guo +* Nicola Loi * Nuralem Abizov * Oyindamola Omotuyi * Özhan Özen diff --git a/VERSION b/VERSION index f0bb29e763..88c5fb891d 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -1.3.0 +1.4.0 diff --git a/docs/source/setup/installation/binaries_installation.rst b/docs/source/setup/installation/binaries_installation.rst index 73138ebd17..99201191cf 100644 --- a/docs/source/setup/installation/binaries_installation.rst +++ b/docs/source/setup/installation/binaries_installation.rst @@ -1,12 +1,9 @@ -.. _isaacsim-binaries-installation: - +.. _isaaclab-binaries-installation: Installation using Isaac Sim Binaries ===================================== -.. note:: - - If you use Conda, we recommend using `Miniconda `_. +Issac Lab requires Isaac Sim. Install Isaac Sim first, then Isaac Lab. Installing Isaac Sim -------------------- @@ -269,6 +266,10 @@ Setting up the conda environment (optional) .. attention:: This step is optional. If you are using the bundled python with Isaac Sim, you can skip this step. +.. note:: + + If you use Conda, we recommend using `Miniconda `_. + The executable ``isaaclab.sh`` automatically fetches the python bundled with Isaac Sim, using ``./isaaclab.sh -p`` command (unless inside a virtual environment). This executable behaves like a python executable, and can be used to run any python script or diff --git a/docs/source/setup/installation/pip_installation.rst b/docs/source/setup/installation/pip_installation.rst index 83a74194ce..50fa2021a2 100644 --- a/docs/source/setup/installation/pip_installation.rst +++ b/docs/source/setup/installation/pip_installation.rst @@ -1,11 +1,9 @@ -.. _isaacsim-pip-installation: +.. _isaaclab-pip-installation: -Installation using Isaac Sim pip +Installation using pip ================================ -.. note:: - - If you use Conda, we recommend using `Miniconda `_. +Issac Lab requires Isaac Sim. Install Isaac Sim first, then Isaac Lab. Installing Isaac Sim -------------------- @@ -21,11 +19,15 @@ compatibility issues with some Linux distributions. If you encounter any issues, This may pose compatibility issues with some Linux distributions. For instance, Ubuntu 20.04 LTS has GLIBC 2.31 by default. If you encounter compatibility issues, we recommend following the - :ref:`Isaac Sim Binaries Installation ` approach. + :ref:`Isaac Sim Binaries Installation ` approach. .. attention:: - On Windows with CUDA 12, the GPU driver version 552.86 is required. + On Windows with CUDA 12, the GPU driver version 552.86 is required. + +.. note:: + + If you use Conda, we recommend using `Miniconda `_. - To use the pip installation approach for Isaac Sim, we recommend first creating a virtual environment. Ensure that the python version of the virtual environment is **Python 3.10**. @@ -88,19 +90,21 @@ compatibility issues with some Linux distributions. If you encounter any issues, pip install --upgrade pip -- Then, install the Isaac Sim packages +- Then, install the Isaac Sim packages. There are 2 options: A complete installation, or a minimal installation for running Isaac Lab only. - .. code-block:: bash + - Complete installation: - pip install isaacsim==4.2.0.2 isaacsim-extscache-physics==4.2.0.2 isaacsim-extscache-kit==4.2.0.2 isaacsim-extscache-kit-sdk==4.2.0.2 --extra-index-url https://pypi.nvidia.com + .. 
code-block:: bash + pip install isaacsim==4.2.0.2 isaacsim-extscache-physics==4.2.0.2 isaacsim-extscache-kit==4.2.0.2 isaacsim-extscache-kit-sdk==4.2.0.2 --extra-index-url https://pypi.nvidia.com -- To install a minimal set of packages for running Isaac Lab only, the following command can be used. Note that you cannot run ``isaacsim`` with this. + - Minimal set of packages for running Isaac Lab only: - .. code-block:: bash + .. code-block:: bash - pip install isaacsim-rl isaacsim-replicator isaacsim-extscache-physics isaacsim-extscache-kit-sdk isaacsim-extscache-kit isaacsim-app --extra-index-url https://pypi.nvidia.com + pip install isaacsim-rl isaacsim-replicator isaacsim-extscache-physics isaacsim-extscache-kit-sdk isaacsim-extscache-kit isaacsim-app --extra-index-url https://pypi.nvidia.com + Note that you cannot run ``isaacsim`` with this. Verifying the Isaac Sim installation ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -131,7 +135,9 @@ Verifying the Isaac Sim installation This process can take upwards of 10 minutes and is required on the first run of each experience file. Once the extensions are pulled, consecutive runs using the same experience file will use the cached extensions. - In addition, the first run will prompt users to accept the Nvidia Omniverse License Agreement. +.. attention:: + + The first run will prompt users to accept the Nvidia Omniverse License Agreement. To accept the EULA, reply ``Yes`` when prompted with the below message: .. code:: bash diff --git a/source/apps/isaaclab.python.headless.kit b/source/apps/isaaclab.python.headless.kit index 9d13464ecf..c5f2b428fb 100644 --- a/source/apps/isaaclab.python.headless.kit +++ b/source/apps/isaaclab.python.headless.kit @@ -5,7 +5,7 @@ [package] title = "Isaac Lab Python Headless" description = "An app for running Isaac Lab headlessly" -version = "1.3.0" +version = "1.4.0" # That makes it browsable in UI with "experience" filter keywords = ["experience", "app", "isaaclab", "python", "headless"] diff --git a/source/apps/isaaclab.python.headless.rendering.kit b/source/apps/isaaclab.python.headless.rendering.kit index cd39d0a797..f85fdd262d 100644 --- a/source/apps/isaaclab.python.headless.rendering.kit +++ b/source/apps/isaaclab.python.headless.rendering.kit @@ -9,7 +9,7 @@ [package] title = "Isaac Lab Python Headless Camera" description = "An app for running Isaac Lab headlessly with rendering enabled" -version = "1.3.0" +version = "1.4.0" # That makes it browsable in UI with "experience" filter keywords = ["experience", "app", "isaaclab", "python", "camera", "minimal"] diff --git a/source/apps/isaaclab.python.kit b/source/apps/isaaclab.python.kit index 7ea6fc1a05..1d1d923a30 100644 --- a/source/apps/isaaclab.python.kit +++ b/source/apps/isaaclab.python.kit @@ -5,7 +5,7 @@ [package] title = "Isaac Lab Python" description = "An app for running Isaac Lab" -version = "1.3.0" +version = "1.4.0" # That makes it browsable in UI with "experience" filter keywords = ["experience", "app", "usd"] diff --git a/source/apps/isaaclab.python.rendering.kit b/source/apps/isaaclab.python.rendering.kit index 38a48b7460..4a7f8650a7 100644 --- a/source/apps/isaaclab.python.rendering.kit +++ b/source/apps/isaaclab.python.rendering.kit @@ -9,7 +9,7 @@ [package] title = "Isaac Lab Python Camera" description = "An app for running Isaac Lab with rendering enabled" -version = "1.3.0" +version = "1.4.0" # That makes it browsable in UI with "experience" filter keywords = ["experience", "app", "isaaclab", "python", "camera", "minimal"] diff --git 
a/source/extensions/omni.isaac.lab/docs/CHANGELOG.rst b/source/extensions/omni.isaac.lab/docs/CHANGELOG.rst index 3ec4432182..587cda4ce5 100644 --- a/source/extensions/omni.isaac.lab/docs/CHANGELOG.rst +++ b/source/extensions/omni.isaac.lab/docs/CHANGELOG.rst @@ -8,7 +8,16 @@ Fixed ^^^^^ * fixed docstring in articulation data :class:`omni.isaac.lab.assets.ArticulationData`. - In body properties sections, the second dimension should be num_bodies but was documented as 1 . + In body properties sections, the second dimension should be num_bodies but was documented as 1. + + +0.30.3 (2025-01-02) +~~~~~~~~~~~~~~~~~~~ + +Added +^^^^^ + +* Added body tracking as an origin type to :class:`omni.isaac.lab.envs.ViewerCfg` and :class:`omni.isaac.lab.envs.ui.ViewportCameraController`. 0.30.2 (2024-12-22) diff --git a/source/extensions/omni.isaac.lab/omni/isaac/lab/actuators/actuator_cfg.py b/source/extensions/omni.isaac.lab/omni/isaac/lab/actuators/actuator_cfg.py index a97cac2035..9c44d6e6da 100644 --- a/source/extensions/omni.isaac.lab/omni/isaac/lab/actuators/actuator_cfg.py +++ b/source/extensions/omni.isaac.lab/omni/isaac/lab/actuators/actuator_cfg.py @@ -3,7 +3,6 @@ # # SPDX-License-Identifier: BSD-3-Clause -import torch from collections.abc import Iterable from dataclasses import MISSING from typing import Literal @@ -181,7 +180,7 @@ class RemotizedPDActuatorCfg(DelayedPDActuatorCfg): class_type: type = actuator_pd.RemotizedPDActuator - joint_parameter_lookup: torch.Tensor = MISSING + joint_parameter_lookup: list[list[float]] = MISSING """Joint parameter lookup table. Shape is (num_lookup_points, 3). This tensor describes the relationship between the joint angle (rad), the transmission ratio (in/out), diff --git a/source/extensions/omni.isaac.lab/omni/isaac/lab/actuators/actuator_pd.py b/source/extensions/omni.isaac.lab/omni/isaac/lab/actuators/actuator_pd.py index e47e70ad34..08c5005758 100644 --- a/source/extensions/omni.isaac.lab/omni/isaac/lab/actuators/actuator_pd.py +++ b/source/extensions/omni.isaac.lab/omni/isaac/lab/actuators/actuator_pd.py @@ -324,7 +324,7 @@ def __init__( super().__init__( cfg, joint_names, joint_ids, num_envs, device, stiffness, damping, armature, friction, torch.inf, torch.inf ) - self._joint_parameter_lookup = cfg.joint_parameter_lookup.to(device=device) + self._joint_parameter_lookup = torch.tensor(cfg.joint_parameter_lookup, device=device) # define remotized joint torque limit self._torque_limit = LinearInterpolation(self.angle_samples, self.max_torque_samples, device=device) diff --git a/source/extensions/omni.isaac.lab/omni/isaac/lab/assets/rigid_object/rigid_object_data.py b/source/extensions/omni.isaac.lab/omni/isaac/lab/assets/rigid_object/rigid_object_data.py index 6e799794fd..fef13fbb1c 100644 --- a/source/extensions/omni.isaac.lab/omni/isaac/lab/assets/rigid_object/rigid_object_data.py +++ b/source/extensions/omni.isaac.lab/omni/isaac/lab/assets/rigid_object/rigid_object_data.py @@ -202,7 +202,7 @@ def body_state_w(self): @property def body_link_state_w(self): """State of all bodies `[pos, quat, lin_vel, ang_vel]` in simulation world frame. - Shape is (num_instances,1, 13). + Shape is (num_instances, 1, 13). The position, quaternion, and linear/angular velocity are of the body's link frame relative to the world. 
""" diff --git a/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/common.py b/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/common.py index 87847b2648..d2cc602a4a 100644 --- a/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/common.py +++ b/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/common.py @@ -36,7 +36,7 @@ class ViewerCfg: Default is (1280, 720). """ - origin_type: Literal["world", "env", "asset_root"] = "world" + origin_type: Literal["world", "env", "asset_root", "asset_body"] = "world" """The frame in which the camera position (eye) and target (lookat) are defined in. Default is "world". Available options are: @@ -44,6 +44,7 @@ class ViewerCfg: * ``"world"``: The origin of the world. * ``"env"``: The origin of the environment defined by :attr:`env_index`. * ``"asset_root"``: The center of the asset defined by :attr:`asset_name` in environment :attr:`env_index`. + * ``"asset_body"``: The center of the body defined by :attr:`body_name` in asset defined by :attr:`asset_name` in environment :attr:`env_index`. """ env_index: int = 0 @@ -58,6 +59,12 @@ class ViewerCfg: This quantity is only effective if :attr:`origin` is set to "asset_root". """ + body_name: str | None = None + """The name of the body in :attr:`asset_name` in the interactive scene for the frame origin. Default is None. + + This quantity is only effective if :attr:`origin` is set to "asset_body". + """ + ## # Types. diff --git a/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/ui/viewport_camera_controller.py b/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/ui/viewport_camera_controller.py index d0e9e94852..cd97db231d 100644 --- a/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/ui/viewport_camera_controller.py +++ b/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/ui/viewport_camera_controller.py @@ -15,6 +15,8 @@ import omni.kit.app import omni.timeline +from omni.isaac.lab.assets.articulation.articulation import Articulation + if TYPE_CHECKING: from omni.isaac.lab.envs import DirectRLEnv, ManagerBasedEnv, ViewerCfg @@ -59,12 +61,15 @@ def __init__(self, env: ManagerBasedEnv | DirectRLEnv, cfg: ViewerCfg): self.set_view_env_index(self.cfg.env_index) # set the camera origin to the center of the environment self.update_view_to_env() - elif self.cfg.origin_type == "asset_root": + elif self.cfg.origin_type == "asset_root" or self.cfg.origin_type == "asset_body": # note: we do not yet update camera for tracking an asset origin, as the asset may not yet be # in the scene when this is called. Instead, we subscribe to the post update event to update the camera # at each rendering step. if self.cfg.asset_name is None: raise ValueError(f"No asset name provided for viewer with origin type: '{self.cfg.origin_type}'.") + if self.cfg.origin_type == "asset_body": + if self.cfg.body_name is None: + raise ValueError(f"No body name provided for viewer with origin type: '{self.cfg.origin_type}'.") else: # set the camera origin to the center of the world self.update_view_to_world() @@ -160,6 +165,41 @@ def update_view_to_asset_root(self, asset_name: str): # update the camera view self.update_view_location() + def update_view_to_asset_body(self, asset_name: str, body_name: str): + """Updates the viewer's origin based upon the body of an asset in the scene. + + Args: + asset_name: The name of the asset in the scene. The name should match the name of the + asset in the scene. + body_name: The name of the body in the asset. 
+ + Raises: + ValueError: If the asset is not in the scene or the body is not valid. + """ + # check if the asset is in the scene + if self.cfg.asset_name != asset_name: + asset_entities = [*self._env.scene.rigid_objects.keys(), *self._env.scene.articulations.keys()] + if asset_name not in asset_entities: + raise ValueError(f"Asset '{asset_name}' is not in the scene. Available entities: {asset_entities}.") + # check if the body is in the asset + asset: Articulation = self._env.scene[asset_name] + if body_name not in asset.body_names: + raise ValueError( + f"'{body_name}' is not a body of Asset '{asset_name}'. Available bodies: {asset.body_names}." + ) + # get the body index + body_id, _ = asset.find_bodies(body_name) + # update the asset name + self.cfg.asset_name = asset_name + # set origin type to asset_body + self.cfg.origin_type = "asset_body" + # update the camera origins + self.viewer_origin = ( + self._env.scene[self.cfg.asset_name].data.body_link_pos_w[self.cfg.env_index, body_id].view(3) + ) + # update the camera view + self.update_view_location() + def update_view_location(self, eye: Sequence[float] | None = None, lookat: Sequence[float] | None = None): """Updates the camera view pose based on the current viewer origin and the eye and lookat positions. @@ -190,3 +230,5 @@ def _update_tracking_callback(self, event): # in other cases, the camera view is static and does not need to be updated continuously if self.cfg.origin_type == "asset_root" and self.cfg.asset_name is not None: self.update_view_to_asset_root(self.cfg.asset_name) + if self.cfg.origin_type == "asset_body" and self.cfg.asset_name is not None and self.cfg.body_name is not None: + self.update_view_to_asset_body(self.cfg.asset_name, self.cfg.body_name) diff --git a/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/utils/marl.py b/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/utils/marl.py index 5149bc2066..3fac404a86 100644 --- a/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/utils/marl.py +++ b/source/extensions/omni.isaac.lab/omni/isaac/lab/envs/utils/marl.py @@ -58,6 +58,7 @@ def __init__(self, env: DirectMARLEnv) -> None: self.cfg = self.env.cfg self.sim = self.env.sim self.scene = self.env.scene + self.render_mode = self.env.render_mode self.single_observation_space = gym.spaces.Dict() if self._state_as_observation: @@ -126,7 +127,7 @@ def step(self, action: torch.Tensor) -> VecEnvStepReturn: return obs, rewards, terminated, time_outs, extras def render(self, recompute: bool = False) -> np.ndarray | None: - self.env.render(recompute) + return self.env.render(recompute) def close(self) -> None: self.env.close() diff --git a/source/extensions/omni.isaac.lab_assets/omni/isaac/lab_assets/spot.py b/source/extensions/omni.isaac.lab_assets/omni/isaac/lab_assets/spot.py index 8aadd815ae..ff7830ae4f 100644 --- a/source/extensions/omni.isaac.lab_assets/omni/isaac/lab_assets/spot.py +++ b/source/extensions/omni.isaac.lab_assets/omni/isaac/lab_assets/spot.py @@ -11,15 +11,13 @@ * :obj:`SPOT_CFG`: The Spot robot with delay PD and remote PD actuators. """ -import torch - import omni.isaac.lab.sim as sim_utils from omni.isaac.lab.actuators import DelayedPDActuatorCfg, RemotizedPDActuatorCfg from omni.isaac.lab.assets.articulation import ArticulationCfg from omni.isaac.lab.utils.assets import ISAAC_NUCLEUS_DIR # Note: This data was collected by the Boston Dynamics AI Institute. 
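A minimal sketch of how an actuator config can now carry the lookup table as a plain nested list instead of a ``torch.Tensor`` (the values below are illustrative, not the Spot data; field names other than ``joint_parameter_lookup`` come from the base PD actuator config and are placeholders). The actuator converts the list to a tensor on the target device at construction time:

.. code-block:: python

    from omni.isaac.lab.actuators import RemotizedPDActuatorCfg

    # each row is (joint angle [rad], transmission ratio [in/out], max output torque [N*m])
    knee_actuator_cfg = RemotizedPDActuatorCfg(
        joint_names_expr=[".*_kn"],  # placeholder joint-name pattern
        joint_parameter_lookup=[
            [-2.79, -24.78, 37.17],
            [-1.50, -30.00, 45.00],
            [-0.25, -20.40, 30.60],
        ],
        stiffness=60.0,
        damping=1.5,
    )
    # inside RemotizedPDActuator.__init__ the list is moved to the simulation device:
    #   self._joint_parameter_lookup = torch.tensor(cfg.joint_parameter_lookup, device=device)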
-joint_parameter_lookup = torch.tensor([ +joint_parameter_lookup = [ [-2.792900, -24.776718, 37.165077], [-2.767442, -26.290108, 39.435162], [-2.741984, -27.793369, 41.690054], @@ -121,7 +119,7 @@ [-0.298016, -24.632576, 36.948864], [-0.272558, -22.528547, 33.792821], [-0.247100, -20.401667, 30.602500], -]) +] """The lookup table for the knee joint parameters of the Boston Dynamics Spot robot. This table describes the relationship between the joint angle (rad), the transmission ratio (in/out), diff --git a/source/extensions/omni.isaac.lab_tasks/config/extension.toml b/source/extensions/omni.isaac.lab_tasks/config/extension.toml index 7f8033187f..e841f953a7 100644 --- a/source/extensions/omni.isaac.lab_tasks/config/extension.toml +++ b/source/extensions/omni.isaac.lab_tasks/config/extension.toml @@ -1,7 +1,7 @@ [package] # Note: Semantic Versioning is used: https://semver.org/ -version = "0.10.17" +version = "0.10.18" # Description title = "Isaac Lab Environments" diff --git a/source/extensions/omni.isaac.lab_tasks/docs/CHANGELOG.rst b/source/extensions/omni.isaac.lab_tasks/docs/CHANGELOG.rst index 53f2b01d2b..1b5d49f006 100644 --- a/source/extensions/omni.isaac.lab_tasks/docs/CHANGELOG.rst +++ b/source/extensions/omni.isaac.lab_tasks/docs/CHANGELOG.rst @@ -1,6 +1,15 @@ Changelog --------- +0.10.18 (2025-01-03) +~~~~~~~~~~~~~~~~~~~ + +Fixed +^^^^^ + +* Fixed the reset of the actions in the function overriding of the low level observations of :class:`omni.isaac.lab_tasks.manager_based.navigation.mdp.PreTrainedPolicyAction`. + + 0.10.17 (2024-12-17) ~~~~~~~~~~~~~~~~~~~~ diff --git a/source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/direct/cart_double_pendulum/agents/skrl_ippo_cfg.yaml b/source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/direct/cart_double_pendulum/agents/skrl_ippo_cfg.yaml index b795d9d081..bc0c518217 100644 --- a/source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/direct/cart_double_pendulum/agents/skrl_ippo_cfg.yaml +++ b/source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/direct/cart_double_pendulum/agents/skrl_ippo_cfg.yaml @@ -76,5 +76,5 @@ agent: # https://skrl.readthedocs.io/en/latest/api/trainers/sequential.html trainer: class: SequentialTrainer - timesteps: 1600 + timesteps: 4800 environment_info: log diff --git a/source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/direct/cart_double_pendulum/agents/skrl_mappo_cfg.yaml b/source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/direct/cart_double_pendulum/agents/skrl_mappo_cfg.yaml index fc2f07de55..dcd794f57a 100644 --- a/source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/direct/cart_double_pendulum/agents/skrl_mappo_cfg.yaml +++ b/source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/direct/cart_double_pendulum/agents/skrl_mappo_cfg.yaml @@ -78,5 +78,5 @@ agent: # https://skrl.readthedocs.io/en/latest/api/trainers/sequential.html trainer: class: SequentialTrainer - timesteps: 1600 + timesteps: 4800 environment_info: log diff --git a/source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/direct/cart_double_pendulum/agents/skrl_ppo_cfg.yaml b/source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/direct/cart_double_pendulum/agents/skrl_ppo_cfg.yaml index 160ebcde60..7c1fd452d7 100644 --- a/source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/direct/cart_double_pendulum/agents/skrl_ppo_cfg.yaml +++ b/source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/direct/cart_double_pendulum/agents/skrl_ppo_cfg.yaml @@ -76,5 +76,5 @@ agent: # 
https://skrl.readthedocs.io/en/latest/api/trainers/sequential.html trainer: class: SequentialTrainer - timesteps: 1600 + timesteps: 4800 environment_info: log diff --git a/source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/manager_based/navigation/mdp/pre_trained_policy_action.py b/source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/manager_based/navigation/mdp/pre_trained_policy_action.py index 860657112f..bf6dea3fb5 100644 --- a/source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/manager_based/navigation/mdp/pre_trained_policy_action.py +++ b/source/extensions/omni.isaac.lab_tasks/omni/isaac/lab_tasks/manager_based/navigation/mdp/pre_trained_policy_action.py @@ -50,8 +50,14 @@ def __init__(self, cfg: PreTrainedPolicyActionCfg, env: ManagerBasedRLEnv) -> No self._low_level_action_term: ActionTerm = cfg.low_level_actions.class_type(cfg.low_level_actions, env) self.low_level_actions = torch.zeros(self.num_envs, self._low_level_action_term.action_dim, device=self.device) + def last_action(): + # reset the low level actions if the episode was reset + if hasattr(env, "episode_length_buf"): + self.low_level_actions[env.episode_length_buf == 0, :] = 0 + return self.low_level_actions + # remap some of the low level observations to internal observations - cfg.low_level_observations.actions.func = lambda dummy_env: self.low_level_actions + cfg.low_level_observations.actions.func = lambda dummy_env: last_action() cfg.low_level_observations.actions.params = dict() cfg.low_level_observations.velocity_commands.func = lambda dummy_env: self._raw_actions cfg.low_level_observations.velocity_commands.params = dict() diff --git a/source/standalone/workflows/rl_games/play.py b/source/standalone/workflows/rl_games/play.py index 06206cb4ba..60e2a23846 100644 --- a/source/standalone/workflows/rl_games/play.py +++ b/source/standalone/workflows/rl_games/play.py @@ -94,6 +94,11 @@ def main(): # create isaac environment env = gym.make(args_cli.task, cfg=env_cfg, render_mode="rgb_array" if args_cli.video else None) + + # convert to single-agent instance if required by the RL algorithm + if isinstance(env.unwrapped, DirectMARLEnv): + env = multi_agent_to_single_agent(env) + # wrap for video recording if args_cli.video: video_kwargs = { @@ -106,10 +111,6 @@ def main(): print_dict(video_kwargs, nesting=4) env = gym.wrappers.RecordVideo(env, **video_kwargs) - # convert to single-agent instance if required by the RL algorithm - if isinstance(env.unwrapped, DirectMARLEnv): - env = multi_agent_to_single_agent(env) - # wrap around environment for rl-games env = RlGamesVecEnvWrapper(env, rl_device, clip_obs, clip_actions) diff --git a/source/standalone/workflows/rl_games/train.py b/source/standalone/workflows/rl_games/train.py index f098427e17..5e03a25635 100644 --- a/source/standalone/workflows/rl_games/train.py +++ b/source/standalone/workflows/rl_games/train.py @@ -129,6 +129,11 @@ def main(env_cfg: ManagerBasedRLEnvCfg | DirectRLEnvCfg | DirectMARLEnvCfg, agen # create isaac environment env = gym.make(args_cli.task, cfg=env_cfg, render_mode="rgb_array" if args_cli.video else None) + + # convert to single-agent instance if required by the RL algorithm + if isinstance(env.unwrapped, DirectMARLEnv): + env = multi_agent_to_single_agent(env) + # wrap for video recording if args_cli.video: video_kwargs = { @@ -141,10 +146,6 @@ def main(env_cfg: ManagerBasedRLEnvCfg | DirectRLEnvCfg | DirectMARLEnvCfg, agen print_dict(video_kwargs, nesting=4) env = gym.wrappers.RecordVideo(env, **video_kwargs) - # 
convert to single-agent instance if required by the RL algorithm - if isinstance(env.unwrapped, DirectMARLEnv): - env = multi_agent_to_single_agent(env) - # wrap around environment for rl-games env = RlGamesVecEnvWrapper(env, rl_device, clip_obs, clip_actions) diff --git a/source/standalone/workflows/rsl_rl/play.py b/source/standalone/workflows/rsl_rl/play.py index 25331b85d9..4983b86063 100644 --- a/source/standalone/workflows/rsl_rl/play.py +++ b/source/standalone/workflows/rsl_rl/play.py @@ -74,6 +74,11 @@ def main(): # create isaac environment env = gym.make(args_cli.task, cfg=env_cfg, render_mode="rgb_array" if args_cli.video else None) + + # convert to single-agent instance if required by the RL algorithm + if isinstance(env.unwrapped, DirectMARLEnv): + env = multi_agent_to_single_agent(env) + # wrap for video recording if args_cli.video: video_kwargs = { @@ -86,10 +91,6 @@ def main(): print_dict(video_kwargs, nesting=4) env = gym.wrappers.RecordVideo(env, **video_kwargs) - # convert to single-agent instance if required by the RL algorithm - if isinstance(env.unwrapped, DirectMARLEnv): - env = multi_agent_to_single_agent(env) - # wrap around environment for rsl-rl env = RslRlVecEnvWrapper(env) diff --git a/source/standalone/workflows/rsl_rl/train.py b/source/standalone/workflows/rsl_rl/train.py index 71773fd4e0..293fc6867b 100644 --- a/source/standalone/workflows/rsl_rl/train.py +++ b/source/standalone/workflows/rsl_rl/train.py @@ -100,6 +100,10 @@ def main(env_cfg: ManagerBasedRLEnvCfg | DirectRLEnvCfg | DirectMARLEnvCfg, agen # create isaac environment env = gym.make(args_cli.task, cfg=env_cfg, render_mode="rgb_array" if args_cli.video else None) + # convert to single-agent instance if required by the RL algorithm + if isinstance(env.unwrapped, DirectMARLEnv): + env = multi_agent_to_single_agent(env) + # save resume path before creating a new log_dir if agent_cfg.resume: resume_path = get_checkpoint_path(log_root_path, agent_cfg.load_run, agent_cfg.load_checkpoint) @@ -116,10 +120,6 @@ def main(env_cfg: ManagerBasedRLEnvCfg | DirectRLEnvCfg | DirectMARLEnvCfg, agen print_dict(video_kwargs, nesting=4) env = gym.wrappers.RecordVideo(env, **video_kwargs) - # convert to single-agent instance if required by the RL algorithm - if isinstance(env.unwrapped, DirectMARLEnv): - env = multi_agent_to_single_agent(env) - # wrap around environment for rsl-rl env = RslRlVecEnvWrapper(env) diff --git a/source/standalone/workflows/sb3/play.py b/source/standalone/workflows/sb3/play.py index 10a228d5f7..b33dd8d76d 100644 --- a/source/standalone/workflows/sb3/play.py +++ b/source/standalone/workflows/sb3/play.py @@ -48,6 +48,7 @@ from stable_baselines3 import PPO from stable_baselines3.common.vec_env import VecNormalize +from omni.isaac.lab.envs import DirectMARLEnv, multi_agent_to_single_agent from omni.isaac.lab.utils.dict import print_dict import omni.isaac.lab_tasks # noqa: F401 @@ -82,6 +83,11 @@ def main(): # create isaac environment env = gym.make(args_cli.task, cfg=env_cfg, render_mode="rgb_array" if args_cli.video else None) + + # convert to single-agent instance if required by the RL algorithm + if isinstance(env.unwrapped, DirectMARLEnv): + env = multi_agent_to_single_agent(env) + # wrap for video recording if args_cli.video: video_kwargs = { diff --git a/source/standalone/workflows/sb3/train.py b/source/standalone/workflows/sb3/train.py index 641b9de7af..b5e01855b9 100644 --- a/source/standalone/workflows/sb3/train.py +++ b/source/standalone/workflows/sb3/train.py @@ -104,6 +104,11 @@ def 
main(env_cfg: ManagerBasedRLEnvCfg | DirectRLEnvCfg | DirectMARLEnvCfg, agen # create isaac environment env = gym.make(args_cli.task, cfg=env_cfg, render_mode="rgb_array" if args_cli.video else None) + + # convert to single-agent instance if required by the RL algorithm + if isinstance(env.unwrapped, DirectMARLEnv): + env = multi_agent_to_single_agent(env) + # wrap for video recording if args_cli.video: video_kwargs = { @@ -116,10 +121,6 @@ def main(env_cfg: ManagerBasedRLEnvCfg | DirectRLEnvCfg | DirectMARLEnvCfg, agen print_dict(video_kwargs, nesting=4) env = gym.wrappers.RecordVideo(env, **video_kwargs) - # convert to single-agent instance if required by the RL algorithm - if isinstance(env.unwrapped, DirectMARLEnv): - env = multi_agent_to_single_agent(env) - # wrap around environment for stable baselines env = Sb3VecEnvWrapper(env) diff --git a/source/standalone/workflows/skrl/play.py b/source/standalone/workflows/skrl/play.py index 0be969a763..912022b1ee 100644 --- a/source/standalone/workflows/skrl/play.py +++ b/source/standalone/workflows/skrl/play.py @@ -116,6 +116,11 @@ def main(): # create isaac environment env = gym.make(args_cli.task, cfg=env_cfg, render_mode="rgb_array" if args_cli.video else None) + + # convert to single-agent instance if required by the RL algorithm + if isinstance(env.unwrapped, DirectMARLEnv) and algorithm in ["ppo"]: + env = multi_agent_to_single_agent(env) + # wrap for video recording if args_cli.video: video_kwargs = { @@ -128,10 +133,6 @@ def main(): print_dict(video_kwargs, nesting=4) env = gym.wrappers.RecordVideo(env, **video_kwargs) - # convert to single-agent instance if required by the RL algorithm - if isinstance(env.unwrapped, DirectMARLEnv) and algorithm in ["ppo"]: - env = multi_agent_to_single_agent(env) - # wrap around environment for skrl env = SkrlVecEnvWrapper(env, ml_framework=args_cli.ml_framework) # same as: `wrap_env(env, wrapper="auto")` diff --git a/source/standalone/workflows/skrl/train.py b/source/standalone/workflows/skrl/train.py index f8ce5cd42a..90b7a0ed52 100644 --- a/source/standalone/workflows/skrl/train.py +++ b/source/standalone/workflows/skrl/train.py @@ -151,6 +151,11 @@ def main(env_cfg: ManagerBasedRLEnvCfg | DirectRLEnvCfg | DirectMARLEnvCfg, agen # create isaac environment env = gym.make(args_cli.task, cfg=env_cfg, render_mode="rgb_array" if args_cli.video else None) + + # convert to single-agent instance if required by the RL algorithm + if isinstance(env.unwrapped, DirectMARLEnv) and algorithm in ["ppo"]: + env = multi_agent_to_single_agent(env) + # wrap for video recording if args_cli.video: video_kwargs = { @@ -163,10 +168,6 @@ def main(env_cfg: ManagerBasedRLEnvCfg | DirectRLEnvCfg | DirectMARLEnvCfg, agen print_dict(video_kwargs, nesting=4) env = gym.wrappers.RecordVideo(env, **video_kwargs) - # convert to single-agent instance if required by the RL algorithm - if isinstance(env.unwrapped, DirectMARLEnv) and algorithm in ["ppo"]: - env = multi_agent_to_single_agent(env) - # wrap around environment for skrl env = SkrlVecEnvWrapper(env, ml_framework=args_cli.ml_framework) # same as: `wrap_env(env, wrapper="auto")`
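A minimal sketch of the reordering applied across the workflow scripts above: the multi-agent environment is converted to a single-agent instance before the video-recording wrapper, so ``gym.wrappers.RecordVideo`` receives a standard single-agent environment whose ``render_mode`` is set (the wrapper returned by ``multi_agent_to_single_agent`` now forwards ``render_mode`` and returns the rendered frame). The task name and video settings are placeholders:

.. code-block:: python

    import gymnasium as gym

    from omni.isaac.lab.envs import DirectMARLEnv, multi_agent_to_single_agent

    # env_cfg is the task configuration parsed earlier in the script
    env = gym.make("Isaac-Example-v0", cfg=env_cfg, render_mode="rgb_array")

    # convert to single-agent instance if required by the RL algorithm
    if isinstance(env.unwrapped, DirectMARLEnv):
        env = multi_agent_to_single_agent(env)

    # wrap for video recording; the wrapper now sees the converted environment
    env = gym.wrappers.RecordVideo(
        env, video_folder="videos", step_trigger=lambda step: step % 2000 == 0, video_length=200
    )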