Add MAT #1107
base: develop
Conversation
Couple small things
Co-authored-by: Sasha Abramowitz <reallysasha@gmail.com>
🤖 🤖 🤖 🤖
Some nits, else the rest is good to go 🛥️ 🥅
@@ -1,13 +1,13 @@
 # --- Anakin config ---

 # --- Training ---
-num_envs: 16 # Number of vectorised environments per device.
+num_envs: 64 # Number of vectorised environments per device.
Spamming just to flag these to revert before merging, if you're not aiming to update them 👀
Suggested change:
-num_envs: 64 # Number of vectorised environments per device.
+num_envs: 16 # Number of vectorised environments per device.
 num_evaluation: 200 # Number of evenly spaced evaluations to perform during training.
@@ -1,7 +1,7 @@
 # ---Environment Configs---
 defaults:
   - _self_
-  - scenario: tiny-2ag # [tiny-2ag, tiny-4ag, tiny-4ag-easy, small-4ag]
+  - scenario: small-4ag # [tiny-2ag, tiny-4ag, tiny-4ag-easy, small-4ag]
Spam: revert before merge
Suggested change:
-  - scenario: small-4ag # [tiny-2ag, tiny-4ag, tiny-4ag-easy, small-4ag]
+  - scenario: tiny-2ag # [tiny-2ag, tiny-4ag, tiny-4ag-easy, small-4ag]
add_agent_id: True

# --- RL hyperparameters ---
actor_lr: 0.0005 # Learning rate for actor network
Just raising this for the future: maybe we should change actor_lr to network_lr (or something else) for both MAT and Sable, since it's not really an actor network 💭
from flax import linen as nn
from flax.linen.initializers import orthogonal

# TODO: Use einops for all the reshapes and matrix multiplications
Is this still a TODO, or will it be ignored?
)
from mava.systems.mat.types import MATNetworkConfig
from mava.types import MavaObservation
from mava.utils.network_utils import _CONTINUOUS, _DISCRETE
NIT, suggested change:
-from mava.utils.network_utils import _CONTINUOUS, _DISCRETE
+from mava.utils.network_utils import CONTINUOUS, DISCRETE
Thanks Ruan for all the work on adding MAT! I left some comments, but most of them are nits and minor suggestions.
- arch: anakin
- system: mat/mat
- network: transformer
- env: rware
Suggested change:
-- env: rware
+- env: rware # [cleaner, connector, gigastep, lbf, mabrax, matrax, rware, smax]
    _run_system(system_path, cfg)


@pytest.mark.parametrize("env_name", discrete_envs)
def test_discrete_env(fast_config: dict, env_name: str) -> None:
    """Test all discrete envs on random systems."""
If you can, add MAT to the random choice here.
@@ -0,0 +1,6 @@
+# --- Network params ---
+n_block: 1 # Transformer blocks
+n_embd: 64 # Transformer embedding dimension
For the sake of unification, maybe we can rename this to embed_dim, similar to Sable.
class SwiGLU(nn.Module):
    ffn_dim: int
    embed_dim: int

    def setup(self) -> None:
        self.W_1 = self.param("W_1", nn.initializers.zeros, (self.embed_dim, self.ffn_dim))
        self.W_G = self.param("W_G", nn.initializers.zeros, (self.embed_dim, self.ffn_dim))
        self.W_2 = self.param("W_2", nn.initializers.zeros, (self.ffn_dim, self.embed_dim))

    def __call__(self, x: chex.Array) -> chex.Array:
        return (jax.nn.swish(x @ self.W_G) * (x @ self.W_1)) @ self.W_2
Suggested change:

class SwiGLU(nn.Module):
    """SwiGLU module for Sable's Network.

    Implements the SwiGLU feedforward neural network module, which is a variation
    of the standard feedforward layer using the Swish activation function combined
    with a Gated Linear Unit (GLU).
    """

    hidden_dim: int
    input_dim: int

    def setup(self) -> None:
        # Initialize the weights for the SwiGLU layer
        self.W_linear = self.param(
            "W_linear", nn.initializers.zeros, (self.input_dim, self.hidden_dim)
        )
        self.W_gate = self.param(
            "W_gate", nn.initializers.zeros, (self.input_dim, self.hidden_dim)
        )
        self.W_output = self.param(
            "W_output", nn.initializers.zeros, (self.hidden_dim, self.input_dim)
        )

    def __call__(self, x: chex.Array) -> chex.Array:
        """Applies the SwiGLU mechanism to the input tensor `x`."""
        # Apply Swish activation to the gated branch and multiply with the linear branch
        gated_output = jax.nn.swish(x @ self.W_gate) * (x @ self.W_linear)
        # Transform the result back to the input dimension
        return gated_output @ self.W_output
I added documentation and updated the variable naming in SwiGLU. I'm adding this suggestion here since the MAT PR will be merged first; also, putting it in torsos.py is better than in the outside utils folder (I will update the Sable PR based on that).
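For reference, both versions compute the same gated feedforward pass. A minimal NumPy sketch of that computation (the weight names follow the suggested renaming; the swish here is the plain x·sigmoid(x) that jax.nn.swish computes by default):

```python
import numpy as np

def swish(x):
    # swish(x) = x * sigmoid(x), matching jax.nn.swish with default beta.
    return x / (1.0 + np.exp(-x))

def swiglu_forward(x, W_gate, W_linear, W_output):
    # The swish-gated branch modulates the linear branch elementwise,
    # then the result is projected back down to the input dimension.
    return (swish(x @ W_gate) * (x @ W_linear)) @ W_output

rng = np.random.default_rng(0)
input_dim, hidden_dim = 4, 8
x = rng.normal(size=(2, input_dim))
W_gate = rng.normal(size=(input_dim, hidden_dim))
W_linear = rng.normal(size=(input_dim, hidden_dim))
W_output = rng.normal(size=(hidden_dim, input_dim))

out = swiglu_forward(x, W_gate, W_linear, W_output)
print(out.shape)  # the output projection returns to the input width: (2, 4)
```

Note the zero initialisation in the module above means the layer outputs zeros until training moves the weights; this sketch uses random weights only to show the shapes.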
def get_learner_fn(
    env: MarlEnv,
    apply_fns: Tuple[ActorApply, CriticApply],
If you can, create ExecutionApply and TrainApply types instead of using the actor-critic ones.
eval_keys = jax.random.split(key_e, n_devices)

def eval_act_fn(
If we can, follow the PPO systems here, where we call a maker function from the evaluator instead of creating the act function here.
# Evaluate.
eval_metrics = evaluator(trained_params, eval_keys, {})
jax.block_until_ready(eval_metrics)
Do we need these block_until_ready calls? We never added them to the other on-policy systems.
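For context on the question: JAX dispatches work asynchronously, so block_until_ready mainly matters when timing or synchronising around a computation rather than for correctness. A small illustrative sketch (not Mava code; the function and sizes are made up):

```python
import time
import jax
import jax.numpy as jnp

@jax.jit
def heavy(x):
    # Repeated matmuls as stand-in work; an identity input keeps values finite.
    for _ in range(10):
        x = x @ x
    return x

x = jnp.eye(64)
jax.block_until_ready(heavy(x))  # warm-up so compilation is excluded from timing

start = time.perf_counter()
y = heavy(x)  # returns almost immediately: dispatch is asynchronous
dispatched = time.perf_counter() - start

jax.block_until_ready(y)  # waits until the computation has actually finished
total = time.perf_counter() - start
print(dispatched <= total)  # True: blocking includes the real compute time
```

So the call after the evaluator is defensible if the surrounding code measures wall-clock time around evaluation; otherwise it could be dropped for consistency with the other on-policy systems.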
# Define network and optimiser.
actor_network = MultiAgentTransformer(
    obs_dim=init_x.agents_view.shape[-1],
We don't need obs_dim as an input.
raise ValueError("Invalid action space type")

# Define network and optimiser.
actor_network = MultiAgentTransformer(
Can we rename this to mat_network? (Very optional.)
What?
Adds the Multi-agent Transformer to Mava.

Extra
The transformer-based systems, or any other system with only one set of parameters, do not require us to make a Params NamedTuple with only one element. I had to update the checkpointer to no longer check whether the restored params are in a FrozenDict. All other systems can still checkpoint and reload with this change. We also no longer need the check since we pin to a version of Flax higher than 0.6.11.
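To illustrate the point about single-parameter systems (a hypothetical sketch; the type and field names here are illustrative, not Mava's actual types):

```python
from typing import NamedTuple

class Params(NamedTuple):
    # Actor-critic systems genuinely carry two parameter sets,
    # so a NamedTuple container is natural there.
    actor_params: dict
    critic_params: dict

# A transformer-based system has a single set of parameters. Wrapping it in a
# one-element NamedTuple adds indirection without information, so the
# checkpointer can save and restore the raw params pytree directly.
class WrappedParams(NamedTuple):
    params: dict

raw_params = {"encoder": {"w": [1.0, 2.0]}, "decoder": {"w": [3.0]}}
wrapped = WrappedParams(raw_params)

print(wrapped.params is raw_params)  # True: the wrapper holds the same pytree
```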