This repository has been archived by the owner on Oct 7, 2024. It is now read-only.

Tensorflow BOOT DQN agent loses performance after first iteration #46

Open
anyboby opened this issue Jan 5, 2023 · 0 comments

anyboby commented Jan 5, 2023

Hi,

I am observing baffling behavior from the default TensorFlow Boot DQN agent.
When running a sweep over multiple environments, the agent behaves as expected only in the first iteration. In later iterations it does not seem to explore. I have spent some time debugging but have not found the cause.

Code for reproduction (double-checked in a newly installed env):

import bsuite
from bsuite import sweep
from bsuite.baselines import experiment
from bsuite.baselines.tf import boot_dqn

sweep_name = "DEEP_SEA"
log_dir = "./logs/"
# First three bsuite ids of the sweep.
bsuite_ids = getattr(sweep, sweep_name)[:3]

for bsuite_id in bsuite_ids:
    env = bsuite.load_and_record(bsuite_id, save_path=log_dir, overwrite=True)
    # A fresh agent is created for every environment.
    agent = boot_dqn.default_agent(
        obs_spec=env.observation_spec(),
        action_spec=env.action_spec(),
    )

    experiment.run(agent, env, num_episodes=300)

Iterations 2 and 3 do not reach the end of the chain within 300 episodes, nor with much longer training horizons (see the Colab link below for results).
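One way to make "reaches the end of the chain" checkable per iteration is to look for the first episode with a positive return. The helper below is a minimal sketch, not part of bsuite: `first_success_episode` and its `returns` argument (a list of per-episode returns collected by whatever means is convenient) are hypothetical names, and it assumes that a Deep Sea episode has a positive return only when the end of the chain is reached.

```python
from typing import Optional, Sequence


def first_success_episode(
    returns: Sequence[float], threshold: float = 0.0
) -> Optional[int]:
    """Return the index of the first episode whose return exceeds `threshold`.

    Assumption: a return above `threshold` means the end of the chain was
    reached. Returns None if no episode qualifies.
    """
    for episode, episode_return in enumerate(returns):
        if episode_return > threshold:
            return episode
    return None
```

Under that assumption, a result of None for iterations 2 and 3 would correspond to the failure described above.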

In contrast, the JAX agent reliably produces the expected results in this loop (i.e., when <bsuite.baselines.tf> is replaced with <bsuite.baselines.jax>).
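To compare the two backends without editing the import by hand, the module can be selected by name. This is only a sketch: `boot_dqn_module_path` and `load_boot_dqn` are hypothetical helpers, and it assumes both modules expose a `default_agent` that accepts `obs_spec` and `action_spec` as in the loop above.

```python
import importlib

# Backends named in this report; anything else is rejected.
SUPPORTED_BACKENDS = ("tf", "jax")


def boot_dqn_module_path(backend: str) -> str:
    """Return the dotted module path of the boot_dqn baseline for `backend`."""
    if backend not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"unknown backend {backend!r}, expected one of {SUPPORTED_BACKENDS}"
        )
    return f"bsuite.baselines.{backend}.boot_dqn"


def load_boot_dqn(backend: str):
    """Import the boot_dqn module for `backend` (requires bsuite installed)."""
    return importlib.import_module(boot_dqn_module_path(backend))
```

With this, `boot_dqn = load_boot_dqn("jax")` would stand in for the import line, leaving the rest of the loop unchanged.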

The same behavior can be observed in Colab:
https://colab.research.google.com/drive/1hnJMDLG-aXCKKsjFqVd6YWGY4luz29ku?usp=sharing

best,
anyboby

@anyboby anyboby changed the title Tensorflow BOOT DQN agent loses performance after iterations Tensorflow BOOT DQN agent loses performance after first iteration Jan 5, 2023