Fix Determinism #492

araffin · 2019-09-29T16:07:03Z

closes #145
closes #285
Related #485

Important note: in order to fix determinism, I had to break the API of the learn() method: I moved the seed argument to the constructor.
I also had to separate the determinism test to avoid side effects from other tests.
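Editor's sketch of the API change described above. It is a toy illustration in plain Python, not the stable-baselines code: `ToyModel`, its `weights` attribute and the `run` helper are hypothetical names. It shows one plausible reason for taking the seed in the constructor: anything random that happens at construction (here, weight initialisation) is only reproducible if the seed is already known at that point.

```python
import random


class ToyModel:
    """Hypothetical stand-in for an RL model (not the stable-baselines API)."""

    def __init__(self, seed=None):
        # seed=None leaves the generator unseeded (system entropy).
        self.seed = seed
        self.rng = random.Random(seed)
        # Parameters are initialised at construction time, so the seed
        # must already be known here for the run to be reproducible.
        self.weights = [self.rng.gauss(0.0, 1.0) for _ in range(4)]

    def learn(self, total_timesteps):
        # No seed argument: training reuses the generator from __init__.
        for _ in range(total_timesteps):
            index = self.rng.randrange(len(self.weights))
            self.weights[index] += 0.01 * self.rng.gauss(0.0, 1.0)
        return self


def run(seed, steps=100):
    """Build and train a fresh toy model, returning its final weights."""
    return ToyModel(seed=seed).learn(steps).weights
```

Two calls to `run` with the same seed return identical weights, which is the property a determinism test would assert.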

Miffyli

The seed defaults to zero now. IMHO it should default to None, in which case we do not set the seed to anything fixed. This would match the previous behavior and the behavior of other deep learning frameworks.
self.seed should be stored in the save file. Looks like just adding it to the list of stored variables should do the trick.
The TD3 test is failing for me with the current version of test_0deterministic.py (non-deterministic results; CPU only, Ubuntu 18.04, Python 3.6; ran with pytest tests/test_0deterministic.py).
test_0deterministic.py should skip tests if a GPU is used (the GPU adds randomness that is not fixed yet; running the test on a GPU makes many algorithms fail).
Nitpick: What does 0deterministic stand for in test_0deterministic.py? Deterministic with seed zero? I find this a bit confusing; it could just be test_deterministic.py as before. Edit: Or is this to keep the "test separate from other tests"?

Sidenote: Other than the TD3, other algorithms pass the test (ran five separate runs, plus two with SEED= just in case)

(Sorry for the trailing whitespaces)
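Editor's sketch of two of the review points above: a seed that defaults to None (meaning "do not fix anything") and a seed that is stored alongside the saved parameters. The function names and the JSON layout are assumptions made for illustration; the actual stable-baselines save format is not shown in this thread.

```python
import json
import random


def make_rng(seed=None):
    """Return a generator; seed=None means 'do not fix the seed'."""
    return random.Random(seed)


def save_model(path, weights, seed):
    # Store the seed next to the parameters so a run can be traced back.
    with open(path, "w") as handle:
        json.dump({"seed": seed, "weights": weights}, handle)


def load_model(path):
    # Returns the parameters together with the seed that produced them.
    with open(path) as handle:
        data = json.load(handle)
    return data["weights"], data["seed"]
```

A None seed round-trips as JSON `null`, so a loaded model can still tell "unseeded" apart from "seeded with 0".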

araffin · 2019-10-05T10:35:56Z

in which case we do not set the seed to anything fixed.
Done.

self.seed should be stored in the save file. Looks like just adding it to the list of stored variables should do the trick.

Good point ;) (I had that thought afterward too)

TD3 test is failing for me with current version
Sidenote: Other than the TD3, other algorithms pass the test

I think we need to investigate more; TD3 has been the one giving me the most problems. Weirdly enough, it seems to be affected by previous execution of models... (running the test with TD3 alone gave different results from running TD3 after the DQN test)
The thing I don't really understand is that the TD3 code is based on SAC, but SAC does not fail... I would appreciate some help ;)

The tests pass on travis... What is your tf version? Did you try with the docker image?

should skip tests if GPU is used

Do you have a code snippet for checking that automatically?
And do you have some pointers to solve that issue properly? (I know it is possible with pytorch but I think for tf, it won't be possible before tf 2.0)

Nitpick: What does 0deterministic stand for in

Yes, that's a hack for having a separate test; otherwise, it would fail... I'm open to any better solution, as I don't like that one.

Miffyli · 2019-10-05T10:46:55Z

I think we need to investigate more; TD3 has been the one giving me the most problems. Weirdly enough, it seems to be affected by previous execution of models... (running the test with TD3 alone gave different results from running TD3 after the DQN test)

Now that you mention it, one of the runs randomly failed, with both DQN and TD3 failing out of the blue :o. I can try digging further into this, but I am currently on a mobile connection and SSHing is no fun, so things will be a tad slow ^^.

The tests pass on travis... What is your tf version? Did you try with the docker image?

Tensorflow 1.14.0 without the docker image (I test things without docker, as that is where I would personally run things). This is probably the part that causes issues, as stable-baselines is based on an older TF version.

Do you have a code snippet for checking that automatically?

I do not know about detection, but you could possibly run the test inside a with tf.device("cpu") block to force running things on the CPU. As for fixing the issue, I think we should move that to the TF2/next-backend thing, given that TF2 is now officially released.
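Editor's sketch of one way to approach the "skip or force CPU" question above. It relies on the `CUDA_VISIBLE_DEVICES` environment variable, which CUDA-based frameworks commonly honour; it is a heuristic that only inspects the environment, not the hardware. The helper names `force_cpu_only` and `gpu_visible` are hypothetical.

```python
import os


def force_cpu_only(environ=os.environ):
    """Hide all CUDA devices from libraries that initialise afterwards."""
    environ["CUDA_VISIBLE_DEVICES"] = ""


def gpu_visible(environ=os.environ):
    """Return True unless CUDA devices were explicitly hidden.

    Heuristic only: an unset variable is treated as 'a GPU may be used',
    while an empty value or "-1" is treated as 'GPUs hidden'.
    """
    value = environ.get("CUDA_VISIBLE_DEVICES")
    return value is None or value.strip() not in ("", "-1")
```

The variable generally has to be set before the framework initialises CUDA, typically before it is imported. In a pytest module, `gpu_visible()` could serve as the condition of a `pytest.mark.skipif` marker so that the determinism tests are skipped rather than failing on GPU machines.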

Yes, that's a hack for having a separate test; otherwise, it would fail... I'm open to any better solution, as I don't like that one.

Aight :). I do not know enough of these automatic systems to be of any help.

stable_baselines/a2c/a2c.py

araffin · 2019-10-06T21:00:47Z

I did additional tests and it seems to come from tensorflow: I could not completely reset its state, and reload is not supported, so I'm a bit stuck (see issue).
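Editor's sketch, as a loose analogy only and not a diagnosis of the TensorFlow behaviour: it shows in plain Python how state shared between runs can make a result depend on what ran earlier, which is the symptom reported for TD3 after the DQN test. The function names are hypothetical.

```python
import random


def train_with_global_rng(steps):
    # Draws from the module-level generator shared by every caller.
    return [random.random() for _ in range(steps)]


def train_with_local_rng(steps, seed):
    # Owns its generator, so earlier runs cannot influence it.
    rng = random.Random(seed)
    return [rng.random() for _ in range(steps)]


# Run "alone": seed the shared generator, then train.
random.seed(0)
alone = train_with_global_rng(3)

# Run "after another test": same seed, but an earlier run consumes
# shared state first, so the second run sees different numbers.
random.seed(0)
train_with_global_rng(5)
after_other = train_with_global_rng(3)
```

Here `alone` and `after_other` differ even though the seed is the same, while `train_with_local_rng` returns the same values regardless of what ran before it. Running the determinism test in a separate file, as this PR does, sidesteps the problem without removing the shared state.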

Miffyli · 2019-10-07T13:26:54Z

Sounds like something that could also be better fixed along with TF2 support? At this rate we need a detailed list of things to consider when updating to the new backend :) (I know a few other suggestions apart from determinism etc.).

araffin · 2019-10-09T08:33:46Z

Sounds like something that also can be better fixed along with TF2 support?

I hope so. Referencing #366 then.

But I think it still makes sense to merge that partial fix, no? Or would you prefer to wait for a full patch?

Miffyli · 2019-10-09T08:41:22Z

Oh yes, we should definitely merge this one, too :). It is a clear step in the right direction. I would only add a note that TD3 might not work as expected, while the other algorithms work as intended.

Miffyli

LGTM with the new part in documentation :)

bycn · 2020-03-06T22:12:54Z

Any updates for the TD3 failures? I finally found this thread and am experiencing the exact same issues — it seems to randomly fail AFTER a certain sequence of runs.

Also, is DDPG affected by this?

araffin added 5 commits January 3, 2019 14:24

Add seed to distributions

1e12fea

Test if we can have reproducible results

f59c71e

Merge branch 'master' into deterministic-fix

32f2715

Merge branch 'master' into deterministic-fix

b26ecf6

Set random seed at graph creation

fd16214

araffin added this to the v2.9.0 milestone Sep 29, 2019

araffin added 9 commits September 29, 2019 18:17

Remove doc

93dfae6

Try harder (remove parallelism)

11aee52

Update test

6e9e0dc

Remove seed param from learn method

ece9b61

Bug fixes

e86836b

Merge branch 'master' into deterministic-fix

9da276c

Make results deterministic

5e1414e

Reduce number of training steps

cbd07ad

Update version

b101ef7

araffin changed the title ~~Deterministic fix [WIP]~~ Fix Determinism Oct 3, 2019

araffin requested review from AdamGleave, ernestum, hill-a and Miffyli October 3, 2019 16:34

araffin marked this pull request as ready for review October 3, 2019 16:34

araffin added 4 commits October 3, 2019 19:58

Try separating tests

3aad796

Remove unused import

f64d91a

Typos

0e1468d

Improve VecEnv seeding

a9797ee

Miffyli requested changes Oct 5, 2019

Save seed and default to None

ccab4ef

Miffyli requested changes Oct 5, 2019

stable_baselines/a2c/a2c.py (outdated, resolved)

Miffyli and others added 2 commits October 6, 2019 23:51

Update docs for seed parameters

cd1f555

Merge branch 'master' into deterministic-fix

d97f20c

Documentation about reproducibility

b56ae8c

Miffyli approved these changes Oct 11, 2019

araffin merged commit 8a8baf1 into master Oct 11, 2019

araffin deleted the deterministic-fix branch October 11, 2019 13:59

Miffyli mentioned this pull request Oct 30, 2019

Different results every time I train a PPO agent #532

Closed

araffin mentioned this pull request May 2, 2020

TD3 not deterministic #838

Closed
