-
Notifications
You must be signed in to change notification settings - Fork 725
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Determinism #492
Fix Determinism #492
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seed
s default to zero now. IMHO it should default toNone
, in which case we do not set the seed to anything fixed. This would match the previous behavior and the behavior of other derp learning frameworks.self.seed
should be stored in the save file. Looks like just adding it to the list of stored variables should do the trick.- TD3 test is failing for me with current version of
test_0discrete.py
(Non-deterministic results. Only using CPU. Ubuntu 18.04, Python 3.6. Ran withpytest tests/test_0deterministic.py
). test_0discrete.py
should skip tests if GPU is used (GPU adds randomness which is not fixed yet. Running test with GPU results to fails in many algorithms).- Nitpick: What does
0deterministic
stand for intest_0deterministic.py
? Deterministic with seed zero? I find this bit confusing, could be justtest_deterministic.py
as before. Edit: Or is this to make "test separate from other tests"?
Sidenote: Other than the TD3, other algorithms pass the test (ran five separate runs, plus two with SEED=
just in case)
Sorry for the trailing whitespaces
Good point ;) (I had that thought afterward too)
I think we need to investigate more, TD3 has been the one giving me the more problems. Weirdly enough, it seems to be affected by previous execution of models... (running the test with TD3 alone was different from running TD3 after DQN test) The tests pass on travis... What is your tf version? Did you try with the docker image?
Do you have a code snippet for checking that automatically?
yes, that's a hack for having a separate test, otherwise, it would fail...I'm open to any better solution as I don't like that one. |
Now that you mentioned one of the runs randomly failed with both DQN and TD3 failing out of the blue :o . I can try digging in further to this, but I am currently connecting over mobile connections and SSHing is no fun, so things will be tad slow ^^.
Tensorflow 1.14.0 without docker image (I test things without docker as that is where I would personally run things). This is probably the part that causes issues as stable-baselines is based on the older TF version.
I do not know about detection but you could possibly run test with
Aight :). I do not know enough of these automatic systems to be of any help. |
I did additional tests and it seems it comes from tensorflow, I could not reset completely its state and reload is not supported... so I'm a bit stuck (see issue) |
Sounds like something that also can be better fixed along with TF2 support? At this rate we need a detailed list of things to consider when updating to the new backend :) (I know few other suggestions apart from determinism etc). |
I hope so. Referencing #366 then. But I think it still makes sense to merge that partial fix, no? or you prefer to wait for a full patch? |
Oh yes, definitely should merge this one, too :). It is a clear step to right direction. I would only add a notification that TD3 might not work as expected but for other algorithms things work as intended. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with the new part in documentation :)
Any updates for the TD3 failures? I finally found this thread and am experiencing the exact same issues — it seems to randomly fail AFTER a certain sequence of runs. Also, is DDPG affected by this? |
closes #145
closes #285
Related #485
Important Note: In order to fix determinism, I had to break the API of
learn()
method: I moved theseed
argument to the constructor.I also had to separate the determinism test to avoid side effects from other tests