Interface Consistencies #10

etotheipluspi · 2017-04-02T19:46:52Z

I know that Reinforce.jl is not trying to emulate OpenAI gym exactly, but I'm curious about the reasoning behind a couple of interface decisions that seem inconsistent with gym's.

First, why doesn't reset!(env) return a state or observation for convenience? When I was using OpenAIGym.jl, reset!(env) always returned false. This happened because a Julia function returns the value of its last expression by default, which here was env.done = false. I had to read through the source code to figure out what was going on. Returning a state/observation would be consistent with gym and would avoid confusing new users.

Second, why does step!(env, s, a) return r, s' instead of s', r? This is a minor difference in ordering, but once again, gym had given me an expectation of what step! should return.
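For reference, here is a minimal Python sketch of the classic gym convention that both questions point to: reset() returns the initial observation, and step(action) returns the observation first and the reward second. ToyGymStyleEnv and its goal parameter are hypothetical names made up for illustration; this is not gym's or Reinforce.jl's actual code.

```python
class ToyGymStyleEnv:
    """Hypothetical 1-D walk that follows the classic gym return conventions."""

    def __init__(self, goal=3):
        self.goal = goal      # position that ends the episode (assumed for the example)
        self.state = 0
        self.done = False

    def reset(self):
        self.state = 0
        self.done = False
        # Explicit return of the observation, so the caller never receives
        # an incidental value such as the result of `done = False`.
        return self.state

    def step(self, action):
        self.state += action
        reward = 1.0 if self.state == self.goal else 0.0
        self.done = self.state == self.goal
        # Classic gym ordering: observation, reward, done, info.
        return self.state, reward, self.done, {}


env = ToyGymStyleEnv(goal=2)
obs = env.reset()
obs, reward, done, info = env.step(1)
```

With this ordering, a caller used to gym can unpack the result of step without checking the documentation first.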

blacksph3re · 2020-07-10T11:56:49Z

And why does step! take a state at all? Shouldn't the state be stored in the env? In CartPole, one of the first things the method does is overwrite the state that was passed in with the state from the environment...
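The concern can be illustrated with a small Python sketch. CounterEnv and both method names are hypothetical and only mimic the pattern described in the comment above; they are not taken from the CartPole source.

```python
class CounterEnv:
    """Hypothetical environment that owns its state."""

    def __init__(self):
        self.state = 0

    def step_with_state(self, state, action):
        # The pattern in question: the passed-in state is immediately
        # replaced by the environment's own state, so the argument is unused.
        state = self.state
        self.state = state + action
        return self.state

    def step(self, action):
        # Alternative: the environment is the single source of truth,
        # so the caller supplies only the action.
        self.state += action
        return self.state


env = CounterEnv()
after_first = env.step_with_state(999, 1)   # 999 has no effect on the result
after_second = env.step(1)
```

If the passed-in state is always discarded, dropping the parameter removes a source of confusion without changing behavior.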

etotheipluspi mentioned this issue Apr 16, 2017

POMDPReinforce is Here! JuliaPOMDP/POMDPs.jl#142

Closed

CarloLucibello mentioned this issue Nov 16, 2017

saying hi CarloLucibello/DeepRLexamples.jl#1

Open
