Masks for LstmPolicy in PPO1 #60
Comments
Hi, |
@brendenpetersen, I also encountered the same problem when calling setup_model:

    def setup_model(self):
        with SetVerbosity(self.verbose):
            self.graph = tf.Graph()
            with self.graph.as_default():
                self.sess = tf_util.single_threaded_session(graph=self.graph)
                # Construct network for new policy
                self.policy_pi = self.policy(self.sess, self.observation_space, self.action_space, self.n_envs, 1,
                                             None, reuse=False)
                # Network for old policy
                with tf.variable_scope("oldpi", reuse=False):
                    old_pi = self.policy(self.sess, self.observation_space, self.action_space, self.n_envs, 1,
                                         None, reuse=False)

    with tf.variable_scope("input", reuse=False):
        if obs_phs is None:
            self.obs_ph, self.processed_x = observation_input(ob_space, n_batch, scale=scale)
        else:
            self.obs_ph, self.processed_x = obs_phs
        self.masks_ph = tf.placeholder(tf.float32, [n_batch], name="masks_ph")  # mask (done t-1)
        self.states_ph = tf.placeholder(tf.float32, [self.n_env, n_lstm * 2], name="states_ph")  # states
        self.action_ph = None
        if add_action_ph:
            self.action_ph = tf.placeholder(dtype=ac_space.dtype, shape=(None,) + ac_space.shape, name="action_ph")

I fixed the "fully defined" bug by passing self.n_envs*1 instead of None:

    def setup_model(self):
        with SetVerbosity(self.verbose):
            self.graph = tf.Graph()
            with self.graph.as_default():
                self.sess = tf_util.single_threaded_session(graph=self.graph)
                # Construct network for new policy
                self.policy_pi = self.policy(self.sess, self.observation_space, self.action_space, self.n_envs, 1,
                                             self.n_envs*1, reuse=False)
                # Network for old policy
                with tf.variable_scope("oldpi", reuse=False):
                    old_pi = self.policy(self.sess, self.observation_space, self.action_space, self.n_envs, 1,
                                         self.n_envs*1, reuse=False)
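The change above swaps a None batch size for self.n_envs*1. As a rough illustration of why a recurrent policy plausibly needs a concrete batch size, here is a minimal NumPy sketch: a flat batch is split into per-step slices of shape (n_env, ...), which only works when the batch size equals n_env * n_steps. The helper batch_to_sequence and all sizes below are hypothetical and chosen for illustration; this is not the library's code.

```python
import numpy as np


def batch_to_sequence(flat_batch, n_env, n_steps):
    """Split a flat batch of shape (n_env * n_steps, ...) into n_steps arrays of shape (n_env, ...).

    Hypothetical helper, for illustration only.
    """
    n_batch = flat_batch.shape[0]
    if n_batch != n_env * n_steps:
        raise ValueError(
            "n_batch=%d does not equal n_env * n_steps=%d" % (n_batch, n_env * n_steps)
        )
    # Reshape to (n_env, n_steps, ...) and then cut along the time axis.
    shaped = flat_batch.reshape((n_env, n_steps) + flat_batch.shape[1:])
    return [shaped[:, t] for t in range(n_steps)]


# Assumed sizes: 4 environments, 1 step per call, 8 LSTM units.
n_envs, n_steps, obs_dim, n_lstm = 4, 1, 3, 8
masks = np.zeros(n_envs * n_steps, dtype=np.float32)       # one done flag per sample
states = np.zeros((n_envs, n_lstm * 2), dtype=np.float32)  # two state vectors per environment
obs = np.ones((n_envs * n_steps, obs_dim), dtype=np.float32)

obs_seq = batch_to_sequence(obs, n_envs, n_steps)
mask_seq = batch_to_sequence(masks, n_envs, n_steps)
```

With n_steps equal to 1, the batch size is simply the number of environments, which matches the self.n_envs*1 value used in the fix.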
@hejujie That doesn't work for me. To clarify, the only changes you made were changing I get this error:
EDIT: If I wrap |
Closing this issue in favor of #140.
Step function in LstmPolicy is called without masks
I am using PPO1 with LstmPolicy in an environment based on gym. After the model is set up in pposgd_simple.py, trpo_mpi.utils.traj_segment_generator is called in the learn function, and LstmPolicy.step() is then called without masks inside traj_segment_generator(). Masks need to be fed to LstmPolicy.step(), so an error occurs there.

I also found that step() is called by a2c.py, where it gets the masks from runner(), so I am trying to write some code that follows a2c.py. Is there an easier way to fix this?

Related code
learn()
traj_segment_generator()
LstmPolicy.step()
Error information
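The pattern this issue asks for, a rollout loop that passes both the recurrent state and the done mask into step(), can be sketched as follows. ToyRecurrentPolicy and rollout are hypothetical stand-ins written in plain NumPy; they are not the library's LstmPolicy or traj_segment_generator and only illustrate how the done flag from step t-1 becomes the mask at step t.

```python
import numpy as np


class ToyRecurrentPolicy:
    """Hypothetical stand-in for a recurrent policy whose step() needs state and mask."""

    def __init__(self, n_env, state_dim):
        self.initial_state = np.zeros((n_env, state_dim), dtype=np.float32)

    def step(self, obs, state, mask):
        # A mask of 1.0 means the previous step ended an episode, so that state is reset.
        state = state * (1.0 - mask[:, None])
        new_state = state + obs.sum(axis=1, keepdims=True)
        actions = (new_state[:, 0] > 0).astype(np.int64)
        return actions, new_state


def rollout(policy, observations, dones):
    """Step the policy over recorded observations, threading state and masks through."""
    state = policy.initial_state
    mask = np.zeros(observations.shape[1], dtype=np.float32)  # no episode has ended yet
    states = []
    for obs, done in zip(observations, dones):
        _, state = policy.step(obs, state, mask)
        states.append(state.copy())
        mask = done.astype(np.float32)  # the done flag at t becomes the mask at t + 1
    return states


policy = ToyRecurrentPolicy(n_env=2, state_dim=1)
observations = np.ones((3, 2, 1), dtype=np.float32)  # (time, env, obs_dim)
dones = np.array([[False, True], [False, False], [False, False]])
states = rollout(policy, observations, dones)
```

In this toy run the second environment finishes an episode at the first step, so its state is reset before the second step while the first environment keeps accumulating.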