[question] Cartpole PPO1 example and alternate policies #35
Comments
Hello, did I answer your question? @hill-a we should maybe update the docs to prevent this type of error, no?
Hey, probably best to add an update to the documentation, and to add a check to the models to make sure the input for the policy
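One way such a check could look, as a minimal sketch: validate the observation shape against the policy's feature extractor before the network is built, so the user gets a readable error instead of an IndexError from deep inside the graph code. The helper name `check_policy_input` and its signature are hypothetical and not part of stable_baselines. The value `"cnn"` mirrors the `feature_extraction="cnn"` argument visible in the traceback in this issue; `"mlp"` is assumed as its counterpart.

```python
def check_policy_input(feature_extraction, obs_shape):
    """Raise a readable error when the observation shape cannot feed the extractor.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # A CNN extractor needs image-like observations: (height, width, channels).
    if feature_extraction == "cnn" and len(obs_shape) != 3:
        raise ValueError(
            "CNN policies expect image observations of shape "
            "(height, width, channels), got shape {}. "
            "Use an Mlp policy for vector observations.".format(tuple(obs_shape))
        )


check_policy_input("mlp", (4,))         # CartPole vector observation: accepted
check_policy_input("cnn", (84, 84, 4))  # stacked image frames: accepted

try:
    check_policy_input("cnn", (4,))     # CartPole combined with a Cnn policy
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```

Failing early like this keeps the error next to the user's own call rather than inside the convolution code.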
@araffin , I'm not OP, but the PPO1 (and TRPO) example CartPole script doesn't work for me when using any recurrent policies, e.g.
|
@hill-a looks like a bug... Can you check that?
Closing this issue (which is now a duplicate of #60)
From the provided example it appears as if you should be able to swap in different policy implementations for MlpPolicy and have the example code run. This does not appear to be the case, so I suspect I'm misunderstanding something. To use something other than MlpPolicy, what should a user know? I haven't read all the docs thoroughly, so I apologize if this is clearly spelled out somewhere!
System Info
Describe the characteristics of your environment:
Installation: pip
GPU: CPU only
Python version: 3.6.4
TensorFlow version: 1.8
Additional context
An example traceback of trying to use one of the other policies
/Users/iandanforth/.pyenv/versions/3.6.4/lib/python3.6/site-packages/stable_baselines/common/input.py:30: RuntimeWarning: overflow encountered in subtract
  np.any((ob_space.high - ob_space.low) != 0)):
/Users/iandanforth/.pyenv/versions/3.6.4/lib/python3.6/site-packages/stable_baselines/common/input.py:33: RuntimeWarning: overflow encountered in subtract
  processed_x = ((processed_x - ob_space.low) / (ob_space.high - ob_space.low))
Traceback (most recent call last):
  File "agents/ppo.py", line 9, in <module>
    model = PPO1(CnnLnLstmPolicy, env, verbose=1)
  File "/Users/iandanforth/.pyenv/versions/3.6.4/lib/python3.6/site-packages/stable_baselines/ppo1/pposgd_simple.py", line 77, in __init__
    self.setup_model()
  File "/Users/iandanforth/.pyenv/versions/3.6.4/lib/python3.6/site-packages/stable_baselines/ppo1/pposgd_simple.py", line 88, in setup_model
    None, reuse=False)
  File "/Users/iandanforth/.pyenv/versions/3.6.4/lib/python3.6/site-packages/stable_baselines/common/policies.py", line 349, in __init__
    layer_norm=True, feature_extraction="cnn", **_kwargs)
  File "/Users/iandanforth/.pyenv/versions/3.6.4/lib/python3.6/site-packages/stable_baselines/common/policies.py", line 192, in __init__
    extracted_features = cnn_extractor(self.processed_x, **kwargs)
  File "/Users/iandanforth/.pyenv/versions/3.6.4/lib/python3.6/site-packages/stable_baselines/common/policies.py", line 21, in nature_cnn
    layer_1 = activ(conv(scaled_images, 'c1', n_filters=32, filter_size=8, stride=4, init_scale=np.sqrt(2), **kwargs))
  File "/Users/iandanforth/.pyenv/versions/3.6.4/lib/python3.6/site-packages/stable_baselines/a2c/utils.py", line 122, in conv
    n_input = input_tensor.get_shape()[channel_ax].value
  File "/Users/iandanforth/.pyenv/versions/3.6.4/lib/python3.6/site-packages/tensorflow/python/framework/tensor_shape.py", line 612, in __getitem__
    return self._dims[key]
IndexError: list index out of range
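The last frames hint at why the Cnn* policies fail here: the convolution helper reads the channel dimension from the input tensor's shape, and that dimension only exists for image-like observations. CartPole observations are a flat vector of 4 values, so the batched input has rank 2 instead of rank 4. The sketch below mimics that lookup with a plain Python list. The value `CHANNEL_AX = 3` (NHWC layout) and the helper `read_n_channels` are assumptions for illustration, not code taken from stable_baselines.

```python
# Stand-in for the shape lookup in the traceback's conv frame.
# Assumption: the channel axis is 3, i.e. NHWC layout (batch, height, width, channels).
CHANNEL_AX = 3


def read_n_channels(batched_shape):
    """Return the channel count of a batched input shape (hypothetical helper)."""
    return batched_shape[CHANNEL_AX]


image_batch = [None, 84, 84, 4]  # image frames: rank 4, channel axis exists
cartpole_batch = [None, 4]       # CartPole vector observation: rank 2

print(read_n_channels(image_batch))  # 4

try:
    read_n_channels(cartpole_batch)
    error = None
except IndexError as exc:
    error = str(exc)

print(error)  # list index out of range
```

Under that reading, the error is about the observation space rather than PPO1 itself: a policy with a CNN feature extractor needs an environment that returns images.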