Hardware recommendations #416
-
First, I have to admit that I know little about gym and its associated tools like Gazebo. However, I was recently asked for hardware recommendations for doing reinforcement learning, and the terms gym, Gazebo, and ROS popped up. I did some research and contacted a few authors of related papers (such as 1), and was advised to seek help from this community. I would be very grateful if someone could recommend the ideal hardware for reinforcement learning in an environment that involves gym + Gazebo + ROS. In particular:
Thanks!
-
Hi @chuanli11, welcome!
Your requirements are a bit vague, so I can only guess what you want to achieve. I assume your aim is to train a policy in simulation and then deploy it with minimal effort to a ROS-based robot. This project can help you in the first phase, where you need to feed samples to your preferred RL algorithm. Being based on Ignition Gazebo (note: we don't use ROS at all), you can use all the resources of this ecosystem, such as SDF support, and extend the simulation with the plugin system that Gazebo provides. There are also other options like PyBullet, MuJoCo (soon to be open-sourced), or Isaac Gym from NVIDIA. The big advantage you get with Ignition Gazebo is that sooner or later those physics engines could get integrated, and if you base your code on Gazebo, switching backends is just a matter of changing a single line. Right now, though, the only supported physics engine is DART, which might or might not fit your needs.
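To give an idea of the first phase (feeding samples to an RL algorithm), here is a minimal sketch of the standard gym sampling loop. The environment id is hypothetical; substitute whatever id your Gazebo-based environment registers:

```python
import gym

# Hypothetical id: replace with the environment registered by your
# Gazebo-based setup (e.g. one built with this project).
env = gym.make("MyRobot-Gazebo-v0")

obs = env.reset()
for _ in range(1000):
    action = env.action_space.sample()   # random policy as a placeholder
    obs, reward, done, info = env.step(action)
    if done:
        obs = env.reset()
env.close()
```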
Definitely yes: simulations run entirely on the CPU, and if you want to parallelize them, a multiprocess setup is recommended. In our experiments we used RLlib, part of the Ray project, but any other framework like Stable-Baselines would work similarly.
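As a rough, hedged sketch of what a parallel sampling setup could look like with RLlib: the environment id, worker count, and GPU count below are placeholders to adapt to your hardware.

```python
import ray
from ray import tune

ray.init()

# Rollout workers sample from CPU-bound simulations in parallel, while the
# (optional) GPU is used only to optimize the policy network.
tune.run(
    "PPO",
    config={
        "env": "CartPole-v0",   # placeholder: use your Gazebo-based environment id
        "num_workers": 8,       # one simulation process per rollout worker
        "num_gpus": 1,          # set to 0 for a CPU-only setup
    },
    stop={"timesteps_total": 100_000},
)
```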
We don't have enough data to give a really exhaustive answer to this. Running multiple simulations in parallel (e.g. with RLlib) requires many cores, this is a fact. However, the need to sample in parallel from a large number of environments depends on the RL algorithm you select. This is to say that, if you use algorithms that do not require many parallel environments (and therefore many cores), it is better to have fewer but faster cores. So I guess the answer is, as often, it depends.
Yes. Most RL frameworks sample from simulators, which typically run on the CPU as detailed above, and then move the collected samples to the GPU where a NN is optimized. This is the only usage of the GPU. If your setup does not involve large samples (for example, you don't have images in the observations) or your NNs are not very deep, using the GPU to optimize the NN becomes less important, and you could also choose to do everything on the CPU without suffering a massive performance penalty.
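As an illustration, with Stable-Baselines3 the training device can be selected explicitly; for a small MLP policy, a CPU-only run like the following sketch (environment id as a placeholder) usually works without a big penalty:

```python
import gym
from stable_baselines3 import PPO

env = gym.make("CartPole-v0")   # placeholder: use your simulation environment

# With small MLP policies and low-dimensional observations, training on the
# CPU is often fine; the GPU mainly pays off with image observations or
# deeper networks.
model = PPO("MlpPolicy", env, device="cpu")
model.learn(total_timesteps=10_000)
```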
Again, if the NNs are not very deep and their inputs are not large, the GPU usage is very low compared to the CPU.