AWS 2020 re:Invent DeepRacer Competition work and GWU Cloud Computing Project

mwilchek/DeepRacer-2020

AWS DeepRacer for re:Invent 2020

This is an informal log of my exploration of AWS DeepRacer training. My best iteration was submitted for the AWS 2020 re:Invent DeepRacer Competition. All of the work was also submitted as a final project for a graduate Cloud Computing course for the M.S. Data Science program at George Washington University.

Reward Functions

| Iteration | Model Codename | Strategy |
|-----------|----------------|----------|
| 1 | "Model-v1" | Iteration 1 - Accepting Default Parameters |
| 2 | "Model-v2" | Iteration 2 - Mixing It Up |
| 3 | "Model-v3" | Iteration 3 - Trial & Error |
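
For readers new to DeepRacer, a reward function is a Python function named `reward_function` that receives a `params` dictionary and returns a float. The sketch below is not the code from any of the iterations listed above (those are not reproduced here); it is a minimal center-line example, assuming only the standard input keys `all_wheels_on_track`, `distance_from_center`, and `track_width`. The band widths and reward values are illustrative choices.

```python
def reward_function(params):
    """Illustrative center-line reward; not the author's actual function."""
    # Standard DeepRacer input parameters
    all_wheels_on_track = params["all_wheels_on_track"]
    distance_from_center = params["distance_from_center"]
    track_width = params["track_width"]

    # Near-zero reward if the car has left the track
    if not all_wheels_on_track:
        return 1e-3

    # Bands at increasing distance from the center line (assumed values)
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    if distance_from_center <= marker_1:
        reward = 1.0
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3  # likely about to go off track

    return float(reward)
```

A function like this rewards staying close to the center line and is a common starting point before experimenting with speed or steering terms.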

Hyperparameter Optimization

Below is a description of each hyperparameter that can be tuned in the DeepRacer console:

  • Batch Size: As the agent goes around the track it collects experiences (camera images together with the actions taken and the rewards received). The batch size is the number of experiences used for each training step. Larger batches generally make updates more stable, at the cost of slower training. Default size is 64.
  • Epochs: The number of passes made through the training data to update the model's weights. Default is 3.
  • Learning Rate: The size of the updates the model makes during each training cycle. This parameter must be tuned carefully: too large a value may prevent convergence, and too small a value may slow training or leave the model stuck in a local minimum. Default rate is 0.0003.
  • Entropy: Controls how much randomness the agent adds to its action choices as it drives around the track. Some randomness is needed so the agent explores its environment. Ideally, entropy is larger at the beginning and decreased later, so the agent relies more on what it has learned from previous actions. Default value is 0.01.
  • Discount Factor: Controls how much future rewards count toward the action the agent takes at the current step. Values closer to 1 make the agent weigh rewards further ahead. For example, a setting of 0.999 means the agent effectively looks ahead on the order of 1,000 future steps (roughly 1 / (1 - 0.999)). Default value is 0.999.
  • Loss Type: The objective function used to measure the gap between the model's predictions and the target values. The agent minimizes it to update the model's weights and ultimately improve its prediction of which action to take next. Default value is Huber loss.
  • Number of Episodes between Policy Updates: The number of episodes the agent runs between model updates. The more episodes (attempts around the track), the more experience data is available to the model for each update. Default value is 20.
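
Two of the hyperparameters above can be illustrated with a few lines of arithmetic. The sketch below is not DeepRacer code; the function names are made up for illustration. It shows the usual 1 / (1 - discount factor) rule of thumb for the look-ahead horizon, how a discounted return is accumulated, and the standard Huber loss with an assumed threshold of `delta=1.0`.

```python
def effective_horizon(discount_factor):
    """Rule-of-thumb number of future steps that matter: 1 / (1 - gamma)."""
    if not 0.0 <= discount_factor < 1.0:
        raise ValueError("discount_factor must be in [0, 1)")
    return 1.0 / (1.0 - discount_factor)


def discounted_return(rewards, discount_factor):
    """Sum of rewards where the reward k steps ahead is weighted by gamma**k."""
    total = 0.0
    # Work backwards so each step folds in the discounted future total
    for reward in reversed(rewards):
        total = reward + discount_factor * total
    return total


def huber_loss(error, delta=1.0):
    """Quadratic for small errors, linear for large ones (delta is assumed)."""
    abs_error = abs(error)
    if abs_error <= delta:
        return 0.5 * error * error
    return delta * (abs_error - 0.5 * delta)


if __name__ == "__main__":
    for gamma in (0.9, 0.99, 0.999):
        print(gamma, round(effective_horizon(gamma)))
    # Large errors are penalized less by Huber loss than by squared error
    print(huber_loss(3.0), 0.5 * 3.0 ** 2)
```

Because the Huber loss grows linearly for large errors, occasional outlier experiences move the weights less than they would under a squared-error loss.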
