AWS DeepRacer for re:Invent 2020

This is an informal log of my exploration of AWS DeepRacer training. My best iteration was submitted to the AWS 2020 re:Invent DeepRacer Competition. All of the work was also submitted as the final project for a graduate Cloud Computing course in the M.S. Data Science program at George Washington University.

Reward Functions

Iteration	Model Codename	Strategy
1	"Model-v1"	Iteration 1 - Accepting Default Parameters
2	"Model-v2"	Iteration 2 - Mixing It Up
3	"Model-v3"	Iteration 3 - Trial & Error
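
For context, a DeepRacer reward function is a Python function named `reward_function` that receives a `params` dictionary and returns a float. The sketch below is a simple centerline-following reward in the spirit of the console's default example. It is not the code from any of the iterations listed above, and the marker thresholds and reward values are illustrative assumptions.

```python
def reward_function(params):
    """Reward the agent for staying close to the track's center line.

    Illustrative sketch only: the thresholds (10%, 25%, 50% of the track
    width) and the reward values are assumptions, not tuned settings.
    """
    # Input parameters supplied by the DeepRacer environment.
    all_wheels_on_track = params["all_wheels_on_track"]
    track_width = params["track_width"]
    distance_from_center = params["distance_from_center"]

    # Leaving the track earns a near-zero reward.
    if not all_wheels_on_track:
        return 1e-3

    # Three bands at increasing distance from the center line.
    marker_1 = 0.10 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.50 * track_width

    if distance_from_center <= marker_1:
        reward = 1.0
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3  # likely close to going off track

    return float(reward)
```

A reward shaped like this gives the agent a dense signal on every step, which is usually an easier starting point than rewarding only completed laps.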

Hyperparameter Optimization

Below is a description of each hyperparameter that can be tuned in the DeepRacer console:

Batch Size: As the agent goes around the track, it collects images. The batch size is the number of experiences (images) used in each training step. A larger batch size generally makes training more stable. Default size is 64.
Epochs: The number of passes made through the training data when updating the model's weights. Default is 3.
Learning Rate: The size of the updates the model makes during each training cycle. Tune this parameter carefully: too large a value may prevent convergence, while too small a value may leave training stuck in a local minimum. Default rate is 0.0003.
Entropy: Controls how much randomness there is in the actions the agent takes as it drives around the track. This randomness lets the agent explore its environment. Ideally, entropy is larger at the beginning and decreased later, so that the agent relies more on what it has learned from previous actions. Default value is 0.01.
Discount Factor: Determines how much future rewards count toward the agent's current decision, which in effect sets how many steps ahead the agent looks during a training cycle. For example, a setting of 0.999 means the agent weighs rewards on the order of 1,000 future steps. Default value is 0.999.
Loss Type: The objective function used to evaluate the model's predictions against the ground truth. It is what the agent minimizes when updating the model's weights, which in turn improves its prediction of the next action to take. Default value is Huber loss.
Number of Episodes between Policy Updates: The number of episodes the agent runs between model updates. The more episodes, or laps around the track, the more experience data is available to the model during training. Default value is 20.
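
Two of the settings above are easier to see in numbers. The sketch below shows why a discount factor of 0.999 corresponds to roughly 1,000 steps of look-ahead, and how the Huber loss treats small and large errors differently. The function names and the `delta` threshold are illustrative assumptions. This is plain Python written for explanation, not code from DeepRacer itself.

```python
def effective_horizon(discount_factor):
    """Rough number of future steps that matter: 1 / (1 - gamma)."""
    if not 0.0 <= discount_factor < 1.0:
        raise ValueError("discount_factor must be in [0, 1)")
    return 1.0 / (1.0 - discount_factor)


def discounted_return(rewards, discount_factor):
    """Sum of rewards where step k is weighted by discount_factor ** k."""
    total = 0.0
    # Work backwards so each step folds in the discounted future total.
    for reward in reversed(rewards):
        total = reward + discount_factor * total
    return total


def huber_loss(error, delta=1.0):
    """Quadratic for small errors, linear for large ones.

    `delta` is the switch-over point; 1.0 is a common choice and an
    assumption here, not a documented DeepRacer setting.
    """
    abs_error = abs(error)
    if abs_error <= delta:
        return 0.5 * error ** 2
    return delta * (abs_error - 0.5 * delta)


if __name__ == "__main__":
    for gamma in (0.9, 0.99, 0.999):
        print(gamma, round(effective_horizon(gamma)))
    print(discounted_return([1.0, 1.0, 1.0], 0.5))
    print(huber_loss(0.5), huber_loss(3.0))
```

Because the Huber loss grows only linearly for large errors, a single outlier moves the weights less than it would under a squared-error loss.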

