02‐06‐2024 Weekly Tag Up
Joe Miceli edited this page Feb 7, 2024
- Joe
- Chi-Hui
- Changing the reward didn't have much impact on the excess speed model
- We probably need to update the reward to also keep max speed above a certain level
- Punish agents for letting max speed go to 0
- Applying lower bound of 5.0 for now
- Deployment Model
- 2 single-objective models trained with parameter sharing (PS) --> 2 different models
- With 9 agents, each running one of the 2 models, there are 2^9 = 512 deployment possibilities
- Still try to minimize both objectives
- This would change the problem to a "deployment" problem
- We will keep this in mind for the future; there could be some interesting applications to our work
- Maybe look into changing deployment during dataset generation for batch offline learning
- Or changing deployment for learning the policy
- Retrain the single-objective max speed model to evaluate how performance changed with the new reward
- Are intersections still going to 0 throughput?
- Review old logs to see whether policies were learned that stopped all cars in an intersection
- Or is it an issue that was introduced with parameter sharing?
- Look at data from the excess speed policy with other thresholds
- Did the threshold make a difference to the number of stopped cars or not?
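
A minimal sketch of the lower-bound idea from the reward discussion above, to make the intended penalty concrete. Only the 5.0 lower bound comes from the notes. The function name `speed_reward`, the `speed_limit` threshold, the linear penalty shape, and the units are all assumptions for illustration, not the project's actual reward.

```python
def speed_reward(max_speed: float, speed_limit: float, lower_bound: float = 5.0) -> float:
    """Hypothetical per-agent reward (all names and the penalty shape are assumptions).

    Penalizes speed above `speed_limit` (the excess speed objective) and also
    penalizes max speed falling below `lower_bound`, so an agent cannot score
    well simply by stopping every car.
    """
    # Amount by which the fastest vehicle exceeds the threshold (0 if under it).
    excess = max(0.0, max_speed - speed_limit)
    # Amount by which max speed falls short of the lower bound (0 if above it).
    shortfall = max(0.0, lower_bound - max_speed)
    return -(excess + shortfall)
```

Under this sketch an intersection with all cars stopped (`max_speed = 0.0`) receives the full shortfall penalty of 5.0, while any max speed between the lower bound and the threshold is not penalized.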
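
The deployment count mentioned above can be checked by enumeration: with 9 agents that each run one of 2 trained models, there are 2^9 assignments. The sketch below assumes placeholder model labels (`model_a`, `model_b`) and a hypothetical helper name `enumerate_deployments`; neither comes from the project code.

```python
from itertools import product

# Placeholder labels for the 2 single-objective models (assumed names).
MODELS = ("model_a", "model_b")
NUM_AGENTS = 9


def enumerate_deployments(models=MODELS, num_agents=NUM_AGENTS):
    """Return every assignment of one model to each agent, as a list of tuples."""
    return list(product(models, repeat=num_agents))


deployments = enumerate_deployments()
print(len(deployments))  # 2 ** 9 = 512
```

Each tuple gives the model for agents 0 through 8 in order, so a search over deployments could evaluate both objectives for each of the 512 candidates.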