02‐06‐2024 Weekly Tag Up

Attendees

  • Joe
  • Chi-Hui

Updates

  • Changing the reward didn't have much impact on the excess speed model
  • We probably need to update the reward to also keep max speed above a certain level
    • Penalize agents for letting max speed go to 0
    • Applying a lower bound of 5.0 for now (see the reward sketch after this list)
  • Deployment Model
    • 2 single-objective models trained with PS --> 2 different models
    • With 9 agents there are 2^9 deployment possibilities (see the enumeration sketch after this list)
    • We would still try to minimize both objectives
    • This would change the problem to a "deployment" problem
    • We will keep this in mind for the future; there could be some interesting applications to our work
      • Maybe look into changing the deployment during dataset generation for batch offline learning
      • Or changing the deployment while learning the policy
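
As a rough sketch of the reward change discussed above: keep the existing reward term but subtract a penalty whenever the observed max speed drops below the lower bound. The function name, the penalty weight, and the exact shape of the penalty are assumptions; only the 5.0 floor comes from the notes above.

```python
def shaped_reward(base_reward: float, max_speed: float,
                  speed_floor: float = 5.0, penalty_weight: float = 1.0) -> float:
    """Penalize the agent when the observed max speed falls below the floor.

    base_reward    -- the existing excess-speed reward term
    max_speed      -- max vehicle speed observed at the intersection this step
    speed_floor    -- lower bound being applied for now (5.0)
    penalty_weight -- hypothetical scaling factor for the penalty term
    """
    # Penalty grows linearly as max speed drops toward 0; it is 0 at or above the floor.
    penalty = penalty_weight * max(0.0, speed_floor - max_speed)
    return base_reward - penalty
```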

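A quick sketch of where the 2^9 figure comes from: with 2 trained models and 9 agents, a deployment is an assignment of one model to each agent. The model names and the evaluation call are hypothetical.

```python
from itertools import product

MODELS = ["model_a", "model_b"]  # the 2 single-objective models trained with PS
NUM_AGENTS = 9                   # one agent per intersection

# Every deployment assigns one of the two models to each of the 9 agents.
deployments = list(product(MODELS, repeat=NUM_AGENTS))
assert len(deployments) == 2 ** NUM_AGENTS  # 512 possibilities

# Hypothetical: score each deployment against both objectives.
# for deployment in deployments:
#     speed_obj, throughput_obj = evaluate(deployment)
```
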
Next Steps

  • Retrain the single-objective max speed model to evaluate how performance changed with the new reward
    • Are intersections still going to 0 throughput?
  • Review old logs to see whether policies were learned that stopped all cars at an intersection
    • Or is this an issue that was introduced with parameter sharing?
  • Look at data from the excess speed policy with other thresholds
    • Did the threshold make a difference in the number of stopped cars or not? (see the sketch below)
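
A rough sketch of that comparison, assuming the excess speed evaluation logs can be exported with one row per episode containing the threshold used and the number of stopped cars observed; the file name and column names are placeholders.

```python
import pandas as pd

# Hypothetical export of the excess speed policy evaluation logs.
logs = pd.read_csv("excess_speed_policy_logs.csv")  # columns: threshold, stopped_cars, ...

# Compare stopped-car counts across thresholds to see whether the threshold mattered.
summary = logs.groupby("threshold")["stopped_cars"].agg(["mean", "max", "count"])
print(summary)
```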