02‐27‐2024 Weekly Tag Up

Joe Miceli edited this page Feb 27, 2024 · 1 revision

Attendees

  • Chi-Hui
  • Joe

Updates

  • New rate-based lambda updater implemented
  • Reran experiments with constraint ratios of 0.25 and 0.75
  • In both cases the constraint was obeyed, but it would be better if the mean policy's return were closer to the constraint (i.e. if it behaved more like the queue policy)
  • Rerun experiments with a different learning rate
    • Rate of 0.1
    • May also need to try 0.01
  • THEN run experiments with a new dataset
    • So that a larger percentage of actions comes from the queue length model
    • We want the mean policy to provide returns closer to the constraint
  • Still need to consider other methods
    • For a submission, we will need to provide some comparison to other methods
  • Need to think about how lambda impacts the return of the mean/current policies
    • It's challenging to connect them conceptually
    • If we can't find a connection, we may need a different method of updating lambda
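
To make the learning-rate discussion above concrete, here is a minimal sketch of a generic projected dual-ascent step for a Lagrange multiplier. It is not the rate-based updater from these notes, whose details are not recorded here. The function name `update_lambda`, the assumed constraint direction (`constraint_return <= threshold`), and all numeric inputs other than the two learning rates are hypothetical.

```python
def update_lambda(lam, constraint_return, threshold, learning_rate):
    """One projected dual-ascent step (generic sketch, not the rate-based updater).

    Assumes the constraint has the form constraint_return <= threshold.
    Lambda grows when the constraint is violated, shrinks when there is
    slack, and is clipped at zero so it stays non-negative.
    """
    violation = constraint_return - threshold
    return max(0.0, lam + learning_rate * violation)


if __name__ == "__main__":
    # Compare the two learning rates from the notes on the same made-up signal.
    for lr in (0.1, 0.01):
        lam = 1.0  # hypothetical starting value
        for _ in range(5):
            lam = update_lambda(lam, constraint_return=0.5, threshold=0.75,
                                learning_rate=lr)
        print(f"learning_rate={lr}: lambda after 5 steps = {lam:.4f}")
```

In this sketch a smaller learning rate makes lambda respond more slowly to constraint slack, which is one possible reason the choice of rate could affect how close the mean policy gets to the constraint.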