
02‐16‐2024 Constraint Setting with New Single Objective Models

Joe Miceli edited this page Feb 19, 2024 · 2 revisions

Attendees

  • Chi-Hui
  • Joe

Discussion

  • Updating the Normalize Dataset function
  • Change the calculation of c1 to use rollout averages from the excessive speed and queue length policies
  • Upper bound U1 is the average reward per step from the excessive speed policy
    • Measured by evaluating the excessive speed policy under the excessive speed reward
  • Lower bound L1 is the average reward per step from the queue length policy
    • Measured by evaluating the queue length policy under the excessive speed reward
  • Constraint c1 is now (U1 - L1) * ratio + L1
  • Leave the rest of the algorithm the same
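
The new constraint calculation can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `avg_reward_per_step` and `constraint_c1` are hypothetical, rollouts are assumed to be lists of per-step excessive speed rewards, and pooling all steps across rollouts before averaging is an assumption about how "avg reward per step" is computed.

```python
from typing import Sequence


def avg_reward_per_step(rollouts: Sequence[Sequence[float]]) -> float:
    """Mean reward per step, pooled over all rollouts (hypothetical helper).

    Each rollout is assumed to be a sequence of per-step rewards measured
    with the excessive speed reward function.
    """
    total_reward = sum(sum(episode) for episode in rollouts)
    total_steps = sum(len(episode) for episode in rollouts)
    if total_steps == 0:
        raise ValueError("rollouts contain no steps")
    return total_reward / total_steps


def constraint_c1(u1: float, l1: float, ratio: float) -> float:
    """Interpolate between L1 and U1: c1 = (U1 - L1) * ratio + L1."""
    return (u1 - l1) * ratio + l1


# Illustrative numbers only, not results from the project.
speed_policy_rollouts = [[10.0, 10.0], [10.0]]  # excessive speed policy
queue_policy_rollouts = [[2.0], [2.0, 2.0]]     # queue length policy

u1 = avg_reward_per_step(speed_policy_rollouts)
l1 = avg_reward_per_step(queue_policy_rollouts)
c1 = constraint_c1(u1, l1, ratio=0.5)
```

With this form, a ratio of 0 gives c1 = L1 and a ratio of 1 gives c1 = U1, so any ratio between 0 and 1 places the constraint between the two bounds.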