02‐16‐2024 Constraint Setting with New Single Objective Models
Joe Miceli edited this page Feb 19, 2024 · 2 revisions
- Chi-Hui
- Joe
- Updating Normalize Dataset function
- Change the calculation of c1 to use rollout averages from the excessive speed and queue length policies
- Upper bound U1 is the average reward per step of the excessive speed policy, evaluated with the excessive speed reward
- Lower bound L1 is the average reward per step of the queue length policy, evaluated with the excessive speed reward
- Constraint c1 is now (U1 - L1) * ratio + L1
- Leave rest of the algorithm the same
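
A minimal sketch of the c1 calculation described above, in Python. The function names `avg_reward_per_step` and `constraint_threshold` are hypothetical and do not come from the project code, and the rollout values are made up for illustration. It also assumes that "avg reward per step" means the mean over all steps pooled across rollouts; averaging per-episode means instead would give a different number when episodes differ in length.

```python
from typing import Sequence


def avg_reward_per_step(rollouts: Sequence[Sequence[float]]) -> float:
    """Mean per-step reward, pooled over every step of every rollout (assumed definition)."""
    total_reward = sum(sum(episode) for episode in rollouts)
    total_steps = sum(len(episode) for episode in rollouts)
    if total_steps == 0:
        raise ValueError("rollouts contain no steps")
    return total_reward / total_steps


def constraint_threshold(u1: float, l1: float, ratio: float) -> float:
    """c1 = (U1 - L1) * ratio + L1, so ratio=0 gives L1 and ratio=1 gives U1."""
    return (u1 - l1) * ratio + l1


# Illustrative per-step excessive speed rewards (made-up numbers).
# Both policies are scored with the excessive speed reward.
speed_policy_rollouts = [[-1.0, -3.0], [-2.0, -2.0]]
queue_policy_rollouts = [[-10.0, -10.0], [-8.0, -12.0]]

u1 = avg_reward_per_step(speed_policy_rollouts)  # upper bound U1
l1 = avg_reward_per_step(queue_policy_rollouts)  # lower bound L1
c1 = constraint_threshold(u1, l1, ratio=0.5)     # halfway between L1 and U1
```

Written this way, `ratio` interpolates linearly between the two bounds, so the constraint can be tightened toward U1 or loosened toward L1 without touching the rest of the algorithm.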