
06‐25‐2024 Weekly Tag Up


Attendees

  • Chi Hui
  • Joe

Updates

  • Trained 3 different models on the SUMO environment
    • Used the 9-agent environment
    • Self-play (actor-critic with parameter sharing; see the sketch after this list)
    • ASL7, ASL10, and Queue models
      • All perform well and do not result in "gridlock"
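
A minimal sketch of the self-play setup, assuming a PyTorch actor-critic; the network sizes, observation/action dimensions, and layer choices are placeholders rather than the real SUMO configuration:

```python
# Parameter sharing: all 9 agents sample actions from the same actor-critic
# network, so every agent's experience updates one set of weights.
import torch
import torch.nn as nn

class SharedActorCritic(nn.Module):
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh())
        self.actor = nn.Linear(64, n_actions)   # action logits
        self.critic = nn.Linear(64, 1)          # state-value estimate

    def forward(self, obs):
        h = self.body(obs)
        return self.actor(h), self.critic(h)

N_AGENTS, OBS_DIM, N_ACTIONS = 9, 32, 4  # hypothetical sizes
net = SharedActorCritic(OBS_DIM, N_ACTIONS)

obs = torch.randn(N_AGENTS, OBS_DIM)  # stand-in for per-intersection observations
logits, values = net(obs)             # one shared forward pass for all agents
actions = torch.distributions.Categorical(logits=logits).sample()
print(actions.tolist())               # one action per intersection
```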

Discussion

  • SUMO can be confusing to people who haven't seen it before, so we may need to use the Overcooked env
  • People who use Overcooked usually use PPO to train self-play agents
  • Overcooked is commonly used for ZSC (zero-shot coordination)
    • Usually just 1 agent and 1 partner, though
  • HIRO group overcooked env: https://github.com/HIRO-group/overcooked_ai
    • Requires updates to support different kinds of experts
  • Could also consider a human coordinator application
    • The human injects actions/a policy into the environment for a few steps (e.g., turn all lights green in one direction for a little bit); see the sketch after this list
    • Would we still get the same performance?
    • This would only be worth studying if our other contributions don't seem like enough
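
A rough sketch of that injection idea, assuming a dict-of-actions multi-agent env API (not any specific library); the wrapper name and parameters are hypothetical:

```python
# For the first n_override_steps, the human's action (e.g. "green in one
# direction") replaces whatever the trained policies chose; after that,
# control returns to the policies unchanged.
class HumanOverrideWrapper:
    def __init__(self, env, override_action, n_override_steps):
        self.env = env
        self.override_action = override_action
        self.steps_left = n_override_steps

    def step(self, agent_actions):
        if self.steps_left > 0:
            agent_actions = {agent: self.override_action for agent in agent_actions}
            self.steps_left -= 1
        return self.env.step(agent_actions)
```

The open question above is then whether runs with a few overridden steps still match the performance of the fully autonomous runs.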

Next Steps

  • Need to take the self-play models and evaluate them in mixed coordination scenarios
    • Randomly place agents in the environment and evaluate the performance of the middle agent
      • This agent is directly affected by all others in the environment, so it is the most likely to perform poorly when the environment changes
      • There are many combinations (3^9), but we only need to evaluate a few (see the sketch at the end of these notes)
  • We need to start thinking about our algorithm
    • FCP (Fictitious Co-Play) lets the agent see 3 different scenarios (3 different skill levels of partner players)
    • Our application uses partner players with different expertise
      • This is probably more closely related to population play
  • World Models application
    • Joe to spend more time looking into this, as well as its potential application to our experiments
    • Will also look at the implementation to see how challenging it would be to apply
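
A sketch of that sampling step, assuming the 9 agents sit on a 3x3 grid indexed row-major (so the middle agent is index 4); `run_episode` is a hypothetical evaluation call, not an existing harness:

```python
import itertools
import random

MODELS = ["ASL7", "ASL10", "Queue"]
MIDDLE = 4  # center intersection of a 3x3 grid, row-major indexing

def sample_assignments(n_samples, seed=0):
    """Draw a few of the 3^9 = 19,683 possible model-to-intersection assignments."""
    rng = random.Random(seed)
    all_combos = list(itertools.product(MODELS, repeat=9))
    return rng.sample(all_combos, n_samples)

for assignment in sample_assignments(n_samples=5):
    # middle_return = run_episode(assignment)[MIDDLE]  # hypothetical harness call
    print("middle agent:", assignment[MIDDLE], "| full assignment:", assignment)
```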