06‐25‐2024 Weekly Tag Up
- Chi Hui
- Joe
- Trained 3 different models on the SUMO environment
- Used the 9-agent env
- Self-play (actor-critic with parameter sharing; see the sketch below)
- ASL7, ASL10, and Queue models
- All perform well and do not result in "gridlock"
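A minimal sketch of the parameter-sharing idea behind these self-play models, assuming a PyTorch actor-critic; the observation size, action count, and network widths are placeholders rather than the actual trained models.

```python
import torch
import torch.nn as nn

# All 9 intersection agents query the same network, so every agent's
# experience updates a single set of weights (parameter sharing).
class SharedActorCritic(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
        )
        self.pi = nn.Linear(hidden, n_actions)  # policy head (action logits)
        self.v = nn.Linear(hidden, 1)           # value head

    def forward(self, obs):
        h = self.body(obs)
        return torch.distributions.Categorical(logits=self.pi(h)), self.v(h)

net = SharedActorCritic(obs_dim=32, n_actions=4)  # placeholder sizes
obs = torch.randn(9, 32)                          # one observation per agent
dist, values = net(obs)
actions = dist.sample()                           # independent action per agent, shared weights
```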
- SUMO can be confusing to people who haven't seen it before, so we may need to use the Overcooked env
- People who use Overcooked usually use PPO to train self-play agents
- Overcooked is commonly used for ZSC (zero-shot coordination)
- Usually just 1 agent and 1 partner, though
- HIRO group overcooked env: https://github.com/HIRO-group/overcooked_ai
- Requires updates to support different kinds of experts
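For reference, a minimal rollout sketch assuming the upstream overcooked_ai API (the HIRO fork may expose slightly different entry points); random joint actions stand in for trained self-play PPO agents.

```python
import random

from overcooked_ai_py.mdp.overcooked_mdp import OvercookedGridworld
from overcooked_ai_py.mdp.overcooked_env import OvercookedEnv
from overcooked_ai_py.mdp.actions import Action

# Build a standard two-player layout and roll out one episode with random
# joint actions; trained PPO self-play policies would replace the sampling.
mdp = OvercookedGridworld.from_layout_name("cramped_room")
env = OvercookedEnv.from_mdp(mdp, horizon=400)

env.reset()
done, total_sparse_reward = False, 0
while not done:
    joint_action = (random.choice(Action.ALL_ACTIONS),
                    random.choice(Action.ALL_ACTIONS))
    state, reward, done, info = env.step(joint_action)
    total_sparse_reward += reward
print("episode sparse reward:", total_sparse_reward)
```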
- Could also consider a human coordinator application (see the sketch below)
- Human injects actions/policy into the environment for a few steps (e.g., turn all lights green in one direction for a short time)
- Would we still get the same performance?
- This would only be necessary to study if our other contributions don't seem like enough
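A hedged sketch of that idea: for the first few steps the human overrides every agent's action with a fixed phase, then control returns to the learned policies. `env`, `policies`, and the action id are hypothetical stand-ins for our SUMO wrapper and trained per-agent policies.

```python
GREEN_NS = 0        # hypothetical action id: green phase for one direction
HUMAN_STEPS = 20    # how long the human intervention lasts

def evaluate_with_override(env, policies, episode_len=1000):
    """Run one episode where the human coordinator controls the first steps."""
    obs = env.reset()
    total_reward = 0.0
    for t in range(episode_len):
        if t < HUMAN_STEPS:
            # Human-injected actions for every agent
            actions = {agent: GREEN_NS for agent in obs}
        else:
            # Learned policies resume control
            actions = {agent: policies[agent].act(obs[agent]) for agent in obs}
        obs, rewards, done, _ = env.step(actions)
        total_reward += sum(rewards.values())
        if done:
            break
    return total_reward
```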
- Need to take self-play models and evaluate them in mixed coordination scenarios
- Randomly place agents in the environment and evaluate the performance of the middle agent
- This agent is directly impacted by all others in the environment, so it is the most likely to perform poorly when the environment changes
- There are many combinations (3^9 = 19,683), but we only need to evaluate a few
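A small sketch of how those combinations could be sampled, assuming the 9 intersections form a 3x3 grid; `evaluate_middle_agent` is a hypothetical helper that would load the listed models into the 9-agent SUMO env and return the center agent's performance.

```python
import itertools
import random

MODELS = ["ASL7", "ASL10", "Queue"]   # the three self-play models above
MIDDLE = 4                            # center intersection in a 3x3 grid (0-indexed)

def sample_assignments(n_samples=20, seed=0):
    """Sample a few of the 3^9 = 19,683 possible model-to-intersection assignments."""
    rng = random.Random(seed)
    all_assignments = list(itertools.product(MODELS, repeat=9))
    return rng.sample(all_assignments, n_samples)

for assignment in sample_assignments():
    # score = evaluate_middle_agent(assignment, middle_index=MIDDLE)  # hypothetical helper
    print("middle agent model:", assignment[MIDDLE], "assignment:", assignment)
```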
- We need to start thinking about our algorithm
- FCP (fictitious co-play) lets the agent see 3 different scenarios (3 different skill levels of partner players)
- Our application uses partner players with different expertise
- This is probably more related to population play (see the sketch below)
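A sketch of the distinction: FCP builds its partner pool from checkpoints of one partner at different skill levels, whereas our setting would draw partners from a pool of experts with different expertise, which is closer to population play. The names below are illustrative, not actual checkpoints.

```python
import random

# FCP-style pool: the same partner at different training stages (skill levels).
fcp_pool = ["partner_early_ckpt", "partner_mid_ckpt", "partner_final_ckpt"]

# Our setting: experts trained with different objectives (different expertise).
expert_pool = ["ASL7_policy", "ASL10_policy", "Queue_policy"]

def sample_partner(pool, seed=None):
    """Each training episode the learner is paired with a partner drawn from the pool."""
    return random.Random(seed).choice(pool)

print(sample_partner(expert_pool, seed=0))
```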
- World Models application
- Joe to spend more time looking into this, as well as its potential application to our experiments
- Will also look at implementation to see how challenging it would be to apply
- Repo here: https://github.com/zacwellmer/WorldModels