ideas.txt

- showdown as simulation environment
- how to implement mega evolution and z moves?
    - have specific preconditions and can only be triggered once
    - separate action dimension, which is ignored when invalid?
        - works for mega, not for z
    - for z: available moves * 4
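sketch for the mega/z question above: one possible encoding is a flat action space where the mega and z variants of each move are separate actions, and invalid ones are masked out instead of ignored. `TurnState` and all its fields are hypothetical names, not anything showdown provides; the layout (4 moves, 4 moves with mega, 4 z moves, 5 switches) is an assumption.

```python
from dataclasses import dataclass
from typing import List

N_MOVES = 4      # move slots of the active pokemon
N_SWITCHES = 5   # bench slots (team of 6, one active)


@dataclass
class TurnState:
    # hypothetical summary of the current turn, filled from the simulator
    usable_moves: List[bool]     # len 4, False if the move cannot be selected
    can_mega: bool               # precondition met and mega not yet used
    z_compatible: List[bool]     # len 4, True where the held z crystal fits
    z_used: bool                 # z move already spent this battle
    usable_switches: List[bool]  # len 5, False if the switch is not allowed


def action_mask(s: TurnState) -> List[bool]:
    """Mask over 17 actions: 4 moves, 4 moves + mega, 4 z moves, 5 switches."""
    plain = list(s.usable_moves)
    mega = [ok and s.can_mega for ok in s.usable_moves]
    z = [ok and fits and not s.z_used
         for ok, fits in zip(s.usable_moves, s.z_compatible)]
    return plain + mega + z + list(s.usable_switches)
```

the mask can then be applied to the policy logits before sampling, so the once-per-battle constraint is enforced by the environment rather than learned.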
- count forfeit wins?
    - if prematurely forfeited (e.g. connection loss) -> noise
    - opponents often forfeit when in an unfavorable state -> signal loss if not counted
    - turn threshold for count?
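sketch of the turn-threshold idea: count a forfeit as a normal result only if the game lasted long enough, otherwise drop it as noise. the threshold of 10 turns and the +1/-1 reward are placeholder assumptions, not values from these notes.

```python
from typing import Optional

MIN_TURNS_FOR_FORFEIT = 10  # assumed threshold, needs tuning


def reward_for_result(won: bool, ended_by_forfeit: bool, turn: int,
                      min_turns: int = MIN_TURNS_FOR_FORFEIT) -> Optional[float]:
    """Return +1/-1 for a counted game, None if the game should be discarded."""
    if ended_by_forfeit and turn < min_turns:
        return None  # early forfeit: likely connection loss, treat as noise
    return 1.0 if won else -1.0
```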
- smogon rules
- infinite battle
    - e.g. switch-in loop
    - average duration: 60 turns
    - smogon endless battle clause
    - possible solution: tie after 500 turns
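sketch of the 500-turn cap: the episode loop stops at the cap and reports a tie. `step` is a hypothetical stand-in for one simulator turn; it returns the winner or None while the battle continues.

```python
MAX_TURNS = 500  # cap from the note above


def run_battle(step, max_turns=MAX_TURNS):
    """Run step(turn) until it returns a winner; declare a tie at the cap."""
    for turn in range(1, max_turns + 1):
        winner = step(turn)  # 'p1', 'p2', or None if the battle continues
        if winner is not None:
            return winner, turn
    return "tie", max_turns
```

with an average duration of 60 turns, a cap of 500 should rarely cut off a regular game.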
- asynchronous rl/multiple agents (ppo, a3c, ...)
    - ape-x?
- self-play rl?
    - AI agent probably faster than humans on average
- play against humans? is there an opponent simulator?
- random teams
    - separate team building agent?
- elo rating as performance measure?
    - starts at 1000
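sketch of a standard elo update for tracking agent strength offline. k=32 is an assumed value; the rating formula showdown actually uses on its ladder may differ.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Expected score of player A against player B under the elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_elo(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """score_a: 1.0 win, 0.5 tie, 0.0 loss for A. Returns the new (r_a, r_b)."""
    e_a = expected_score(r_a, r_b)
    new_a = r_a + k * (score_a - e_a)
    new_b = r_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return new_a, new_b


# two fresh agents at 1000, A wins
print(update_elo(1000.0, 1000.0, 1.0))  # (1016.0, 984.0)
```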
- ppo
    - nn architecture
        - IN: 40,000 -> H1: 4,000 -> H2: 1,000 -> H3: 200 -> OUT: 10
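quick size check for the architecture above, assuming plain fully connected layers with biases (the notes do not say which layer type is meant):

```python
LAYER_SIZES = [40_000, 4_000, 1_000, 200, 10]  # IN, H1, H2, H3, OUT


def dense_param_count(sizes):
    """Weights plus biases of a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


total = dense_param_count(LAYER_SIZES)
print(total)            # 164207210
print(total * 4 / 1e6)  # size in MB at 4 bytes (float32) per parameter
```

about 164M parameters, of which 160M sit in the first layer (40,000 x 4,000), so the input encoding dominates the model size.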