Consider implementing interactive reinforcement learning step with Stockfish #174

fshcat · 2022-02-24T01:12:59Z

This paper uses pre-existing algorithms to guide a reinforcement learning agent and speed up the start of its training.
https://arxiv.org/pdf/2008.12001.pdf
We're currently giving the neural network a warm start by training it directly on Stockfish data, but this may produce training data that's too different from the MCTS priors. If we instead use Stockfish "advice" within MCTS, it could produce more similar data and thus lead to more efficient training once we switch to training with only MCTS. @JuddBE also proposed doing this in 3 phases: start with our current method, then use Stockfish as a trainer, then move on to MCTS only.
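
As a rough illustration of the 3-phase idea, the schedule could be as simple as the sketch below. The names `Phase` and `phase_for` and the iteration thresholds are hypothetical and not taken from our codebase; how long each phase should last is an open question.

```python
from enum import Enum


class Phase(Enum):
    SUPERVISED = "train directly on Stockfish data (current method)"
    ADVISED_MCTS = "MCTS with Stockfish acting as a trainer"
    PURE_MCTS = "MCTS only"


def phase_for(iteration: int, supervised_iters: int, advised_iters: int) -> Phase:
    """Pick the training phase for a given iteration (hypothetical schedule)."""
    if iteration < supervised_iters:
        return Phase.SUPERVISED
    if iteration < supervised_iters + advised_iters:
        return Phase.ADVISED_MCTS
    return Phase.PURE_MCTS
```

For example, `phase_for(15, supervised_iters=10, advised_iters=20)` falls in the Stockfish-advised phase.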

How we implement this depends on how exactly we want to use Stockfish advice. Most likely the main change would be having MCTS take a value supplier function and a priors supplier function as parameters, and then implementing those functions as needed for the Stockfish trainer.
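
A minimal sketch of what a parameterized MCTS might look like is below. Everything here is an assumption for illustration: `run_mcts`, `ValueSupplier`, `PriorsSupplier` and the oracle functions are hypothetical names, and a toy take-1-or-2-stones game stands in for chess so the snippet runs on its own. A Stockfish-backed supplier would have the same signatures but query the engine instead of the toy oracle.

```python
import math
from typing import Callable, Dict, List

# Hypothetical supplier signatures. Values are in [-1, 1] from the
# perspective of the player to move in the given state.
ValueSupplier = Callable[[int], float]
PriorsSupplier = Callable[[int, List[int]], Dict[int, float]]


# Toy game standing in for chess: the state is a pile of stones, a move
# removes 1 or 2 stones, and whoever takes the last stone wins.
def legal_moves(state: int) -> List[int]:
    return [m for m in (1, 2) if m <= state]


class Node:
    def __init__(self, prior: float):
        self.prior = prior
        self.visits = 0
        self.value_sum = 0.0  # from the view of the player who moved into this node
        self.children: Dict[int, "Node"] = {}

    def q(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0


def run_mcts(root_state: int, value_supplier: ValueSupplier,
             priors_supplier: PriorsSupplier,
             simulations: int = 200, c_puct: float = 1.5) -> Dict[int, int]:
    """PUCT-style search that gets values and priors from the given suppliers."""
    root = Node(prior=1.0)
    for _ in range(simulations):
        node, state, path = root, root_state, [root]
        # Selection: follow the child with the highest PUCT score.
        while node.children:
            sqrt_n = math.sqrt(node.visits)
            move, node = max(
                node.children.items(),
                key=lambda kv: kv[1].q()
                + c_puct * kv[1].prior * sqrt_n / (1 + kv[1].visits),
            )
            state -= move
            path.append(node)
        # Expansion and evaluation through the injected suppliers.
        moves = legal_moves(state)
        if not moves:
            value = -1.0  # no stones left: the player to move has lost
        else:
            priors = priors_supplier(state, moves)
            for m in moves:
                node.children[m] = Node(priors[m])
            value = value_supplier(state)
        # Backup: flip the sign at each step up the path.
        for n in reversed(path):
            value = -value
            n.visits += 1
            n.value_sum += value
    return {m: c.visits for m, c in root.children.items()}


# Toy "expert" suppliers playing the role Stockfish would play for chess.
def oracle_value(state: int) -> float:
    return -1.0 if state % 3 == 0 else 1.0


def oracle_priors(state: int, moves: List[int]) -> Dict[int, float]:
    good = [m for m in moves if (state - m) % 3 == 0]
    if not good or len(good) == len(moves):
        return {m: 1.0 / len(moves) for m in moves}
    return {m: 0.9 / len(good) if m in good else 0.1 / (len(moves) - len(good))
            for m in moves}


visits = run_mcts(4, oracle_value, oracle_priors)
```

With this shape, switching from the Stockfish trainer to network-only MCTS would just mean passing different supplier functions to the same search.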

fshcat assigned nashirj, ryPattillo and JuddBE Feb 24, 2022

fshcat added the AI/ML Team AI, CV label Feb 24, 2022

nashirj added the High Priority Need to be addressed asap label Apr 14, 2022
