
Consider implementing interactive reinforcement learning step with Stockfish #174

Open
fshcat opened this issue Feb 24, 2022 · 0 comments
Labels: AI/ML Team (AI, CV), High Priority (Need to be addressed asap)


fshcat (Collaborator) commented Feb 24, 2022

This paper uses pre-existing algorithms to guide a reinforcement learning agent and speed up the early stages of its training:
https://arxiv.org/pdf/2008.12001.pdf
We're currently giving the neural network a warm start by training directly on Stockfish evaluations, but this may produce training data that's too different from the MCTS priors. If we instead use Stockfish "advice" within MCTS, the resulting data should be closer to what MCTS generates on its own, and thus lead to more efficient training once we switch to training with MCTS alone. @JuddBE also proposed a three-phase schedule: start with our current method, then use Stockfish as a trainer inside MCTS, then move on to MCTS only.

Implementation would depend on how exactly we want to use the Stockfish advice. Most likely the main change would be having MCTS take a value-supplier function and a priors-supplier function as parameters, and then implementing those functions as needed for the Stockfish trainer; a rough sketch is below.
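A minimal sketch of what the supplier interface could look like, assuming Python and the python-chess package for talking to a Stockfish binary. The names here (`stockfish_value`, `stockfish_priors`, the `MCTS(value_supplier=..., priors_supplier=...)` signature, and the `"stockfish"` path) are hypothetical and would need to be adapted to our actual MCTS and engine setup:

```python
import chess
import chess.engine

# Assumption: a Stockfish binary is available on PATH under the name "stockfish".
engine = chess.engine.SimpleEngine.popen_uci("stockfish")

def stockfish_value(board: chess.Board, depth: int = 10) -> float:
    """Value supplier backed by Stockfish: returns a score in [-1, 1] from the
    side to move's perspective, roughly comparable to a value-head output."""
    info = engine.analyse(board, chess.engine.Limit(depth=depth))
    # Convert the centipawn/mate score to an expected outcome in [0, 1],
    # then rescale to [-1, 1].
    expectation = info["score"].relative.wdl().expectation()
    return 2.0 * expectation - 1.0

def stockfish_priors(board: chess.Board, depth: int = 10) -> dict:
    """Priors supplier backed by Stockfish: scores each legal move with a
    shallow search and normalizes the scores into a prior distribution."""
    scores = {}
    for move in board.legal_moves:
        board.push(move)
        info = engine.analyse(board, chess.engine.Limit(depth=depth))
        # After pushing the move, the score is from the opponent's perspective,
        # so take the complement to get the value for the player who moved.
        scores[move] = 1.0 - info["score"].relative.wdl().expectation()
        board.pop()
    total = sum(scores.values()) or 1.0
    return {move: score / total for move, score in scores.items()}

# MCTS would then be parameterized over the suppliers, e.g. (hypothetical API):
#   mcts = MCTS(value_supplier=stockfish_value, priors_supplier=stockfish_priors)
# and swapped back to the network's own heads for the MCTS-only phase:
#   mcts = MCTS(value_supplier=network_value, priors_supplier=network_priors)
# Remember to call engine.quit() when the trainer shuts down.
```

Per-move `analyse` calls in `stockfish_priors` are slow; if that becomes a bottleneck, a single `analyse` with `multipv` set to the number of legal moves would be the obvious alternative.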

@fshcat fshcat added the AI/ML Team AI, CV label Feb 24, 2022
@nashirj nashirj added the High Priority Need to be addressed asap label Apr 14, 2022