Starting from zero human knowledge, AlphaZero learns to play several strategy games at a superhuman level. While the AlphaZero algorithm was demonstrated on board games, its applications may extend beyond them. For applications such as drug design, where even a small deviation from the exact result can be very costly, we need exact solutions. However, it remains an open question whether AlphaZero-like algorithms can be applied to find exact solutions. To answer this question, we have to investigate how these modern programs learn to play, especially in cases where exact solutions are available for comparison. In this project, we investigated the gap between a strong AlphaZero-style player and perfect play using chess endgame tablebases. First, we evaluated the perfect-play prediction accuracy of the AlphaZero-style Leela Chess Zero program under different settings, including different neural-network snapshots, and compared the raw network policy against MCTS under different simulation budgets. Detailed findings can be found in this repository.
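The core measurement described above — how often a player's chosen move agrees with tablebase-perfect play — can be sketched as follows. This is a minimal illustration, not the project's actual code: `optimal_moves`, `accuracy`, and the toy WDL data are hypothetical stand-ins for real Syzygy tablebase probes (which a real implementation might obtain via python-chess's `chess.syzygy` module).

```python
# Sketch: score a player's move choices against tablebase-optimal moves.
# WDL values and position keys below are toy data, not real probes.

def optimal_moves(wdl_after_move):
    """Return the set of moves preserving the best WDL outcome.

    `wdl_after_move` maps each legal move to the WDL value of the
    resulting position from the *opponent's* perspective, so the mover
    wants to minimise it (-1 = opponent loses, 0 = draw, 1 = opponent wins).
    """
    best = min(wdl_after_move.values())
    return {m for m, w in wdl_after_move.items() if w == best}

def accuracy(positions, chosen):
    """Fraction of positions where the chosen move is tablebase-optimal."""
    hits = sum(1 for pos, wdl in positions.items()
               if chosen[pos] in optimal_moves(wdl))
    return hits / len(positions)

# Toy example: two positions with hypothetical per-move WDL values.
positions = {
    "pos1": {"Kb1": -1, "Kb2": 0},  # only Kb1 wins for the mover
    "pos2": {"Qa8": 0, "Qh1": 0},   # both moves draw; either is optimal
}
chosen = {"pos1": "Kb2", "pos2": "Qh1"}
print(accuracy(positions, chosen))  # 1 of 2 positions matched -> 0.5
```

The same scoring loop can be run once with the raw network policy's top move and once with the move selected by MCTS at each simulation budget, giving the per-setting accuracies the abstract refers to.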
Rejwana/Analyze_Lc0_move_selection
About
In this project, we evaluated how far modern AlphaZero-type algorithms are from a perfect solution.