In this project we evaluated how far modern AlphaZero-type algorithms are from a perfect solution.


Analyze_Lc0_move_selection

From zero human knowledge, AlphaZero learns to play several strategy games at a superhuman level. While the AlphaZero algorithm was demonstrated on board games, its applications may extend beyond them. For applications such as drug design, where even a small deviation from the exact result can be very costly, we need exact solutions. However, it is still an open question whether AlphaZero-like algorithms can be applied to find exact solutions. To answer this question, we have to investigate how these modern programs learn to play, especially in settings where exact solutions are available for comparison. In this project, we investigated the gap between a strong AlphaZero-style player and perfect play using chess endgame tablebases. First, we evaluated the perfect-play prediction accuracy of the AlphaZero-style Leela Chess Zero (Lc0) program under different settings, including different neural network snapshots, and compared the raw network policy against MCTS with different simulation budgets. Detailed findings can be found here.
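One natural way to grade an engine's move against a tablebase is to count it as correct iff it preserves the best achievable win/draw/loss (WDL) outcome. The sketch below illustrates that rule on a toy solved game (a two-player subtraction game) rather than chess; `probe_wdl` here is a hypothetical stand-in for a real tablebase probe such as python-chess's `chess.syzygy.Tablebase.probe_wdl`, and it reuses the Syzygy-style convention of +2 = win, 0 = draw, -2 = loss from the side to move's perspective:

```python
from functools import lru_cache

# Toy solved game standing in for a chess endgame tablebase:
# from n counters, the player to move removes 1 or 2, and whoever
# takes the last counter wins. Positions with n % 3 == 0 are lost
# for the side to move.

@lru_cache(maxsize=None)
def probe_wdl(n: int) -> int:
    """WDL from the side to move's view: +2 win, -2 loss (no draws here)."""
    if n == 0:
        return -2  # no counters left: the side to move has already lost
    # A position is won iff some move reaches a position lost for the opponent.
    return 2 if any(probe_wdl(n - k) == -2 for k in (1, 2) if k <= n) else -2

def optimal_moves(n: int) -> set[int]:
    """Moves that preserve the best achievable WDL outcome -- the same
    rule one can use to grade an engine's move against a tablebase."""
    # After our move the opponent is to move, so negate their WDL.
    scores = {k: -probe_wdl(n - k) for k in (1, 2) if k <= n}
    best = max(scores.values())
    return {k for k, s in scores.items() if s == best}

print(optimal_moves(4))  # only taking 1 (leaving 3) keeps the win: {1}
print(optimal_moves(3))  # lost either way, so both moves are "optimal": {1, 2}
```

With a real tablebase, prediction accuracy is then simply the fraction of test positions in which the move chosen by the network policy (or by MCTS) falls inside this optimal set.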