This project implements several graph neural network models for link prediction on the weighted, directed point-differential graphs of the 2013-2019 and 2021 NBA and NCAA* basketball seasons. Open src/models.py to select a year and day range for testing or to adjust hyperparameters, then run src/models.py to train and test a model. Predictions from each model for the 2021 season are posted in predictions. Each printed prediction is the predicted point differential (Home Score - Away Score).
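To make the idea of a weighted, directed point-differential graph concrete, here is a minimal sketch in plain Python. It is not taken from src/models.py: the function name `build_point_diff_graph`, the edge direction (loser to winner, as is common in PageRank-style rankings), and the toy scores are all assumptions made for illustration.

```python
from collections import defaultdict


def build_point_diff_graph(games):
    """Return {source: {target: weight}} from (home, away, home_pts, away_pts) tuples.

    Assumption (not confirmed by this README): each edge points from the
    losing team to the winning team and accumulates the margin of victory.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for home, away, home_pts, away_pts in games:
        diff = home_pts - away_pts  # same sign convention as the printed prediction
        if diff > 0:
            graph[away][home] += diff   # home team won
        elif diff < 0:
            graph[home][away] += -diff  # away team won
    return {src: dict(targets) for src, targets in graph.items()}


# Made-up scores, for illustration only.
games = [
    ("BOS", "NYK", 110, 102),
    ("NYK", "MIA", 95, 99),
    ("MIA", "BOS", 101, 108),
]
graph = build_point_diff_graph(games)
```

Under these assumptions, predicting a game's point differential amounts to predicting the weight of an edge that is not yet in the graph, which is why the task is framed as link prediction.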

The inputs to the models are graphs representing the state of the season on a given day. The Offense/Defense graph has NBA offenses and defenses as nodes, with edges representing interactions between them. The Vegas graph has teams as nodes, and its weighted, directed edges represent Vegas point spreads. The edge weights in the Offense/Defense graph are computed from Four Factors statistics, taken from Basketball Reference game box scores via sportsipy, as in Ranking NCAA Basketball Teams Using the Google PageRank Algorithm (2015, Stanek, Taylor). As a preprocessing step, all graphs are row-normalized and the Oracle Adjustment is applied, as described in An Oracle Method to Predict NFL Games (2012, Balreira, Miceli, Tegtmeyer), to enhance the random walks on the graphs.
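The preprocessing step can be sketched roughly as follows. This is a simplified illustration, not the repository's code: the helper names, the uniform oracle weight, and the order of the two operations (oracle node first, then row normalization) are assumptions, and the paper's actual oracle weighting may differ.

```python
def add_oracle_node(adj, weight=1.0):
    """Append an extra 'oracle' node linked to and from every existing node.

    Assumption: a uniform weight in both directions, chosen for simplicity.
    """
    n = len(adj)
    extended = [row[:] + [weight] for row in adj]  # every node -> oracle
    extended.append([weight] * n + [0.0])          # oracle -> every node
    return extended


def row_normalize(adj):
    """Scale each row to sum to 1 so it reads as random-walk transition probabilities."""
    normalized = []
    for row in adj:
        total = sum(row)
        normalized.append([w / total for w in row] if total > 0 else row[:])
    return normalized


# Toy weighted adjacency matrix; node 1 has no outgoing edges.
adj = [
    [0.0, 8.0, 4.0],
    [0.0, 0.0, 0.0],
    [0.0, 7.0, 0.0],
]
transition = row_normalize(add_oracle_node(adj))
```

In this sketch the oracle node gives node 1 an outgoing edge, so a random walk can no longer get stuck there, and every row of `transition` sums to 1.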

Next, node2vec (2016, Grover, Leskovec) is applied to the graphs to compute a feature representation of every offense and defense node and of every team in the Vegas graph. The graphs, along with the node2vec representations, are then passed to one of four graph convolutional layers, described in Diffusion Convolutional Neural Network (2016, Atwood, Towsley), Design Space for Graph Neural Networks (2020, Leskovec, Ying, You), Graph Neural Networks with Convolutional ARMA Filters (2021, Bianchi, Grattarola, Livi, Alippi), and How Powerful are Graph Neural Networks? (2019, Hu, Leskovec, Jegelka, Xu). These layers are implemented using spektral.
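For readers unfamiliar with node2vec, the sketch below shows the biased second-order random walk at its core, in plain Python. It is an illustration only: the function name, the toy graph, and the choice to treat an edge in either direction as "distance 1" are assumptions, and the repository may rely on a library implementation instead.

```python
import random


def node2vec_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Sample one node2vec-style walk on a weighted directed graph.

    graph maps node -> {neighbor: edge_weight}. p is the return parameter
    and q is the in-out parameter, as in the node2vec paper.
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        neighbors = graph.get(cur, {})
        if not neighbors:
            break  # dangling node: the walk cannot continue
        nodes = list(neighbors)
        if len(walk) == 1:
            # First step: plain weighted choice, no previous node to bias on.
            weights = [neighbors[n] for n in nodes]
        else:
            prev = walk[-2]
            weights = []
            for n in nodes:
                if n == prev:
                    bias = 1.0 / p  # step back to the previous node
                elif n in graph.get(prev, {}) or prev in graph.get(n, {}):
                    bias = 1.0      # candidate is adjacent to the previous node
                else:
                    bias = 1.0 / q  # candidate moves farther away
                weights.append(neighbors[n] * bias)
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk


# Hypothetical toy graph with made-up weights.
toy_graph = {
    "A": {"B": 1.0, "C": 2.0},
    "B": {"C": 1.0, "A": 0.5},
    "C": {"A": 1.0},
}
walk = node2vec_walk(toy_graph, "A", 6, p=0.5, q=2.0, rng=random.Random(0))
```

In node2vec, many such walks are sampled from every node and then fed to a skip-gram model, which learns the embedding vector for each node.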

For a given game, the new representations of both teams' offenses and defenses, along with both teams' representations in the Vegas graph, are passed to a regression neural network that predicts the game's score differential. The model is tested over the selected year and day range, and its win percentages against the spread and against the moneyline are printed along with its mean squared error (MSE) for the games in the testing range.
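The three reported metrics could be computed along the following lines. This is a hedged sketch rather than the project's evaluation code: the function name `evaluate`, the convention that every value is a (Home Score - Away Score) margin, the exclusion of pushes, and the reading of "moneyline" as simply picking the winner are all assumptions.

```python
def evaluate(predicted, actual, vegas):
    """Score predictions; every list holds (home score - away score) margins.

    vegas is assumed to be the market's expected home margin for each game.
    """
    n = len(actual)
    mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n

    ats_wins = ats_bets = 0
    ml_wins = ml_bets = 0
    for p, a, v in zip(predicted, actual, vegas):
        # Against the spread: did the model pick the side that covered?
        if p != v and a != v:  # skip games with no pick and pushes
            ats_bets += 1
            ats_wins += (p > v) == (a > v)
        # Moneyline: did the model pick the outright winner?
        if p != 0 and a != 0:
            ml_bets += 1
            ml_wins += (p > 0) == (a > 0)

    return {
        "mse": mse,
        "ats_pct": ats_wins / ats_bets if ats_bets else None,
        "moneyline_pct": ml_wins / ml_bets if ml_bets else None,
    }


# Made-up numbers, for illustration only.
predicted = [5.0, -3.0, 2.0, -1.0]
actual = [7, -10, -4, 3]
vegas = [6.5, -2.5, 3.5, -4.0]
metrics = evaluate(predicted, actual, vegas)
```

Note that a model can beat the spread while picking the wrong winner, as in the third and fourth toy games here, so the two percentages measure different things.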

Set up the environment using deepnba.yml:

conda env create -f deepnba.yml

*Vegas lines are not used in the NCAA models. To update the data, use this fork of sportsipy: https://github.com/joewilaj/sportsipy