kennypanjaitan/NBA2K24-data-scraper

 
 


Data Scraping, Database Modelling & Data Storing of NBA2K24 Players Data

A project assignment for 2023 Database Lab Assistant Candidate Selection at Bandung Institute of Technology

Data Description

This project scrapes data from 2kratings, an unofficial website that provides NBA2K24 data. The scraped fields are every NBA2K24 player's name, team, position, height, overall rating, three-point rating, and dunk rating.

I really love sports, especially basketball. NBA 2K gives me the experience of playing in one of the best, if not the best, basketball leagues in the world. This data is particularly interesting to me because it allows me to explore and analyze the virtual representation of NBA players within the game.

As an avid basketball fan, I find it fascinating to compare the in-game ratings of players with their real-life performances and skills. The NBA 2K series is renowned for its attention to detail, aiming to replicate the strengths and weaknesses of each player accurately. Scrutinizing the ratings of players' three-point shooting and dunking abilities, among other attributes, can provide insights into how the game developers perceive the skills and attributes of different NBA players.

Moreover, having access to this data enables me to perform various analyses and visualizations. For instance, I can create visual representations of player ratings across teams or positions, identify the highest-rated players in the game, or even track changes in ratings through different game updates. This data scraping project will not only satisfy my curiosity but also let me delve into the intricacies of NBA 2K and better understand the game's mechanics and player rankings.

Overall, scraping the player data from 2kratings presents an exciting opportunity to combine my passion for basketball with my interest in data analysis. It allows me to explore the virtual realm of NBA 2K24 and gain valuable insights into the game's player ratings, contributing to a richer understanding of the sport I love.

Program Specification

The program scrapes the data from 2kratings and stores it in a MariaDB database. It is written in Python 3.9.6 and relies on a few libraries. The scraper visits every team's page, whose URL is the base URL https://www.2kratings.com/teams/ followed by the corresponding team's name. For example, the Los Angeles Lakers page is https://www.2kratings.com/teams/los-angeles-lakers. The scraper script is scraper.py in the Data Scraping/src directory. Once the data is scraped, it is saved to a JSON file named database.json in the Data Scraping/data directory.
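As a minimal sketch of the URL scheme described above, the snippet below builds a team page URL from a team's full name. The helper name team_url and the slug rule (lowercase, hyphen-separated) are assumptions inferred from the single Los Angeles Lakers example; the actual scraper.py may build its URLs differently.

```python
import re

# Base URL taken from the program specification above.
BASE_URL = "https://www.2kratings.com/teams/"


def team_url(team_name: str) -> str:
    """Build a team page URL from a team's full name (hypothetical helper)."""
    # Assumed slug rule: lowercase the name, then collapse every run of
    # non-alphanumeric characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", team_name.lower()).strip("-")
    return BASE_URL + slug


if __name__ == "__main__":
    # Prints https://www.2kratings.com/teams/los-angeles-lakers
    print(team_url("Los Angeles Lakers"))
```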

The data storing script is storing.py in the Data Storing/src directory. Before running it, the database must be created first. The database schema is represented by an ERD (see Database Schema) and has been translated into a relational model. Once the database exists, the script can be run: it connects to the database, creates all tables, and then loads the data from the JSON file into the MariaDB database.

How to Use

Prerequisites
Libraries Used

Environment Variables

To run this project, add the following environment variables to your .env file:

# DB Details
DB_USER=root        # your username
DB_PASS=            # your password 
DB_HOST=localhost   # your host (to run locally use localhost)
DB_PORT=3306        # your port (default: 3306)
DB_NAME=nba2k24_db  # your database name
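A rough sketch of how these variables might be turned into connection settings is shown below. It assumes the values from .env have already been loaded into the process environment (for example by a dotenv loader, which is an assumption here). The helper name db_config_from_env is hypothetical, and the dictionary keys follow the keyword names commonly accepted by MariaDB/MySQL Python connectors; check them against the connector you actually use.

```python
import os


def db_config_from_env(env=os.environ) -> dict:
    """Collect the DB_* settings into a dict of connection parameters.

    Hypothetical helper: the fallback values mirror the sample .env above.
    """
    return {
        "user": env.get("DB_USER", "root"),
        "password": env.get("DB_PASS", ""),
        "host": env.get("DB_HOST", "localhost"),
        # Environment values are strings, so the port is converted to int.
        "port": int(env.get("DB_PORT", "3306")),
        "database": env.get("DB_NAME", "nba2k24_db"),
    }


if __name__ == "__main__":
    config = db_config_from_env()
    # Avoid printing the password when inspecting the configuration.
    print({key: value for key, value in config.items() if key != "password"})
```

Passing a plain dict instead of os.environ makes the helper easy to try out without touching real credentials.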

Run Locally

  1. Clone this repository (first-time use only)
git clone https://github.com/kennypanjaitan/Seleksi-2023-Tugas-1.git
  2. Install all the required libraries
pip install -r requirements.txt
  3. Run the data scraper script
python 'Data Scraping/src/scraper.py'
  4. Connect to your MariaDB server and create a new database
CREATE DATABASE nba2k24_db;
USE nba2k24_db;
  5. Run the data storing script
python 'Data Storing/src/storing.py'

JSON Structure

The scraper script generates a JSON file named database.json in the Data Scraping/data directory. The JSON file has the following structure:

{
  // List of players
  "players": [ 
    {
      "name":         Player's full name,
      "teams":        Player's team,
      "position":     Player's primary and secondary position (if any),
      "heightFeet":   Player's height in feet,
      "heightInches": Player's height in inches,
      "overall":      Player's overall rating,
      "three":        Player's three point rating,
      "dunk":         Player's dunk rating,
    }
  ],

  // List of teams
  "teams": [
    {
      "teamID":   Team's ID by abbreviation,
      "teamName": Team's full name,
      "homeBase": Team's home base (city),
    }
  ],

  // List of positions
  "positions": [
    {
      "posName": Position's full name,
      "posAbbr": Position's abbreviation,
    }
  ]
}
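To illustrate the structure above, here is a small sketch that checks whether every record in a loaded database.json carries the documented keys. The function name missing_keys and the sample record are made up for illustration and are not part of the project's scripts.

```python
import json

# Keys documented in the JSON structure above, grouped by top-level list.
REQUIRED_KEYS = {
    "players": {"name", "teams", "position", "heightFeet",
                "heightInches", "overall", "three", "dunk"},
    "teams": {"teamID", "teamName", "homeBase"},
    "positions": {"posName", "posAbbr"},
}


def missing_keys(data: dict) -> list:
    """Return (section, index, key) triples for every absent required key."""
    problems = []
    for section, required in REQUIRED_KEYS.items():
        for index, record in enumerate(data.get(section, [])):
            for key in sorted(required.difference(record)):
                problems.append((section, index, key))
    return problems


if __name__ == "__main__":
    # Placeholder record with invented values, only to exercise the check.
    sample = json.loads(
        '{"players": [], "positions": [],'
        ' "teams": [{"teamID": "XXX", "teamName": "Example Team"}]}'
    )
    print(missing_keys(sample))  # [('teams', 0, 'homeBase')]
```

To run the same check on the real file, replace the sample with json.load on an open handle to Data Scraping/data/database.json.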

Database Schema

The database schema is represented by the following Entity-Relationship Diagram (ERD):

ERD

Additionally, the ERD is translated into a relational model, providing a structured representation of the database tables and their corresponding attributes.

Relational Model

Translation from ERD to Relational Model Explanation

Each entity and relationship in the ERD maps to a table in the relational model, as follows:

  1. The entity Team is translated into a table named teams. The table has the following attributes:
    • teamID as the primary key (using the team's abbreviation)
    • teamName as the team's full name
    • homeBase as the team's home base (city)
  2. The entity Position is translated into a table named positions. The table has the following attributes:
    • posID as the primary key (using the position's abbreviation)
    • posName as the position's full name
  3. The entity Player is translated into a table named players. The table has the following attributes:
    • playerID as the primary key (using a generated UUID)
    • playerName as the player's full name
    • teamID as the foreign key referencing the teamID attribute in the teams table
    • heightFeet as the player's height in feet
    • heightInches as the player's height in inches
    • overall as the player's overall rating
    • three as the player's three point rating
    • dunk as the player's dunk rating
  4. The relationship between Player and Position, in which a player can have multiple positions, is translated into a table named playerPositions. The table has the following attributes:
    • playerID as the foreign key referencing the playerID attribute in the players table
    • posID as the foreign key referencing the posID attribute in the positions table
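The four tables above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual DDL: the column types, lengths, and the composite primary key on playerPositions are guesses, and the demo runs against an in-memory SQLite database from the Python standard library as a stand-in for MariaDB so that it works without a server. The helper names and the sample rows are hypothetical.

```python
import sqlite3
import uuid

# Assumed column types; only the table and column names come from the model above.
SCHEMA = """
CREATE TABLE teams (
    teamID   VARCHAR(3)  PRIMARY KEY,   -- team abbreviation
    teamName VARCHAR(64) NOT NULL,
    homeBase VARCHAR(64)
);
CREATE TABLE positions (
    posID   VARCHAR(2)  PRIMARY KEY,    -- position abbreviation
    posName VARCHAR(32) NOT NULL
);
CREATE TABLE players (
    playerID     CHAR(36) PRIMARY KEY,  -- generated UUID
    playerName   VARCHAR(128) NOT NULL,
    teamID       VARCHAR(3) REFERENCES teams(teamID),
    heightFeet   INTEGER,
    heightInches INTEGER,
    overall      INTEGER,
    three        INTEGER,
    dunk         INTEGER
);
CREATE TABLE playerPositions (
    playerID CHAR(36)   REFERENCES players(playerID),
    posID    VARCHAR(2) REFERENCES positions(posID),
    PRIMARY KEY (playerID, posID)       -- assumption: one row per pair
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create the four tables in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
    conn.executescript(SCHEMA)
    return conn


def add_player(conn, name, team_id, pos_ids, feet, inches, overall, three, dunk):
    """Insert one player plus one playerPositions row per position."""
    player_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO players VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (player_id, name, team_id, feet, inches, overall, three, dunk),
    )
    conn.executemany(
        "INSERT INTO playerPositions VALUES (?, ?)",
        [(player_id, pos_id) for pos_id in pos_ids],
    )
    return player_id


if __name__ == "__main__":
    conn = build_demo_db()
    # Placeholder rows with invented values, not scraped data.
    conn.execute("INSERT INTO teams VALUES ('XXX', 'Example Team', 'Example City')")
    conn.executemany(
        "INSERT INTO positions VALUES (?, ?)",
        [("PG", "Point Guard"), ("SG", "Shooting Guard")],
    )
    pid = add_player(conn, "Example Player", "XXX", ["PG", "SG"], 6, 3, 80, 75, 70)
    print(conn.execute(
        "SELECT posID FROM playerPositions WHERE playerID = ? ORDER BY posID",
        (pid,),
    ).fetchall())
```

Keeping positions in a separate playerPositions table means a player with a primary and a secondary position simply gets two rows, rather than two nullable columns in players.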

Player's Colour Type View

The player's colour type view is created to make it easier to identify each player's rating category.

Player's Colour

Screenshots

Data Scraping

Scraping Data

Data Storing

Player's Position

Players

Positions

Teams

Author

Kenny Benaya Nathan - 13521023
