Data Scraping, Database Modelling & Data Storing of NBA2K24 Players Data
A project assignment for 2023 Database Lab Assistant Candidate Selection at Bandung Institute of Technology
- Data Description
- Program Specification
- How to Use
- JSON Structure
- Database Schema
- ERD to Relational Model Translation
- Author
This project involves scraping data from 2kratings, an unofficial website that provides NBA2K24 data. The scraped data covers every NBA2K24 player's name, team, position, height, overall rating, three-point rating, and dunk rating.
I really love sports, especially basketball. NBA 2K gives me the experience of playing in one of the best, if not the best, basketball leagues in the world. This data is particularly interesting to me because it allows me to explore and analyze the virtual representation of NBA players within the game.
As an avid basketball fan, I find it fascinating to compare the in-game ratings of players with their real-life performances and skills. The NBA 2K series is renowned for its attention to detail, aiming to replicate the strengths and weaknesses of each player accurately. Scrutinizing the ratings of players' three-point shooting and dunking abilities, among other attributes, can provide insights into how the game developers perceive the skills and attributes of different NBA players.
Moreover, having access to this data enables me to perform various analyses and visualizations. For instance, I can create visual representations of player ratings across teams or positions, identify the highest-rated players in the game, or even track changes in ratings through different game updates. This data scraping project will not only satisfy my curiosity but also allow me to delve deeper into the intricacies of NBA 2K and gain a deeper understanding of the game's mechanics and player rankings.
Overall, scraping the player data from 2kratings presents an exciting opportunity to combine my passion for basketball with my interest in data analysis. It allows me to explore the virtual realm of NBA 2K24 and gain valuable insights into the game's player ratings, contributing to a richer understanding of the sport I love.
The program scrapes the data from the 2kratings website and stores it in a MariaDB database. It is written in Python 3.9.6 and uses several libraries. The scraper visits every team's page, whose URL is the base URL https://www.2kratings.com/teams/ followed by the corresponding team's name. For example, the scraper reads the Los Angeles Lakers data from https://www.2kratings.com/teams/los-angeles-lakers. The scraper script is scraper.py in the Data Scraping/src directory. After the data is scraped, it is stored in a JSON file named database.json in the Data Scraping/data directory.
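The URL scheme described above can be sketched as follows. This is a minimal illustration, not the code in scraper.py: the helper name team_url and the rule for turning a team name into a URL slug (lowercase, hyphen-separated) are assumptions inferred from the Los Angeles Lakers example.

```python
import re

# Base URL quoted in the description above.
BASE_URL = "https://www.2kratings.com/teams/"


def team_url(team_name: str) -> str:
    """Build a team page URL from a team's full name (hypothetical helper)."""
    # Assumption: the slug is the lowercased name with every run of
    # non-alphanumeric characters replaced by a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", team_name.lower()).strip("-")
    return BASE_URL + slug


print(team_url("Los Angeles Lakers"))
# prints https://www.2kratings.com/teams/los-angeles-lakers
```

Teams whose page slug does not follow this pattern would need an explicit name-to-slug mapping instead.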
The data storing script is storing.py in the Data Storing/src directory. Before running it, the database must be created. The database schema is represented by an ERD, which has been translated into a relational model (see Database Schema). Once the database exists, the script can be run: it connects to the database, creates all tables, and then stores the data from the JSON file in the MariaDB database.
Prerequisites
Libraries Used
To run this project, add the following environment variables to your .env file:
# DB Details
DB_USER=root # your username
DB_PASS= # your password
DB_HOST=localhost # your host (to run locally use localhost)
DB_PORT=3306 # your port (default: 3306)
DB_NAME=nba2k24_db # your database name
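As an illustration of how these variables might be consumed, the sketch below collects them into a single settings dictionary. It is not taken from storing.py: the function name load_db_config and the dictionary keys are assumptions, and the fallback values simply mirror the sample .env above. It reads from os.environ, so it assumes the .env file has already been loaded into the environment by whatever mechanism the project uses.

```python
import os


def load_db_config(env=os.environ) -> dict:
    """Collect the DB_* variables into one dict (hypothetical helper)."""
    return {
        "user": env.get("DB_USER", "root"),
        "password": env.get("DB_PASS", ""),
        "host": env.get("DB_HOST", "localhost"),
        # Environment values are strings, so the port is converted explicitly.
        "port": int(env.get("DB_PORT", "3306")),
        "database": env.get("DB_NAME", "nba2k24_db"),
    }


config = load_db_config({"DB_USER": "root", "DB_PORT": "3306"})
print(config["host"], config["port"])
```

Passing a plain dictionary, as in the last lines, makes the helper easy to check without touching the real environment.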
- Clone this repository (first-time use only)
git clone https://github.com/kennypanjaitan/Seleksi-2023-Tugas-1.git
- Install all the required libraries
pip install -r requirements.txt
- Run the data scraper's script
python 'Data Scraping/src/scraper.py'
- Connect to your MariaDB server and create a new database
CREATE DATABASE nba2k24_db;
USE nba2k24_db;
- Run the data storing's script
python 'Data Storing/src/storing.py'
The scraper script generates a JSON file named database.json in the Data Scraping/data directory. The JSON file has the following structure:
{
// List of players
"players": [
{
"name": Player's full name,
"teams": Player's team,
"position": Player's primary and secondary position (if any),
"heightFeet": Player's height in feet,
"heightInches": Player's height in inches,
"overall": Player's overall rating,
"three": Player's three point rating,
"dunk": Player's dunk rating,
}
],
// List of teams
"teams": [
{
"teamID": Team's ID by abbreviation,
"teamName": Team's full name,
"homeBase": Team's home base (city),
}
],
// List of positions
"positions": [
{
"posName": Position's full name,
"posAbbr": Position's abbreviation,
}
]
}
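A quick way to check a generated file against this structure is sketched below. The helper find_missing_keys is hypothetical, the sample record uses placeholder values rather than real ratings, and the value types (strings for names, integers for heights and ratings) are assumptions, since the structure above only describes each field in words.

```python
import json

# Field names copied from the structure above.
REQUIRED_KEYS = {
    "players": {"name", "teams", "position", "heightFeet",
                "heightInches", "overall", "three", "dunk"},
    "teams": {"teamID", "teamName", "homeBase"},
    "positions": {"posName", "posAbbr"},
}


def find_missing_keys(data: dict) -> list:
    """Return (section, index, missing_keys) for every incomplete record."""
    problems = []
    for section, required in REQUIRED_KEYS.items():
        for index, record in enumerate(data.get(section, [])):
            missing = required - set(record)
            if missing:
                problems.append((section, index, sorted(missing)))
    return problems


# Placeholder data for illustration only; these are not real ratings.
sample = json.loads("""
{
  "players": [
    {"name": "Example Player", "teams": "LAL", "position": "PG",
     "heightFeet": 6, "heightInches": 5,
     "overall": 80, "three": 75, "dunk": 70}
  ],
  "teams": [
    {"teamID": "LAL", "teamName": "Los Angeles Lakers",
     "homeBase": "Los Angeles"}
  ],
  "positions": [
    {"posName": "Point Guard", "posAbbr": "PG"}
  ]
}
""")

print(find_missing_keys(sample))  # prints []
```

To check the real file, replace the sample with the result of json.load on Data Scraping/data/database.json.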
The database schema is represented by the following Entity-Relationship Diagram (ERD).
Additionally, the ERD is translated into a relational model, which gives a structured representation of the database tables and their corresponding attributes.
The translation of the ERD into the relational model is explained as follows:
- The entity Team is translated into a table named teams. The table has the following attributes:
  - teamID as the primary key (using the team's abbreviation)
  - teamName as the team's full name
  - homeBase as the team's home base (city)
- The entity Position is translated into a table named positions. The table has the following attributes:
  - posID as the primary key (using the position's abbreviation)
  - posName as the position's full name
- The entity Player is translated into a table named players. The table has the following attributes:
  - playerID as the primary key (using a generated UUID)
  - playerName as the player's full name
  - teamID as the foreign key referencing the teamID attribute in the teams table
  - heightFeet as the player's height in feet
  - heightInches as the player's height in inches
  - overall as the player's overall rating
  - three as the player's three point rating
  - dunk as the player's dunk rating
- The relationship between Player and Position, in which a player can have multiple positions, is translated into a table named playerPositions. The table has the following attributes:
  - playerID as the foreign key referencing the playerID attribute in the players table
  - posID as the foreign key referencing the posID attribute in the positions table
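The relational model above can be tried out with the sketch below. It is not the schema created by storing.py: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MariaDB, and the column types, the composite primary key on playerPositions, and the helper add_player are all assumptions. The inserted player is a placeholder, not real data.

```python
import sqlite3
import uuid

# Table and column names follow the relational model above;
# the column types are assumptions.
SCHEMA = """
CREATE TABLE teams (
    teamID   TEXT PRIMARY KEY,
    teamName TEXT NOT NULL,
    homeBase TEXT
);
CREATE TABLE positions (
    posID   TEXT PRIMARY KEY,
    posName TEXT NOT NULL
);
CREATE TABLE players (
    playerID     TEXT PRIMARY KEY,
    playerName   TEXT NOT NULL,
    teamID       TEXT REFERENCES teams(teamID),
    heightFeet   INTEGER,
    heightInches INTEGER,
    overall      INTEGER,
    three        INTEGER,
    dunk         INTEGER
);
CREATE TABLE playerPositions (
    playerID TEXT REFERENCES players(playerID),
    posID    TEXT REFERENCES positions(posID),
    PRIMARY KEY (playerID, posID)  -- assumption: one row per player/position pair
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create the four tables in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
    conn.executescript(SCHEMA)
    return conn


def add_player(conn, name, team_id, pos_ids, height, overall, three, dunk):
    """Insert one player and one playerPositions row per position."""
    player_id = str(uuid.uuid4())  # generated UUID as the primary key
    feet, inches = height
    conn.execute(
        "INSERT INTO players VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (player_id, name, team_id, feet, inches, overall, three, dunk),
    )
    conn.executemany(
        "INSERT INTO playerPositions VALUES (?, ?)",
        [(player_id, pos_id) for pos_id in pos_ids],
    )
    return player_id


conn = build_demo_db()
conn.execute("INSERT INTO teams VALUES ('LAL', 'Los Angeles Lakers', 'Los Angeles')")
conn.executemany(
    "INSERT INTO positions VALUES (?, ?)",
    [("PG", "Point Guard"), ("SG", "Shooting Guard")],
)
# Placeholder player with two positions; the numbers are not real ratings.
pid = add_player(conn, "Example Player", "LAL", ["PG", "SG"], (6, 5), 80, 75, 70)
rows = conn.execute(
    "SELECT posID FROM playerPositions WHERE playerID = ? ORDER BY posID", (pid,)
).fetchall()
print(rows)  # prints [('PG',), ('SG',)]
```

Keeping positions in a separate playerPositions table is what lets one player row map to both a primary and a secondary position without repeating the player's other attributes.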
The player's colour type view is created to make it easier to identify the player rating's category.
Kenny Benaya Nathan - 13521023