The project aims to develop a machine learning model capable of predicting the cost per room of a hotel reservation. The model was created and trained with Amazon SageMaker, using training data stored in DynamoDB; after training, the model artifact is stored in Amazon S3. To serve the model, an API was developed in Python with the FastAPI framework that loads the trained model from S3 and performs cost-per-room inference. The service is deployed through AWS Elastic Beanstalk.
Warning
It is imperative for users to deploy their own application on AWS using their own credentials to ensure compliance and security. This ensures that users have full control over their application's environment and data, facilitating customization and enhancing security measures.
The dataset used is the Hotel Reservations Dataset, which contains various information about thousands of reservations, including the price per room, our target variable for analysis. The dataset is stored in DynamoDB and retrieved within SageMaker.
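Retrieving the training data from DynamoDB inside SageMaker could be sketched as below. The table name and the lazy `boto3` import are assumptions for illustration; DynamoDB `scan` results are paginated, so the loop follows `LastEvaluatedKey` until the table is exhausted.

```python
import pandas as pd


def items_to_dataframe(items: list[dict]) -> pd.DataFrame:
    """Convert a list of DynamoDB items (plain dicts) into a DataFrame."""
    return pd.DataFrame(items)


def load_reservations(table_name: str) -> pd.DataFrame:
    """Scan a DynamoDB table and return all of its items as a DataFrame.

    boto3 is imported lazily here so the pure helper above can be used
    without AWS credentials configured.
    """
    import boto3  # requires AWS credentials in the environment

    table = boto3.resource("dynamodb").Table(table_name)
    response = table.scan()
    items = response["Items"]
    # Scan responses are paginated; keep going until LastEvaluatedKey is gone.
    while "LastEvaluatedKey" in response:
        response = table.scan(ExclusiveStartKey=response["LastEvaluatedKey"])
        items.extend(response["Items"])
    return items_to_dataframe(items)
```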
Important
The data preprocessing included adding a new label column that classifies the price into three numerical categories (price ranges). The original price column is then removed. Additionally, exploratory data analysis was conducted to identify the most relevant correlations for training.
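The labeling step could look like the sketch below. The price boundaries and the column names are hypothetical; the actual cut-offs used in the project are not shown in this document.

```python
import pandas as pd

# Hypothetical price boundaries -- the real thresholds used by the
# project are not documented here.
LOW_MAX = 85.0
MID_MAX = 115.0


def label_price(price: float) -> int:
    """Map a room price to one of three numerical categories (0, 1, 2)."""
    if price <= LOW_MAX:
        return 0
    if price <= MID_MAX:
        return 1
    return 2


def add_label_column(df: pd.DataFrame) -> pd.DataFrame:
    """Add the label column, then drop the original price column."""
    df = df.copy()
    df["label_avg_price_per_room"] = df["avg_price_per_room"].apply(label_price)
    return df.drop(columns=["avg_price_per_room"])
```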
Note
KNN, Linear Learner (logistic regression), and XGBoost models were tested using both the original dataset and a reduced dataset containing only the selected relevant columns.
The results with the selected dataset were:
| Model | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|
| KNN | | | | |
| Linear Learner | | | | |
| XGBoost | | | | |
| XGBoost Oversampling | | | | |
| XGBoost Undersampling | | | | |
The results with the original dataset were:
| Model | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|
| Linear Learner | | | | |
| XGBoost | | | | |
| XGBoost Oversampling | | | | |
| XGBoost Undersampling | | | | |
Based on the results, it was observed that the XGBoost model demonstrated the best performance with both datasets. Strategies such as oversampling, undersampling, and hyperparameter tuning were applied to enhance the model. It was concluded that the optimal performance of this model was achieved using the original dataset with oversampling, resulting in an accuracy of
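The oversampling strategy mentioned above could be implemented along these lines. This is a naive random-oversampling sketch in NumPy (the project may well have used a dedicated library such as imbalanced-learn instead); it duplicates minority-class rows at random until every class matches the majority class count.

```python
import numpy as np


def random_oversample(X: np.ndarray, y: np.ndarray, seed: int = 42):
    """Naive random oversampling: resample minority classes with
    replacement until all classes have the majority class count."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [], []
    for cls, count in zip(classes, counts):
        idx = np.where(y == cls)[0]
        # Draw extra indices with replacement to reach the majority count.
        extra = rng.choice(idx, size=target - count, replace=True)
        keep = np.concatenate([idx, extra])
        X_parts.append(X[keep])
        y_parts.append(y[keep])
    return np.concatenate(X_parts), np.concatenate(y_parts)
```

Undersampling works the same way in reverse: majority classes are sampled down (without replacement) to the minority class count.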
The Hotel Reservation Prediction API was developed with the FastAPI framework, chosen for its performance and its automatically generated Swagger UI. Through Swagger, users can easily submit reservation details such as the number of adults, children, nights of stay, and lead time. The API returns the predicted class for the reservation, indicating the corresponding price range.
Method | Endpoint | Description |
---|---|---|
POST | /api/v1/predict | Submits data for prediction |
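A client call to the endpoint above might look like the following sketch. The field names are illustrative (consult the Swagger UI of the deployed service for the exact schema), and the host placeholder is intentionally left generic.

```python
import json

# Illustrative reservation details -- the exact field names are
# defined by the deployed service's schema.
payload = {
    "no_of_adults": 2,
    "no_of_children": 1,
    "no_of_week_nights": 3,
    "no_of_weekend_nights": 1,
    "lead_time": 45,
}


def build_request_body(data: dict) -> str:
    """Serialize the reservation details for the POST body."""
    return json.dumps(data)


# Calling the deployed API (hypothetical Elastic Beanstalk host):
#   import requests
#   r = requests.post("http://<eb-host>/api/v1/predict",
#                     data=build_request_body(payload),
#                     headers={"Content-Type": "application/json"})
#   print(r.json())  # predicted price-range class
```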
The inference process comprises several sequential steps:

1. Incoming parameters are received and validated with Pydantic, a Python data-validation library.
2. Categorical parameters are converted into a binary numerical format to ensure compatibility with the model.
3. The XGBoost model runs the prediction on the transformed input data.
4. Both the input parameters and the resulting prediction are recorded in the DynamoDB table, enabling traceability and further analysis.
5. The API response returns the predicted class determined by the model, completing the inference process.
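The validation and encoding steps could be sketched as below. The schema fields, the categorical vocabulary, and the stub predictor are all illustrative assumptions; the real service calls the XGBoost model loaded from S3 and writes the record to DynamoDB.

```python
from pydantic import BaseModel


class ReservationIn(BaseModel):
    """Incoming parameters (illustrative subset of the real schema)."""
    no_of_adults: int
    no_of_children: int
    lead_time: int
    type_of_meal_plan: str


# Hypothetical categorical vocabulary; categorical fields are one-hot
# encoded into a binary numerical format for the model.
MEAL_PLANS = ["Meal Plan 1", "Meal Plan 2", "Not Selected"]


def encode(reservation: ReservationIn) -> list[int]:
    """Turn the validated input into a numeric feature vector."""
    one_hot = [int(reservation.type_of_meal_plan == m) for m in MEAL_PLANS]
    return [reservation.no_of_adults,
            reservation.no_of_children,
            reservation.lead_time] + one_hot


def predict_class(features: list[int]) -> int:
    """Stub for the model call; the real service invokes the XGBoost
    model and then logs input + prediction to DynamoDB."""
    return 0  # placeholder class
```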
Caution
Credentials should remain local to your environment only. Never expose your credentials in any part of the code, such as in source files, comments, or commit history. Instead, use environment variables or secure secret management tools to manage and access your credentials securely.
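One common way to follow this advice is to read credentials from environment variables and fail loudly when they are missing. The helper below is a minimal sketch; the commented `boto3.Session` usage assumes the standard AWS SDK variable names.

```python
import os


def get_required_env(name: str) -> str:
    """Read a credential from the environment, failing loudly if unset."""
    value = os.environ.get(name)
    if value is None:
        raise RuntimeError(f"{name} is not set; export it before starting the app")
    return value


# Example with the standard AWS SDK variable names:
#   import boto3
#   session = boto3.Session(
#       aws_access_key_id=get_required_env("AWS_ACCESS_KEY_ID"),
#       aws_secret_access_key=get_required_env("AWS_SECRET_ACCESS_KEY"),
#   )
```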
Giovane Iwamoto | Gustavo Serra | Isabela Buzzo | Leandro Pereira
Giovane Hashinokuti Iwamoto - Computer Science student at UFMS - Brazil - MS
I am always open to receiving constructive criticism and suggestions for improvement in my developed code. I believe that feedback is an essential part of the learning and growth process, and I am eager to learn from others and make my code the best it can be. Whether it's a minor tweak or a major overhaul, I am willing to consider all suggestions and implement the changes that will benefit my code and its users.