This project predicts diamond prices with a machine learning model. It covers data ingestion, transformation, model training, and prediction, and is implemented in Python with Flask.
The dataset consists of 10 independent variables (including id) and one target variable, price:
- id: Unique identifier for each diamond.
- carat: Carat (ct.) is the unit of weight measurement used exclusively for gemstones and diamonds.
- cut: Quality of the diamond cut.
- color: Color of the diamond.
- clarity: A measure of the purity and rarity of the stone, graded by visibility under 10-power magnification.
- depth: Height (in millimeters) measured from the culet (bottom tip) to the table (flat top surface).
- table: The facet of the diamond visible when viewed face up.
- x: Diamond X dimension.
- y: Diamond Y dimension.
- z: Diamond Z dimension.
- price: Price of the given diamond.
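To illustrate how the columns above divide into model inputs and target, here is a minimal Python sketch that parses one record, drops `id`, and separates `price`. It assumes `cut`, `color`, and `clarity` are stored as text labels and the remaining inputs as numbers. The sample row is made up for illustration, and the helper name `split_features_target` is hypothetical, not part of the project.

```python
import csv
import io

# Column roles based on the list above (assumption: cut, color, clarity are
# text labels; the other inputs are numeric). `id` is left out of the features.
CATEGORICAL = ["cut", "color", "clarity"]
NUMERIC = ["carat", "depth", "table", "x", "y", "z"]
TARGET = "price"

# Made-up sample record for illustration only (not taken from the real dataset).
SAMPLE_CSV = """id,carat,cut,color,clarity,depth,table,x,y,z,price
0,1.52,Premium,F,VS2,62.2,58.0,7.27,7.33,4.55,13619
"""


def split_features_target(row):
    """Return (features, price) for one CSV row given as a dict of strings."""
    features = {name: row[name] for name in CATEGORICAL}
    features.update({name: float(row[name]) for name in NUMERIC})
    return features, float(row[TARGET])


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
features, price = split_features_target(rows[0])
print(features)
print(price)
```

Keeping the column roles in named constants means a schema change only needs to be made in one place.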
DiamondPricePrediction/
├── .dvc
├── .github/workflows/
│ └── main.yaml
├── src/
│ ├── DiamondPricePrediction/
│ │ ├── components/
│ │ │ ├── data_ingestion.py
│ │ │ ├── data_transformation.py
│ │ │ ├── model_evaluation.py
│ │ │ └── model_training.py
│ │ ├── pipelines/
│ │ │ ├── prediction_pipeline.py
│ │ │ └── training_pipeline.py
│ │ ├── utils/
│ │ │ └── utils.py
│ │ ├── exception.py
│ │ └── logger.py
├── templates/
│ ├── index.html
│ ├── predict.html
│ └── result.html
├── app.py
├── Dockerfile
├── dvc.yaml
├── init_setup.sh
├── README.md
├── requirements.txt
├── setup.py
└── template.py
- Clone the repository:
    git clone https://github.com/kunalshelke90/DiamondPricePrediction.git
    cd DiamondPricePrediction
- Create a virtual environment and install dependencies. On a Linux-based terminal, run the setup script:
    bash init_setup.sh
  Alternatively, create the environment manually with conda:
    conda create -p ./Diamond_Price python=3.8 -y
    conda activate ./Diamond_Price
    pip install -r requirements.txt
- Set up environment variables. Create a .env file in the project root and add the following variables:
    DAGSHUB_REPO_OWNER="owner_name"
    DAGSHUB_REPO_NAME="Repo_name"
    DAGSHUB_MLFLOW="True"
    MLFLOW_REGISTRY_URI="https://dagshub.com/repo_owner/repo_name.mlflow"
    AWS_ACCESS_KEY_ID=
    AWS_SECRET_ACCESS_KEY=
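As a rough illustration of how these variables might be checked at startup, the following Python sketch reads them from a mapping and fails early if one is missing. It assumes the values have already been exported into the process environment (for example by a .env loader); how the project itself loads them is not described here. The function name `load_settings` and the choice of which variables count as required are assumptions made for this example.

```python
import os

# Assumption for this sketch: these three must be set; DAGSHUB_MLFLOW is optional.
REQUIRED = ["DAGSHUB_REPO_OWNER", "DAGSHUB_REPO_NAME", "MLFLOW_REGISTRY_URI"]


def load_settings(env=None):
    """Collect tracking settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "repo_owner": env["DAGSHUB_REPO_OWNER"],
        "repo_name": env["DAGSHUB_REPO_NAME"],
        # The .env file stores the flag as the string "True".
        "use_dagshub_mlflow": env.get("DAGSHUB_MLFLOW", "False").lower() == "true",
        "registry_uri": env["MLFLOW_REGISTRY_URI"],
    }


# Placeholder values copied from the .env template above.
example_env = {
    "DAGSHUB_REPO_OWNER": "owner_name",
    "DAGSHUB_REPO_NAME": "Repo_name",
    "DAGSHUB_MLFLOW": "True",
    "MLFLOW_REGISTRY_URI": "https://dagshub.com/repo_owner/repo_name.mlflow",
}
settings = load_settings(example_env)
print(settings)
```

Passing the environment in as a mapping keeps the check easy to test without touching real credentials.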
- Start the Flask application:
    python app.py
- Access the application: open your web browser and navigate to http://localhost:8080 or http://127.0.0.1:8080.
- Build the Docker image (Docker image names must be lowercase):
    docker build -t diamond .
- Run the Docker container:
    docker run -p 5000:5000 diamond
- Data Ingestion: Customize data_ingestion.py in the src/DiamondPricePrediction/components folder to suit your data source and schema. Modify the connection settings for your Cassandra database and adjust the data loading logic in the project's utils.py.
- Data Transformation: Modify data_transformation.py in the src/DiamondPricePrediction/components folder to apply different scaling methods, feature engineering techniques, or transformations according to your dataset's needs.
- Model Training: Customize model_training.py in the src/DiamondPricePrediction/components folder to experiment with different models, hyperparameters, and evaluation metrics. You can also integrate other ML libraries such as TensorFlow or PyTorch.
- Web Interface: Modify the HTML templates in the templates/ folder to match your preferred UI design. You can add or remove input fields, change styles, and customize the prediction output format.
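When experimenting with different models and evaluation metrics, it can help to score every candidate the same way. The sketch below computes MAE, RMSE, and R² in plain Python and picks the candidate with the highest R². The prices and predictions are made up, and `regression_metrics` is a hypothetical helper, not the project's model_evaluation.py.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE and R^2 for two equal-length lists of numbers."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        # R^2 is undefined when every true value is identical.
        "r2": 1.0 - ss_res / ss_tot if ss_tot else float("nan"),
    }


# Made-up prices and predictions, for illustration only.
actual = [1000.0, 2000.0, 3000.0, 4000.0]
candidates = {
    "model_a": [1100.0, 1900.0, 3200.0, 3800.0],
    "model_b": [1500.0, 1500.0, 3500.0, 3500.0],
}

scores = {name: regression_metrics(actual, preds) for name, preds in candidates.items()}
best = max(scores, key=lambda name: scores[name]["r2"])
print(best, scores[best])
```

Selecting on R² is only one option; ranking by MAE or RMSE instead is a one-line change to the `key` function.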