From Data Science to MLOPs workshop
For this workshop we are going to work with the following dataset:
https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)
Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. n the 3-dimensional space is that described in: [K. P. Bennett and O. L. Mangasarian: "Robust Linear Programming Discrimination of Two Linearly Inseparable Sets", Optimization Methods and Software 1, 1992, 23-34].
- ID number
- Diagnosis (M = malignant, B = benign) 3-32)
Ten real-valued features are computed for each cell nucleus:
a) radius (mean of distances from center to points on the perimeter) b) texture (standard deviation of gray-scale values) c) perimeter d) area e) smoothness (local variation in radius lengths) f) compactness (perimeter^2 / area - 1.0) g) concavity (severity of concave portions of the contour) h) concave points (number of concave portions of the contour) i) symmetry j) fractal dimension ("coastline approximation" - 1)
Firt we need to create a virtual environment for the project, to keep track of every dependency, it is also useful to use and explicit version of Python
Install the package for creating a virtual environment:
$ pip install virtualenv
Create a new virtual environment
$ virtualenv venv
Activate virtual environment
$ source venv/bin/activate
Now with the virtual environment we can install the dependencies written in requirements.txt
$ pip install -r requirements.txt
After we have install all the dependencies we can now run the script in code/train.py, this script takes the input data and outputs a trained model and a pipeline for our web service.
$ python code/train.py
Finally we can test our web application by running:
$ flask run -p 5000
Now that we have our web application running, we can use the Dockerfile to create an image for running our web application inside a container
$ docker build . -t from_ds_to_mlops
And now we can test our application using Docker
$ docker run -p 8000:8000 from_ds_to_mlops
Test by using the calls in tests/example_calls.txt from the terminal