Adds support for deploying PySpark models (#196)
* saving pyspark looks like it works

* more work on pyspark container. untested

* updated docker files

* fixed imports

* initialization of pyspark docker container works

* pyspark deploy integration tests pass

* got mllib tests passing again

* pyspark deploy is working

* download spark if not present

* format code

* format code

* fixed name of pyspark dockerfile

* cleanup

* moved failing clipper_manager test to end

* added data dependence

* format code

* Fixed tests

* addressed review comments.

* fix labels issue

* fixed deploy_pyspark labels

* fixed label arg order in deploy_pyspark_model
dcrankshaw authored and Corey-Zumar committed Jun 5, 2017
1 parent fc7e2be commit 946fc1d
Showing 14 changed files with 848 additions and 147 deletions.
22 changes: 22 additions & 0 deletions PySparkContainerDockerfile
@@ -0,0 +1,22 @@
+FROM clipper/py-rpc:latest
+
+MAINTAINER Dan Crankshaw <dscrankshaw@gmail.com>
+
+COPY containers/python/python_container_conda_deps.txt /lib/
+
+RUN curl -o /spark.tgz https://d3kbcqa49mib13.cloudfront.net/spark-2.1.1-bin-hadoop2.7.tgz \
+    && cd / && tar zxf /spark.tgz && mv /spark-2.1.1-bin-hadoop2.7 /spark \
+    && echo deb http://ftp.de.debian.org/debian jessie-backports main >> /etc/apt/sources.list \
+    && apt-get update --fix-missing \
+    && apt-get install -yqq -t jessie-backports openjdk-8-jdk \
+    && conda install -y --file /lib/python_container_conda_deps.txt \
+    && pip install findspark
+
+COPY containers/python/pyspark_container.py containers/python/pyspark_container_entry.sh /container/
+COPY clipper_admin/ /lib/clipper_admin/
+
+ENV SPARK_HOME="/spark"
+
+CMD ["/container/pyspark_container_entry.sh"]
+
+# vim: set filetype=dockerfile:
5 changes: 3 additions & 2 deletions PythonContainerDockerfile
@@ -2,11 +2,12 @@ FROM clipper/py-rpc:latest
 
 MAINTAINER Dan Crankshaw <dscrankshaw@gmail.com>
 
-COPY containers/python/python_container.py containers/python/python_container_entry.sh /container/
 COPY containers/python/python_container_conda_deps.txt /lib/
+RUN conda install -y --file /lib/python_container_conda_deps.txt
+
+COPY containers/python/python_container.py containers/python/python_container_entry.sh /container/
 COPY clipper_admin/ /lib/clipper_admin/
 
-RUN conda install -y --file /lib/python_container_conda_deps.txt
 
 CMD ["/container/python_container_entry.sh"]
 
1 change: 1 addition & 0 deletions bin/build_docker_images.sh
@@ -30,6 +30,7 @@ cd $DIR/..
 docker build -t clipper/py-rpc -f ./RPCDockerfile ./
 time docker build -t clipper/noop-container -f ./NoopDockerfile ./
 time docker build -t clipper/python-container -f ./PythonContainerDockerfile ./
+time docker build -t clipper/pyspark-container -f ./PySparkContainerDockerfile ./
 time docker build -t clipper/sklearn_cifar_container -f ./SklearnCifarDockerfile ./
 time docker build -t clipper/tf_cifar_container -f ./TensorFlowCifarDockerfile ./
 cd -
14 changes: 13 additions & 1 deletion bin/run_unittests.sh
@@ -120,7 +131,18 @@ function run_frontend_tests {
 function run_integration_tests {
 echo -e "\nRunning integration tests\n\n"
 cd $DIR
+if [ -z ${SPARK_HOME+x} ]; then
+echo "Downloading Spark"
+curl -o spark.tgz https://d3kbcqa49mib13.cloudfront.net/spark-2.1.1-bin-hadoop2.7.tgz
+tar zxf spark.tgz && mv spark-2.1.1-bin-hadoop2.7 spark
+export SPARK_HOME=`pwd`/spark
+else
+echo "Found Spark at $SPARK_HOME"
+fi
+pip install findspark
 python ../integration-tests/clipper_manager_tests.py
+python ../integration-tests/deploy_pyspark_models.py
+python ../integration-tests/deploy_pyspark_pipeline_models.py
 python ../integration-tests/many_apps_many_models.py 2 3
 }
 
@@ -131,11 +142,12 @@ function run_all_tests {
 redis-cli -p $REDIS_PORT "flushall"
 run_management_tests
 redis-cli -p $REDIS_PORT "flushall"
-run_integration_tests
 redis-cli -p $REDIS_PORT "flushall"
 run_jvm_container_tests
 redis-cli -p $REDIS_PORT "flushall"
 run_rpc_container_tests
+redis-cli -p $REDIS_PORT "flushall"
+run_integration_tests
 }
 
 if [ "$#" == 0 ]
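
Note on the `SPARK_HOME` guard in `run_integration_tests`: the test `[ -z ${SPARK_HOME+x} ]` relies on the `${VAR+x}` parameter expansion, which expands to `x` when the variable is set (even to an empty string) and to nothing when it is unset. So the download branch runs only when `SPARK_HOME` is not set at all. A minimal sketch of that behaviour in isolation follows; the helper name `spark_home_state` is hypothetical and is not part of this commit, and the expansion is quoted here as a defensive habit rather than because the original needs it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the commit): reports whether SPARK_HOME is set,
# using the same ${VAR+x} expansion as the guard in run_integration_tests.
spark_home_state() {
  if [ -z "${SPARK_HOME+x}" ]; then
    echo "unset"   # the script would download Spark in this case
  else
    echo "set"     # the script would reuse the existing installation
  fi
}

# Case 1: variable not defined at all.
unset SPARK_HOME
state_unset=$(spark_home_state)

# Case 2: variable defined but empty; ${SPARK_HOME+x} still expands to "x".
SPARK_HOME=""
state_empty=$(spark_home_state)

# Case 3: variable defined with a path.
SPARK_HOME="/spark"
state_path=$(spark_home_state)

echo "$state_unset $state_empty $state_path"
```

One consequence worth keeping in mind: an exported but empty `SPARK_HOME` takes the "Found Spark" branch, so no download happens in that case.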