Showcasing notebooks and code examples of how to use Spark NLP in Python and Scala.
To run the Python notebooks locally, make sure Java 8 is available and create a conda environment with Spark NLP and PySpark:

```bash
$ java -version
# should be Java 8 (Oracle or OpenJDK)
$ conda create -n sparknlp python=3.6 -y
$ conda activate sparknlp
$ pip install spark-nlp==2.5.0 pyspark==2.4.4
```
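After the installation, a minimal Python sketch like the following shows how to start Spark NLP and annotate a sentence with a pretrained pipeline (here the `explain_document_dl` pipeline, which is downloaded on first use, so an internet connection is assumed); the notebooks in this repository walk through many more examples.

```python
import sparknlp
from sparknlp.pretrained import PretrainedPipeline

# Start a Spark session with Spark NLP on the classpath
spark = sparknlp.start()

# Download a pretrained pipeline (cached under ~/cache_pretrained on first use)
pipeline = PretrainedPipeline("explain_document_dl", lang="en")

# Annotate a sentence and inspect the resulting annotations
result = pipeline.annotate("Spark NLP is an open-source text processing library for Apache Spark.")
print(result.keys())        # available annotation types, e.g. token, lemma, pos, entities
print(result["entities"])   # named entities recognized by the pipeline
```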
If you want to try out Spark NLP and run the Jupyter examples without installing anything, you can simply use our Docker image:

1- Pull the Docker image for spark-nlp-workshop:

```bash
docker pull johnsnowlabs/spark-nlp-workshop
```

2- Run the image locally with port binding:

```bash
docker run -it --rm -p 8888:8888 -p 4040:4040 johnsnowlabs/spark-nlp-workshop
```

3- Open the Jupyter notebooks in your browser at http://localhost:8888/ by using the token printed on the console.
- The password to the Jupyter notebook is `sparknlp`.
- The size of the image grows every time you download a pretrained model or a pretrained pipeline. You can clean up `~/cache_pretrained` if you no longer need them.
- This Docker image is only meant for testing/learning purposes and should not be used in production environments. Please install Spark NLP natively instead: https://github.com/JohnSnowLabs/spark-nlp
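Once a notebook is open (inside the container or in a native install), a short sanity-check cell like this sketch confirms that Spark NLP and Apache Spark are available; the exact versions reported may differ from the ones pinned above.

```python
import sparknlp

# Create (or attach to) a Spark session with Spark NLP loaded
spark = sparknlp.start()

# Report the library and Spark versions available in this environment
print("Spark NLP version:", sparknlp.version())
print("Apache Spark version:", spark.version)
```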
Take a look at the official Spark NLP page, http://nlp.johnsnowlabs.com/, for user documentation and examples.
If you find any example that is no longer working, please create an issue.
Apache License 2.0