This tutorial covers topic modelling with the PySpark and Spark NLP libraries. The code complements the blog post "Topic Modelling with PySpark and Spark NLP" on Medium. Refer to that post for a more detailed explanation of what topic modelling is, how to use Spark NLP for NLP pipelines, and how to perform topic modelling with PySpark.
The code shows how to install the PySpark and Spark NLP libraries and download a Kaggle dataset to Google Colaboratory. It also illustrates how to build an NLP pipeline with Spark NLP and train a topic model with PySpark.
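
As a rough orientation, the pipeline-plus-topic-model step might be sketched as below. This is a minimal sketch under stated assumptions, not the tutorial's actual code. The choice of stages, the use of LDA as the topic model, the column names (`text`, `tokens`, `tf`, `features`), the hyperparameter values, and the helper name `build_topic_pipeline` are all illustrative. The imports sit inside the function so that it can be defined before the libraries are installed.

```python
def build_topic_pipeline(text_col="text", num_topics=10, vocab_size=5000):
    """Return an unfitted Spark ML Pipeline: Spark NLP cleaning, TF-IDF, then LDA.

    Hypothetical helper for illustration. Call it only after a Spark session
    exists (for example via sparknlp.start()), because the stages are backed
    by JVM objects.
    """
    # Lazy imports: defining this function does not require PySpark or Spark NLP.
    from pyspark.ml import Pipeline
    from pyspark.ml.clustering import LDA
    from pyspark.ml.feature import IDF, CountVectorizer
    from sparknlp.annotator import Normalizer, StopWordsCleaner, Tokenizer
    from sparknlp.base import DocumentAssembler, Finisher

    # Spark NLP stages: raw text -> document -> tokens -> cleaned tokens.
    document = DocumentAssembler().setInputCol(text_col).setOutputCol("document")
    tokenizer = Tokenizer().setInputCols(["document"]).setOutputCol("token")
    normalizer = (
        Normalizer()
        .setInputCols(["token"])
        .setOutputCol("normalized")
        .setLowercase(True)
    )
    stopwords = (
        StopWordsCleaner()
        .setInputCols(["normalized"])
        .setOutputCol("clean")
        .setCaseSensitive(False)
    )
    # Finisher turns Spark NLP annotations into a plain array of strings.
    finisher = (
        Finisher()
        .setInputCols(["clean"])
        .setOutputCols(["tokens"])
        .setOutputAsArray(True)
    )

    # PySpark ML stages: term counts -> TF-IDF weights -> LDA topic model.
    counts = CountVectorizer(inputCol="tokens", outputCol="tf", vocabSize=vocab_size)
    tfidf = IDF(inputCol="tf", outputCol="features")
    lda = LDA(k=num_topics, maxIter=20, featuresCol="features", seed=1)

    return Pipeline(
        stages=[document, tokenizer, normalizer, stopwords, finisher, counts, tfidf, lda]
    )
```

Once a session has been started with `sparknlp.start()`, calling `build_topic_pipeline().fit(df)` on a DataFrame `df` with a `text` column should return a fitted `PipelineModel`. Its last stage is the fitted LDA model, whose `describeTopics()` method lists the highest-weighted term indices for each topic. Those indices refer to the vocabulary of the fitted `CountVectorizer` stage.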