PySpark is the Python API for Apache Spark, a distributed computing framework that analyzes big data in parallel across a cluster. On datasets too large to process comfortably on one machine, PySpark can be much faster than Python's pandas library, which runs on a single machine and holds its data in memory. Its features include querying data with SQL as well as HiveQL, parallel processing on clusters with RDDs (Resilient Distributed Datasets), and many more.
$ pip install pyspark
- Prasad Patil