PySpark is an interface for Apache Spark in Python. It not only allows you to write Spark applications using Python APIs, but also provides the PySpark shell for interactively analyzing your data in a distributed environment. PySpark supports most of Spark's features, such as Spark SQL, DataFrame, Streaming, MLlib (Machine Learning), and Spark Core.
phricardorj / pyspark-study
My PySpark studies.