PySpark with Spark 2.1.0 - Python cannot be version 3.6.
To get started using Spark with Python, you can:
1. Install Anaconda Python - which has all the goodies you need.
2. Download Spark and unzip it into a folder.
3. After you have all of this set up, create a Python 3.5 environment with the following command (Spark 2.1.0 does not work with Python 3.6):
conda create -n py35 python=3.5 anaconda
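To make Spark actually use that environment, something like the following should work (a rough sketch, assuming a Windows command prompt; PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are the environment variables PySpark reads to pick its Python):
activate py35
set PYSPARK_PYTHON=python
set PYSPARK_DRIVER_PYTHON=python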
4. Go to your Spark installation folder, go to "bin" and run "pyspark".
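On Windows that might look like this (the folder name here is only an example - use wherever you unzipped Spark in step 2):
cd C:\spark-2.1.0-bin-hadoop2.7\bin
pyspark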
5. You will probably get some exceptions, but you should still be able to run the following script:
from pyspark import SparkContext
sc = SparkContext.getOrCreate()        # reuse the context the pyspark shell already started, or create one
tf = sc.textFile("j:\\tmp\\data.txt")  # load the file as an RDD of lines
Please make sure the path to "data.txt" points at a file that actually exists; if it does, the quick check below should print something sensible.
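For example (just a sketch, reusing the tf RDD from the script above):
print(tf.count())   # number of lines in data.txt
print(tf.first())   # the first line of the file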
This setup looks easier than it is. I spent a lot of time today trying to get it up and running.