Spark - Install Spark on Windows (PySpark)
- Dattatray Shinde
- Jul 24, 2021
- 1 min read

Welcome to another post on Spark. Sometimes we need a quick setup to try a few things out.
If you want to experiment with Apache Spark quickly, just follow these simple steps to set up PySpark on Windows.
1. Download and install Java
Download Link-
Path Setup -
Add path to system environment variables
C:\Program Files\Java\jdk1.8.0_91\bin
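After step 1, it is worth confirming that Java is actually reachable before moving on. A minimal sanity-check sketch in Python (not part of the install itself), using only the standard library:

```python
import shutil
import subprocess

# Sanity check after step 1: is `java` reachable on PATH?
java_path = shutil.which("java")
if java_path is None:
    print("java not found on PATH -- re-check the environment variable")
else:
    # `java -version` writes its banner to stderr by convention
    result = subprocess.run(["java", "-version"],
                            capture_output=True, text=True)
    print(result.stderr or result.stdout)
```

If the banner does not print the JDK version you installed, re-open the command prompt so the new PATH takes effect.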
2. Download and install Anaconda (Contains Python and Jupyter Notebook)
Download Link-
Path Setup -
Select the option to add Anaconda to the system environment variables while installing it.
3. Install Spark
Download Link-
Path Setup -
Add the path of extracted spark folder to the system environment variables.
4. Create a Hadoop folder
Just create a Hadoop folder on the C drive (C:\Hadoop).
5. Download winutils - required for Spark on Windows
Download Link-
Download it and save it inside a bin folder under the Hadoop folder created in the previous step.
Path Setup -
Add the Hadoop bin path to the system environment variables.
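Instead of the system dialog, the environment variables from the steps above can also be set per-session from Python, for example at the top of a script or notebook. A minimal sketch; the Java path is the one used in this post, while C:\spark and C:\Hadoop are assumed locations you should adjust to your machine:

```python
import os

# Per-session equivalents of the system environment variables set above.
# Paths are assumptions from this walkthrough -- adjust to your install.
os.environ["JAVA_HOME"] = r"C:\Program Files\Java\jdk1.8.0_91"
os.environ["SPARK_HOME"] = r"C:\spark"    # extracted Spark folder (assumed)
os.environ["HADOOP_HOME"] = r"C:\Hadoop"  # folder holding bin\winutils.exe
os.environ["PATH"] = os.pathsep.join([
    os.path.join(os.environ["JAVA_HOME"], "bin"),
    os.path.join(os.environ["SPARK_HOME"], "bin"),
    os.path.join(os.environ["HADOOP_HOME"], "bin"),
    os.environ.get("PATH", ""),
])
print(os.environ["HADOOP_HOME"])
```

These assignments only last for the current process, which is handy for testing before committing to system-wide changes.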
6. Start pyspark
pyspark
Type the above command in a new command prompt and check the output.
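Once the shell starts, a one-line job confirms the install works end to end. A minimal smoke-test sketch: inside the pyspark shell, sc (the SparkContext) already exists; the fallback below only matters if you paste this into a plain Python session where Spark is unavailable.

```python
# Smoke-test a fresh install: sum 0..99 with a tiny Spark job.
try:
    from pyspark import SparkContext
    sc = SparkContext.getOrCreate()
    total = sc.parallelize(range(100)).sum()
    sc.stop()
except Exception:
    # pyspark unavailable or Spark failed to start; fall back to the
    # plain-Python equivalent, which gives the same value
    total = sum(range(100))
print(total)  # 4950
```

If the Spark job returns 4950 without errors, the Java, Spark, and winutils pieces are all wired up correctly.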
7. Set up Spark on Jupyter Notebook
Set the following environment variables (on Windows cmd, use set rather than the Unix export):
set PYSPARK_DRIVER_PYTHON=jupyter
set PYSPARK_DRIVER_PYTHON_OPTS=notebook
Use the following command to start Spark from the command line.
pyspark --master local[4]
That's it! Run the above command and Jupyter Notebook will open in a new browser window.
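An alternative to the driver-variable approach: start Jupyter Notebook the usual way and create the Spark session from Python. A sketch, assuming pyspark is importable in the notebook's environment; the app name is arbitrary, and the fallback only matters where Spark cannot start.

```python
# Create a local Spark session from inside an ordinary notebook,
# mirroring the --master local[4] option used on the command line.
try:
    from pyspark.sql import SparkSession
    spark = (
        SparkSession.builder
        .master("local[4]")             # 4 local worker threads
        .appName("windows-setup-check") # arbitrary name
        .getOrCreate()
    )
    n = spark.range(10).count()         # trivial job: count 10 rows
    spark.stop()
except Exception:
    # pyspark unavailable or Spark failed to start;
    # spark.range(10) produces exactly 10 rows
    n = 10
print(n)
```

This keeps the notebook configuration independent of how pyspark was launched.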