
Spark - Install Spark on Windows (PySpark)

  • Writer: Dattatray Shinde
  • Jul 24, 2021
  • 1 min read



Welcome to another post on Spark. Sometimes we need a quick local setup to try out a few things.


If you want to experiment with Apache Spark quickly, just follow these simple steps to set up PySpark on Windows.

1. Download and install Java


Download Link-


Path Setup -

Add the JDK bin path to the system environment variables:

C:\Program Files\Java\jdk1.8.0_91\bin
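Once the path is set, you can sanity-check the install from any console. A minimal Python sketch (the function name is mine, not part of the original steps):

```python
import shutil

def check_java():
    """Report whether the java executable is reachable via PATH."""
    java = shutil.which("java")  # searches the directories listed in PATH
    if java is None:
        return "java not found - check that the JDK bin folder is on PATH"
    return "java found at " + java

print(check_java())
```

You can also just run "java -version" in a cmd window; if it prints a version, the path setup worked.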

2. Download and install Anaconda (Contains Python and Jupyter Notebook)


Download Link-


Path Setup -

Add the path to the system environment variables while installing Anaconda.
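A quick way to confirm the Anaconda install is to check that both the interpreter and Jupyter are reachable. A small sketch (function name is mine):

```python
import shutil
import sys

def check_python_stack():
    """Confirm the Python interpreter runs and report whether Jupyter is on PATH."""
    report = ["python " + sys.version.split()[0]]
    report.append("jupyter on PATH" if shutil.which("jupyter") else "jupyter not on PATH")
    return "; ".join(report)

print(check_python_stack())
```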


3. Install Spark


Download Link-


Path Setup -

Add the path of the extracted Spark folder to the system environment variables.

4. Set up a Hadoop folder

Just create a Hadoop folder on the C: drive, with a bin subfolder inside it (winutils goes there in the next step).

5. Download winutils (required for Spark on Windows)


Download Link-


Download it and save it inside the bin folder of the Hadoop folder created in the previous step.


Path Setup -


Add the Hadoop bin path to the system environment variables.

6. Start pyspark

pyspark

Type the above command and check the output.
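Once the shell starts, a tiny job confirms Spark actually works end to end. A guarded sketch (it degrades gracefully when pyspark is not importable or Spark cannot start, e.g. outside the pyspark shell; inside the shell you could equally just type spark.range(100).count()):

```python
def spark_smoke_test():
    """Run a trivial Spark job if possible, else report why not."""
    try:
        from pyspark.sql import SparkSession
    except ImportError:
        return "pyspark not importable - run this inside the pyspark shell"
    try:
        spark = SparkSession.builder.master("local[1]").appName("smoke").getOrCreate()
        count = spark.range(100).count()  # trivial job: count 100 generated rows
        spark.stop()
        return "ok: counted %d rows" % count
    except Exception as exc:
        return "spark failed to start: %s" % exc

print(spark_smoke_test())
```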

7. Set up Spark in a Jupyter notebook

Set the following environment variables so that pyspark launches inside Jupyter (this is the Unix-style export syntax; in a Windows cmd window, use "set" instead of "export"):

export PYSPARK_DRIVER_PYTHON=ipython3
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"

Then use the following command to start Spark from the command line:

pyspark --master local[4]

That's it! Just run the above command and a Jupyter notebook will open in a new browser window, with the driver output shown in the cmd window.


 
 
 




© 2023 by Dattatray Shinde
