Apache Spark is a fast, general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python, and R, along with an optimized engine that supports general execution graphs. It includes MLlib for machine learning and GraphX for graph processing.
Here is how you can install Spark in standalone mode on Windows.
- Install Java 7 or later. Set the JAVA_HOME environment variable and add %JAVA_HOME%\bin to PATH.
- Install Scala 2.10 or 2.11, set the SCALA_HOME environment variable, and add %SCALA_HOME%\bin to PATH.
- Choose a Spark package prebuilt for Hadoop, download it, and extract it.
- Set the SPARK_HOME environment variable to the extracted folder and add %SPARK_HOME%\bin to PATH.
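- Download winutils.exe, place it in a bin subfolder of any folder (for example, C:\hadoop\bin), and set the HADOOP_HOME environment variable to the parent folder (C:\hadoop in this example). Hadoop looks for the file at %HADOOP_HOME%\bin\winutils.exe.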
- Open a new Command Prompt so the updated variables take effect, then run the following command: spark-shell
- For the Spark UI, open http://localhost:4040 in the browser while spark-shell is running.
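Most problems with this setup come from a missing or misspelled environment variable, so it can help to check them before launching spark-shell. The following is a minimal Python sketch of such a check, assuming Python is installed on the machine. It is not part of Spark; the function names (missing_vars, bin_dirs_not_on_path, find_winutils) are hypothetical helpers made up for this example.

```python
import os
from pathlib import Path

# Variables named in the installation steps; the values are whatever you set.
REQUIRED_VARS = ["JAVA_HOME", "SCALA_HOME", "SPARK_HOME", "HADOOP_HOME"]


def missing_vars(env):
    """Return the required variables that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def bin_dirs_not_on_path(env):
    """Return the <HOME>/bin folders from the steps that are absent from PATH."""
    path_entries = {
        os.path.normcase(os.path.normpath(entry))
        for entry in env.get("PATH", "").split(os.pathsep)
        if entry
    }
    absent = []
    for name in ("JAVA_HOME", "SCALA_HOME", "SPARK_HOME"):
        home = env.get(name)
        if home:
            bin_dir = os.path.normcase(os.path.normpath(os.path.join(home, "bin")))
            if bin_dir not in path_entries:
                absent.append(bin_dir)
    return absent


def find_winutils(env):
    """Return the path to winutils.exe under HADOOP_HOME, or None if not found.

    Hadoop expects the file in HADOOP_HOME\\bin; the folder itself is also
    checked (an assumption of this sketch) to catch a misplaced copy.
    """
    home = env.get("HADOOP_HOME")
    if not home:
        return None
    for candidate in (Path(home) / "bin" / "winutils.exe", Path(home) / "winutils.exe"):
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    env = os.environ
    print("Unset variables:", missing_vars(env) or "none")
    print("bin folders missing from PATH:", bin_dirs_not_on_path(env) or "none")
    print("winutils.exe found at:", find_winutils(env) or "not found")
```

Save it under any name, for example check_spark_env.py, and run it with python check_spark_env.py from a new Command Prompt. If winutils.exe is reported directly under HADOOP_HOME rather than in its bin subfolder, move it into bin.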
That is how you can install Spark and set up a cluster in standalone mode on Windows.
Thanks for dropping by! Feel free to comment on this post, or write to me at firstname.lastname@example.org.