If you do not know how to set up Spark on a Mac, please refer to the previous story.
Photo by Pradyumna Doddala using Photopea
Now that you have Spark installed and built on your Mac, let us make a few changes to get the IDE running.
Steps
Set the SPARK_HOME variable in ~/.bash_profile
vim ~/.bash_profile
vim editor
export SPARK_HOME=/usr/local/spark
export PATH=$PATH:$SPARK_HOME/bin
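To confirm that the variable is visible to Python after opening a new terminal (or running source ~/.bash_profile), a quick check along these lines can help. This is only a minimal sketch; check_spark_home is a hypothetical helper name, not part of Spark.

```python
import os


def check_spark_home(env=None):
    """Return a short status message about the SPARK_HOME variable.

    check_spark_home is a hypothetical helper used for illustration only.
    """
    env = os.environ if env is None else env
    spark_home = env.get("SPARK_HOME")
    if not spark_home:
        return "SPARK_HOME is not set"
    if not os.path.isdir(spark_home):
        return "SPARK_HOME points to a missing directory: " + spark_home
    return "SPARK_HOME looks fine: " + spark_home


if __name__ == "__main__":
    print(check_spark_home())
```

If PyCharm is started from the Dock rather than from a terminal, it may not see variables defined in ~/.bash_profile, which is probably why the word count program later in this post also sets SPARK_HOME through os.environ.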
Now open PyCharm.
Create a new project and use the Pure Python template.
Now let's create a Python file with whatever name you like.
Add the Spark Python library to the interpreter.
Steps for adding /usr/local/spark/python as a library for the Project Interpreter.
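What this interpreter setting achieves is putting Spark's python directory on the module search path. A rough equivalent in code, assuming the /usr/local/spark location used in this post, might look like the sketch below; add_spark_python_path is a hypothetical helper name.

```python
import os
import sys


def add_spark_python_path(spark_home="/usr/local/spark"):
    """Put <spark_home>/python on sys.path so that `import pyspark` can be resolved.

    Hypothetical helper; the default is the install location assumed in this post.
    """
    python_dir = os.path.join(spark_home, "python")
    if python_dir not in sys.path:
        sys.path.insert(0, python_dir)
    return python_dir


if __name__ == "__main__":
    print(add_spark_python_path())
```

This only affects the running process, so the PyCharm setting is still the more convenient choice if you also want code completion for pyspark.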
The Word Count Program
For the word count program you need a text file.
First create a sample text file named sample.txt. I am going to use part of the text that I already wrote in this post as the input.
Finally, the program:
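If you would rather create the input file from Python than by hand, something like the following should do. It is a sketch under a few assumptions: the text is just a placeholder taken from this post, write_sample is a hypothetical helper, and the file is expected to sit in the working directory of your run configuration.

```python
from pathlib import Path

# Placeholder text; any plain text works as input.
SAMPLE_TEXT = (
    "Now that you have Spark installed and built on your Mac\n"
    "Let us make few changes to get the IDE running\n"
)


def write_sample(path="sample.txt", text=SAMPLE_TEXT):
    """Write the sample text to `path` and return the number of words written."""
    Path(path).write_text(text, encoding="utf-8")
    return len(text.split())


if __name__ == "__main__":
    print(write_sample())
```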
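import os
# Point PySpark at the local Spark installation before importing it
os.environ["SPARK_HOME"] = "/usr/local/spark"
from operator import add
from pyspark import SparkContext

if __name__ == "__main__":
    sc = SparkContext(appName="PythonWordCount")
    # Read the input file into an RDD of lines (1 = minimum number of partitions)
    lines = sc.textFile("sample.txt", 1)
    # Split lines into words, pair each word with 1, then sum the counts per word
    counts = lines.flatMap(lambda x: x.split(' ')) \
                  .map(lambda x: (x, 1)) \
                  .reduceByKey(add)
    output = counts.collect()
    for (word, count) in output:
        print("%s: %i" % (word, count))
    sc.stop()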
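To see what the flatMap, map, and reduceByKey chain computes without starting Spark, the same logic can be sketched in plain Python. This is an illustration only and not how Spark executes the job; word_count is a hypothetical helper.

```python
from collections import Counter


def word_count(lines):
    """Count words the way the Spark job does, but on a plain list of lines."""
    # flatMap: split every line on single spaces and flatten into one word stream
    words = (word for line in lines for word in line.split(' '))
    # map + reduceByKey(add): pair each word with 1 and sum the ones per word
    return Counter(words)


if __name__ == "__main__":
    for word, count in word_count(["to be or", "not to be"]).items():
        print("%s: %i" % (word, count))
```

Note that split(' ') yields empty strings when a line contains consecutive spaces, so an empty "word" can show up in the counts. Calling split() with no argument avoids that, if you prefer.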
Run
The first run usually fails, because we tend to miss a few little things.
A quick glance at the error shows that a module named py4j.java_gateway is missing.
So we have to add it to the interpreter as well.
Open Preferences again, go to the current interpreter settings, and add the library named py4j-0.9-src.zip (the version in the file name depends on your Spark release).
Adding the missing lib.
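The same fix can also be sketched in code. The snippet below assumes the py4j zip ships under python/lib inside the Spark directory and matches any version rather than hard-coding 0.9; add_py4j_to_path is a hypothetical helper name.

```python
import glob
import os
import sys


def add_py4j_to_path(spark_home="/usr/local/spark"):
    """Add the py4j source zip(s) found under <spark_home>/python/lib to sys.path.

    Hypothetical helper; returns the list of zip files it found.
    """
    pattern = os.path.join(spark_home, "python", "lib", "py4j-*-src.zip")
    matches = sorted(glob.glob(pattern))
    for zip_path in matches:
        if zip_path not in sys.path:
            sys.path.insert(0, zip_path)
    return matches


if __name__ == "__main__":
    print(add_py4j_to_path())
```

An empty list in the output means no py4j zip was found at that location, in which case it is worth double-checking the Spark path.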
Now let's rerun the code.
In the screenshot below, the words and their respective counts are visible.
Final Run.
I hope this keeps you busy for the next few days trying out the amazing Apache Spark.
If you’ve reached this point, you’ve made it! Have a great day!
Prady | @pradyumna_d