With the "pyspark" script, a "spark" object (a SparkSession) is available automatically:
SparkSession available as 'spark'.
>>> spark
<pyspark.sql.session.SparkSession object at 0x10d6dd898>
>>>
But with plain "python" or a Jupyter Notebook (IPython Notebook), there's no "spark" object:
>>> spark
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'spark' is not defined
>>>
One workaround is to add the following to your ".bash_profile" so the interactive "python" shell runs PySpark's shell initialization on startup:
export PYTHONSTARTUP="${SPARK_HOME}/python/pyspark/shell.py"
It works for the interactive "python" shell, but I'm not sure this is the right approach. Note that PYTHONSTARTUP only applies to the interactive interpreter; Jupyter does not read it.