py4j.protocol.Py4JJavaError occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe

  • Link PyCharm with PySpark: Instructions
    • On my Mac (macOS 14.0), installing apache-spark with Homebrew put Spark at a different path than the one in the instructions: export SPARK_HOME=/opt/homebrew/Cellar/apache-spark/3.5.0/libexec
  • After following the above instructions I still got the error (py4j.protocol.Py4JJavaError occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe). I resolved it by setting an additional environment variable in PyCharm, as suggested here: PYSPARK_PYTHON=python
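
The same two variables can also be set from Python instead of through PyCharm's settings. The following is a minimal sketch, assuming the Homebrew path from the note above; `configure_pyspark_env` is a hypothetical helper name, not part of PySpark, and the path will differ for other Spark versions or install methods.

```python
import os

# Path reported above for a Homebrew install of apache-spark 3.5.0.
# This is an assumption about the local machine; adjust as needed.
HOMEBREW_SPARK_HOME = "/opt/homebrew/Cellar/apache-spark/3.5.0/libexec"


def configure_pyspark_env(spark_home=HOMEBREW_SPARK_HOME, python_cmd="python", env=None):
    """Set SPARK_HOME and PYSPARK_PYTHON and return the mapping that was updated.

    Hypothetical helper: by default it writes to os.environ, but any
    dict-like object can be passed in (useful for checking the values).
    """
    if env is None:
        env = os.environ
    env["SPARK_HOME"] = spark_home        # where Spark is installed
    env["PYSPARK_PYTHON"] = python_cmd    # Python executable PySpark should use
    return env
```

Calling `configure_pyspark_env()` at the top of a script, before any SparkSession or SparkContext is created, should have the same effect as setting the variables in PyCharm, since they need to be in place when Spark starts.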