I am trying to create a simple PySpark program, but I am getting an error while running it with spark-submit.
Below is the code I am executing.
<code>from pyspark.sql import SparkSession
from pyspark import SparkConf
from lib.logger import Log4j

# conf = SparkConf()
# conf.set("spark.executor.extraJavaOptions", "-Dlog4j.configuration=file:log4j.properties -Dspark.yarn.app.container.log.dir=app-logs -Dlogfile.name=hello-spark")

if __name__ == "__main__":
    spark = SparkSession.builder \
        .appName("Hello Spark") \
        .master("local[3]") \
        .getOrCreate()

    logger = Log4j(spark)

    logger.info("Starting HelloSpark")
    # your processing code
    logger.info("Finished HelloSpark")

    # spark.stop()
</code>
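For context, lib/logger.py is just a thin wrapper around Spark's JVM-side Log4j. A minimal sketch of what it looks like (the exact class in my project may differ slightly):
<code># lib/logger.py -- minimal sketch; names are approximate
class Log4j:
    def __init__(self, spark):
        # reach into the JVM via py4j and get a Log4j logger named after the app
        log4j = spark._jvm.org.apache.log4j
        app_name = spark.sparkContext.getConf().get("spark.app.name")
        self.logger = log4j.LogManager.getLogger(app_name)

    def info(self, message):
        self.logger.info(message)

    def warn(self, message):
        self.logger.warn(message)

    def error(self, message):
        self.logger.error(message)
</code>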
Python Version: 3.12.4
<code>PS C:\Spark\spark-3.5.1-bin-hadoop3> python --version
Python 3.12.4
</code>
Java Version: 11.0.23
<code>PS C:\Spark\spark-3.5.1-bin-hadoop3> java --version
java 11.0.23 2024-04-16 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.23+7-LTS-222)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.23+7-LTS-222, mixed mode)
PS C:\Spark\spark-3.5.1-bin-hadoop3>
</code>
Error:
<code>Windows PowerShell
Copyright (C) Microsoft Corporation. All rights reserved.
Install the latest PowerShell for new features and improvements! https://aka.ms/PSWindows
PS C:\Spark\spark-3.5.1-bin-hadoop3> spark-submit --properties-file C:\Spark\spark-3.5.1-bin-hadoop3\conf\spark-defaults.conf 'C:\Users\JainRonit\OneDrive - STCO\Desktop\Personal\Study\Coding\Pyspark\2-Spark-First-Project\HelloSpark.py'
24/07/26 11:13:49 INFO SparkContext: Running Spark version 3.5.1
24/07/26 11:13:49 INFO SparkContext: OS info Windows 11, 10.0, amd64
24/07/26 11:13:49 INFO SparkContext: Java version 11.0.23
24/07/26 11:13:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
24/07/26 11:13:50 ERROR SparkContext: Error initializing SparkContext.
java.lang.Exception: spark.executor.extraJavaOptions is not allowed to set Spark options (was '-Dlog4j.configuration=file:log4j.properties -Dspark.yarn.app.container.log.dir=app-logs -Dlogfile.name=HelloSpark'). Set them directly on a SparkConf or in a properties file when using ./bin/spark-submit.
	at org.apache.spark.SparkConf.$anonfun$validateSettings$4(SparkConf.scala:525)
	at org.apache.spark.SparkConf.$anonfun$validateSettings$4$adapted(SparkConf.scala:521)
	at scala.Option.foreach(Option.scala:407)
	at org.apache.spark.SparkConf.validateSettings(SparkConf.scala:521)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:410)
	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
	at py4j.Gateway.invoke(Gateway.java:238)
	at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
	at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
	at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
	at java.base/java.lang.Thread.run(Thread.java:834)
24/07/26 11:13:50 INFO SparkContext: SparkContext is stopping with exitCode 0.
24/07/26 11:13:50 INFO SparkContext: Successfully stopped SparkContext
Traceback (most recent call last):
  File "C:\Users\JainRonit\OneDrive - STCO\Desktop\Personal\Study\Coding\Pyspark\2-Spark-First-Project\HelloSpark.py", line 13, in <module>
    .getOrCreate()
     ^^^^^^^^^^^^^
  File "C:\Spark\spark-3.5.1-bin-hadoop3\python\lib\pyspark.zip\pyspark\sql\session.py", line 497, in getOrCreate
  File "C:\Spark\spark-3.5.1-bin-hadoop3\python\lib\pyspark.zip\pyspark\context.py", line 515, in getOrCreate
  File "C:\Spark\spark-3.5.1-bin-hadoop3\python\lib\pyspark.zip\pyspark\context.py", line 203, in __init__
  File "C:\Spark\spark-3.5.1-bin-hadoop3\python\lib\pyspark.zip\pyspark\context.py", line 296, in _do_init
  File "C:\Spark\spark-3.5.1-bin-hadoop3\python\lib\pyspark.zip\pyspark\context.py", line 421, in _initialize_context
  File "C:\Spark\spark-3.5.1-bin-hadoop3\python\lib\py4j-0.10.9.7-src.zip\py4j\java_gateway.py", line 1587, in __call__
  File "C:\Spark\spark-3.5.1-bin-hadoop3\python\lib\py4j-0.10.9.7-src.zip\py4j\protocol.py", line 326, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.lang.Exception: spark.executor.extraJavaOptions is not allowed to set Spark options (was '-Dlog4j.configuration=file:log4j.properties -Dspark.yarn.app.container.log.dir=app-logs -Dlogfile.name=HelloSpark'). Set them directly on a SparkConf or in a properties file when using ./bin/spark-submit.
	at org.apache.spark.SparkConf.$anonfun$validateSettings$4(SparkConf.scala:525)
	at org.apache.spark.SparkConf.$anonfun$validateSettings$4$adapted(SparkConf.scala:521)
	at scala.Option.foreach(Option.scala:407)
	at org.apache.spark.SparkConf.validateSettings(SparkConf.scala:521)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:410)
	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:247)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
	at py4j.Gateway.invoke(Gateway.java:238)
	at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:80)
	at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:69)
	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
	at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
	at java.base/java.lang.Thread.run(Thread.java:834)
24/07/26 11:13:50 INFO ShutdownHookManager: Shutdown hook called
24/07/26 11:13:50 INFO ShutdownHookManager: Deleting directory C:\Users\JainRonit\AppData\Local\Temp\spark-0326d309-090a-4a5f-af13-d7fe347ab38d
</code>
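From the stack trace, SparkConf.validateSettings rejects any -Dspark.* entry inside spark.executor.extraJavaOptions. If I understand it correctly, even a minimal snippet like this (an untested sketch, not my actual project code) should hit the same exception:
<code>from pyspark import SparkConf
from pyspark.sql import SparkSession

# untested minimal sketch: a -Dspark.* entry in spark.executor.extraJavaOptions
# should trigger the same "not allowed to set Spark options" validation error
conf = (
    SparkConf()
    .setAppName("ReproValidation")
    .setMaster("local[1]")
    .set("spark.executor.extraJavaOptions", "-Dspark.yarn.app.container.log.dir=app-logs")
)

spark = SparkSession.builder.config(conf=conf).getOrCreate()  # raises the java.lang.Exception here
</code>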
spark-defaults.conf file:
<code>spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties -Dspark.yarn.app.container.log.dir=app-logs -Dlogfile.name=HelloSpark
</code>
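The exception says spark.executor.extraJavaOptions may not set Spark options, and -Dspark.yarn.app.container.log.dir is itself a spark.* key, so I believe that entry is what trips SparkConf.validateSettings. One variant I am considering (untested, and the driver/executor distinction is my assumption) is moving the options to the driver, which in local mode is where the application actually runs:
<code># untested variant of spark-defaults.conf: same JVM options, but on the driver side
spark.driver.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties -Dspark.yarn.app.container.log.dir=app-logs -Dlogfile.name=HelloSpark
</code>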
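For completeness, the log4j.properties referenced by -Dlog4j.configuration looks roughly like this (a sketch; my actual file may differ), which is why the app-logs directory and logfile name are passed in as -D system properties:
<code># rough sketch of log4j.properties (approximate)
log4j.rootCategory=WARN, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

# file appender resolves the -D properties set via extraJavaOptions
log4j.appender.file=org.apache.log4j.RollingFileAppender
log4j.appender.file.File=${spark.yarn.app.container.log.dir}/${logfile.name}.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
</code>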
I tried running my code with these defaults set in spark-defaults.conf, but I am still facing the error above when executing it.