
Comments (3)

skaarthik commented on May 11, 2024

You also need to set the SPARK_HOME environment variable, in addition to JAVA_HOME and SPARKCLR_HOME. I guess SPARK_HOME is set in your case; otherwise, you would get the error message from https://github.com/Microsoft/SparkCLR/blob/master/scripts/sparkclr-submit.cmd#L77.

Since the error is in the SparkCLRSubmitArguments class, I think it is most likely due to an incorrect value for the SPARKCLR_CLASSPATH environment variable. You do not have to set this variable explicitly, because sparkclr-submit.cmd sets it. You can simply echo its value to confirm that it points to the SparkCLR jar file.
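The checks described above can be sketched as a small script. This is only an illustration: the function name `check_environment` is hypothetical, the trailing-backslash rule is applied to all three variables as an assumption (the script's own message mentions only SPARK_HOME), and SPARKCLR_CLASSPATH is assumed to hold a single jar path, as in the log later in this thread.

```python
import os

# Variables named in this thread as required by sparkclr-submit.cmd.
REQUIRED_VARS = ("JAVA_HOME", "SPARK_HOME", "SPARKCLR_HOME")


def check_environment(env):
    """Return a list of human-readable problems found in env (a dict-like mapping)."""
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif value.endswith("\\"):
            # Assumption: the "no trailing \" note is checked for every variable.
            problems.append(f"{name} should not have a trailing backslash: {value}")

    # SPARKCLR_CLASSPATH is normally set by sparkclr-submit.cmd, so it is
    # only inspected when present. Assumption: it holds one jar path.
    classpath = env.get("SPARKCLR_CLASSPATH", "")
    if classpath and not classpath.lower().endswith(".jar"):
        problems.append(f"SPARKCLR_CLASSPATH does not point to a jar file: {classpath}")
    return problems


if __name__ == "__main__":
    for problem in check_environment(os.environ):
        print(problem)
```

Run from the same console session that launches sparkclr-submit.cmd, it prints one line per problem and nothing when the variables look consistent.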


sehunley commented on May 11, 2024

That did turn up an issue with the SPARKCLR_HOME variable: it was not set correctly, which caused the issue above. However, now when I run, I get the following error.

You're right, I did get the error for SPARK_HOME not being set:
[sparkclr-submit.cmd] Error - SPARK_HOME environment variable is not set
[sparkclr-submit.cmd] Note that SPARK_HOME environment variable should not have trailing \

Where should SPARK_HOME point? To "My Folder\SparkCLR-master\build\runtime\lib", where the spark-clr_2.10-1.6.0-SNAPSHOT.jar file resides?

Thanks.


sehunley commented on May 11, 2024

Well, I pointed SPARK_HOME to C:Spark\SparkCLR-master\build\tools\spark-1.6.0-bin-hadoop2.6. That seems to have solved the issue with the SPARK_HOME and SPARKCLR_HOME variables. However, when I tried the following command:

sparkclr-submit.cmd --verbose --master spark://spark01:7077 --exe SparkCLRSamples.exe %SPARKCLR_HOME%\samples spark.local.dir %SPARKCLR_HOME%\Temp sparkclr.sampledata.loc %SPARKCLR_HOME%\data

I am basically trying to execute the samples on my Spark cluster, and I get the following error:

C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\scripts>sparkclr-submit.cmd --verbose --master spark://spark01:7077 --exe SparkCLRSamples.exe %SPARKCLR_HOME%\samples spark.local.dir %SPARKCLR_HOME%\Temp sparkclr.sampledata.loc %SPARKCLR_HOME%\data
SPARKCLR_JAR=spark-clr_2.10-1.6.0-SNAPSHOT.jar
SPARKCLR_CLASSPATH=C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\lib\spark-clr_2.10-1.6.0-SNAPSHOT.jar
Zip driver directory C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\samples to C:\Users\shunley\AppData\Local\Temp\samples_1453846139169.zip
[sparkclr-submit.cmd] Command to run --verbose --master spark://spark01:7077 --name SparkCLRSamples --files C:\Users\shunley\AppData\Local\Temp\samples_1453846139169.zip --class org.apache.spark.deploy.csharp.CSharpRunner C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\lib\spark-clr_2.10-1.6.0-SNAPSHOT.jar C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\samples C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\samples\SparkCLRSamples.exe spark.local.dir C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\Temp sparkclr.sampledata.loc C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\data
Using properties file: null
Parsed arguments:
master spark://spark01:7077
deployMode null
executorMemory null
executorCores null
totalExecutorCores null
propertiesFile null
driverMemory null
driverCores null
driverExtraClassPath null
driverExtraLibraryPath null
driverExtraJavaOptions null
supervise false
queue null
numExecutors null
files file:/C:/Users/shunley/AppData/Local/Temp/samples_1453846139169.zip
pyFiles null
archives null
mainClass org.apache.spark.deploy.csharp.CSharpRunner
primaryResource file:/C:/MyData/Apache_Spark/SparkCLR-master/build/runtime/lib/spark-clr_2.10-1.6.0-SNAPSHOT.jar
name SparkCLRSamples
childArgs [C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\samples C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\samples\SparkCLRSamples.exe spark.local.dir C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\Temp sparkclr.sampledata.loc C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\data]
jars null
packages null
packagesExclusions null
repositories null
verbose true

Spark properties used, including those specified through
--conf and those from the properties file null:

Main class:
org.apache.spark.deploy.csharp.CSharpRunner
Arguments:
C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\samples
C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\samples\SparkCLRSamples.exe
spark.local.dir
C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\Temp
sparkclr.sampledata.loc
C:\MyData\Apache_Spark\SparkCLR-master\build\runtime\data
System properties:
SPARK_SUBMIT -> true
spark.files -> file:/C:/Users/shunley/AppData/Local/Temp/samples_1453846139169.zip
spark.app.name -> SparkCLRSamples
spark.jars -> file:/C:/MyData/Apache_Spark/SparkCLR-master/build/runtime/lib/spark-clr_2.10-1.6.0-SNAPSHOT.jar
spark.submit.deployMode -> client
spark.master -> spark://spark01:7077
Classpath elements:
file:/C:/MyData/Apache_Spark/SparkCLR-master/build/runtime/lib/spark-clr_2.10-1.6.0-SNAPSHOT.jar

[CSharpRunner.main] Starting CSharpBackend!
[CSharpRunner.main] Port number used by CSharpBackend is 1914
[CSharpRunner.main] adding key=spark.jars and value=file:/C:/MyData/Apache_Spark/SparkCLR-master/build/runtime/lib/spark-clr_2.10-1.6.0-SNAPSHOT.jar to environment
[CSharpRunner.main] adding key=spark.app.name and value=SparkCLRSamples to environment
[CSharpRunner.main] adding key=spark.files and value=file:/C:/Users/shunley/AppData/Local/Temp/samples_1453846139169.zip to environment
[CSharpRunner.main] adding key=spark.submit.deployMode and value=client to environment
[CSharpRunner.main] adding key=spark.master and value=spark://spark01:7077 to environment
[SparkCLRSamples.exe.PrintLogLocation] Logs by SparkCLR and Apache Spark are available at C:\Users\shunley\AppData\Local\Temp\SparkCLRLogs
on object of type NullObject failed
null
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.spark.api.csharp.CSharpBackendHandler.handleMethodCall(CSharpBackendHandler.scala:164)
at org.apache.spark.api.csharp.CSharpBackendHandler.channelRead0(CSharpBackendHandler.scala:94)
at org.apache.spark.api.csharp.CSharpBackendHandler.channelRead0(CSharpBackendHandler.scala:27)
at io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:244)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:137)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:873)
at org.apache.hadoop.fs.FileUtil.chmod(FileUtil.java:853)
at org.apache.spark.util.Utils$.fetchFile(Utils.scala:406)
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1386)
at org.apache.spark.SparkContext.addFile(SparkContext.scala:1340)
at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:491)
at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:491)
at scala.collection.immutable.List.foreach(List.scala:318)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:491)
... 25 more
()
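A NullPointerException raised from ProcessBuilder.start underneath org.apache.hadoop.util.Shell and FileUtil.chmod on Windows is commonly reported when Hadoop cannot locate winutils.exe. Whether that is the cause here is an assumption, not something confirmed in this thread. A minimal sketch of a check for it follows; the helper name `find_winutils` is hypothetical, and it assumes Hadoop is located through the HADOOP_HOME environment variable rather than the hadoop.home.dir system property.

```python
import os


def find_winutils(env):
    """Return the path to winutils.exe under HADOOP_HOME\\bin if it exists, else None."""
    hadoop_home = env.get("HADOOP_HOME")
    if not hadoop_home:
        return None
    # Hadoop on Windows expects the helper binary at <HADOOP_HOME>\bin\winutils.exe.
    candidate = os.path.join(hadoop_home, "bin", "winutils.exe")
    return candidate if os.path.isfile(candidate) else None


if __name__ == "__main__":
    found = find_winutils(os.environ)
    print(found if found else "winutils.exe not found under HADOOP_HOME\\bin")
```

If the helper returns None on the machine that runs the driver, setting HADOOP_HOME to a folder whose bin subfolder contains winutils.exe would be the first thing to try.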

It looks like a lot of variables are missing at the beginning of the submission, like DeployMode. Is there documentation on what to set and what's required?
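In the verbose output above, deployMode is null under "Parsed arguments", yet "spark.submit.deployMode -> client" appears under "System properties". So a null there appears to mean only that the option was not given on the command line, not that a required setting is missing. A rough sketch for listing such entries from a pasted log follows; the function name `parse_parsed_arguments` is hypothetical, and the one-space "name value" layout is assumed from the log as pasted here.

```python
def parse_parsed_arguments(lines):
    """Turn 'name value' lines from the 'Parsed arguments:' block into a dict.

    The literal string 'null' is mapped to None.
    """
    parsed = {}
    for line in lines:
        name, _, value = line.strip().partition(" ")
        if not name:
            continue  # skip blank lines
        value = value.strip()
        parsed[name] = None if value == "null" else value
    return parsed


# A few lines copied from the log in this thread.
sample = [
    "master spark://spark01:7077",
    "deployMode null",
    "executorMemory null",
    "supervise false",
    "verbose true",
]

args = parse_parsed_arguments(sample)
unset = sorted(name for name, value in args.items() if value is None)
print(unset)  # ['deployMode', 'executorMemory']
```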

Thanks.

