
Comments (7)

tribbloid commented on June 26, 2024

Strange, this problem should already be resolved:
https://issues.apache.org/jira/browse/SPARK-1199
I'll test on my laptop. In the meantime, please make sure that your Spark
binary has been upgraded to 1.1.1.
(Should be upgraded to 1.2.0 soon, once they have fixed
https://issues.apache.org/jira/browse/SPARK-4923)

Yours, Peng

On 12/23/2014 12:31 PM, Mathieu wrote:

I'm running into a problem while executing the standard spark/graphX
example
https://spark.apache.org/docs/1.1.1/graphx-programming-guide.html#examples
in ISpark, see this notebook
http://nbviewer.ipython.org/gist/mathieu1/4c7bf1ae84514939a83f.

Using Spark 1.1.1 with "local[2]" master and IPython Notebook 2.3, I
get the following error:

|org.apache.spark.SparkException: Job aborted due to stage failure: ClassNotFound with classloader: org.apache.spark.executor.ExecutorURLClassLoader@b6c3ef9
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185)
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174)
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173)
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173)
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688)
scala.Option.foreach(Option.scala:236)
org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:688)
org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1391)
akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
akka.actor.ActorCell.invoke(ActorCell.scala:456)
akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
akka.dispatch.Mailbox.run(Mailbox.scala:219)
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
|

A similar error has already been brought up by @benjaminlaird
https://github.com/benjaminlaird also using ISpark, see his much
simpler code https://gist.github.com/benjaminlaird/3e543a9a89fb499a3a14.

I suppose this problem has to do with the |ExecutorURLClassLoader|
class being |private[spark]| (see |ExecutorURLClassLoader.scala|)

Of course, all the code runs fine on the standard |spark-shell|. The
same issue happens on the spark backend for IScala
mattpap/IScala#21 from @hvanhovell
https://github.com/hvanhovell



from ispark.

tribbloid commented on June 26, 2024

Confirmed, this is a bug caused by bypassing some steps in SparkImport (the classloaders in the executors are different from the one in the master).
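
To illustrate the classloader point in isolation, here is a minimal plain-JVM sketch (no Spark involved). It only shows that a class visible to one classloader is not automatically visible to another. The names `ClassLoaderDemo` and `canLoad` are hypothetical, `Circle` is a stand-in for a class defined in the notebook, and the isolated loader is only an analogy for an executor-side loader, not ISpark's or Spark's actual code.

```scala
import java.net.{URL, URLClassLoader}

// Hypothetical stand-in for a class defined in the notebook interpreter.
case class Circle(rad: Float)

object ClassLoaderDemo {
  // Returns true if `loader` can resolve the class with the given name.
  def canLoad(loader: ClassLoader, className: String): Boolean =
    try {
      // initialize = false: we only care whether the class can be found.
      Class.forName(className, false, loader)
      true
    } catch {
      case _: ClassNotFoundException => false
    }

  def main(args: Array[String]): Unit = {
    val name = classOf[Circle].getName

    // The loader that defined Circle can resolve it.
    val definingLoader = classOf[Circle].getClassLoader

    // An isolated loader with no URLs and a null (bootstrap-only) parent.
    // Assumption: this plays the role of an executor-side loader that was
    // never told where interpreter-generated classes live.
    val isolatedLoader = new URLClassLoader(Array.empty[URL], null)

    println(s"defining loader can load $name: ${canLoad(definingLoader, name)}")
    println(s"isolated loader can load $name: ${canLoad(isolatedLoader, name)}")
  }
}
```

The first check should succeed and the second should fail with a ClassNotFoundException internally, which is the same kind of failure the stack trace above reports from the executor side.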


tribbloid commented on June 26, 2024

Hi @mathieu1,

Thanks a lot for the prompt reply. This is getting more and more interesting. First, I would like to confirm that you are not running on scala_2.11. Apparently spark-repl sets up a ClassLoader server in that case to sync between driver and executor (scala_2.10.4 doesn't have this feature).

The problem only happens when a new class is defined in the interpreter and its instances are collected from an executor. It doesn't matter where the class is instantiated. There is no problem printing it locally.

Furthermore, it's always thrown by ExecutorURLClassLoader, which is only used in SparkSubmit. So maybe there is some secret hacking in spark-shell.sh that I haven't scrutinized.

I'll get back to you once I make progress; please keep me informed. Also, have you tried it on NFLabs Zeppelin and IBM Spark-kernel? Do they work?


mathieu1 commented on June 26, 2024

Indeed, I compiled and ran everything on Scala 2.10.4.

Regarding your last question: I tried IBM's spark-kernel and it worked :) However, their dependency on an old version of zeromq (2) made it nontrivial for me to set up.

I haven't managed to use NFLabs' Zeppelin at all so far.


tribbloid commented on June 26, 2024

Aha, looks like it's working on Zeppelin as well. I'll likely switch my backend to it in the future.


tribbloid commented on June 26, 2024

I still get this error in 1.3.0. It looks like there is going to be some serious hacking on the class loader.

I'll simply copy @benjaminlaird's test script here, in case it gets deleted by the original author.

case class Circle(rad: Float)
val rdd = sc.parallelize(1 to 10000).map(i => Circle(i.toFloat))
rdd.take(10)


tribbloid commented on June 26, 2024

OK, problem fixed. It turned out to be easier than I thought.
This issue will be closed in 3 days if there are no objections.

