
Comments (8)

skaarthik commented on May 10, 2024

PR #398


DaweiCai commented on May 10, 2024

Hi skaarthik,

When trying to run the streaming sample SparkClrHdfsWordCount with the following command:

.\sparkclr-submit.cmd --master local[*] --exe SparkClrHdfsWordCount.exe C:\Users\dacai\Mobius\examples\Streaming\HdfsWordCount\bin\Release C:\Users\dacai\Mobius\cp\ C:\Users\dacai\Mobius\dacaiTemp
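(Judging from the equivalent submit commands later in this thread, the three trailing arguments are, in order: the directory containing the sample exe, the checkpoint directory, and the input directory to watch.)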

I can't see the output of the following code in the cmd console:
wordCounts.ForeachRDD((time, rdd) =>
{
    Console.WriteLine("-------------------------------------------");
    Console.WriteLine("Time: {0}", time);
    Console.WriteLine("-------------------------------------------");

    // Print up to 10 records from this batch's RDD.
    object[] taken = rdd.Take(10);
    foreach (object record in taken)
    {
        Console.WriteLine(record);
    }
    Console.WriteLine();
});
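For context, wordCounts here is the tail of the sample's streaming pipeline. A minimal sketch of how it is built, assuming the usual Mobius streaming API (see the HdfsWordCount sample source for the authoritative version; checkpointDirectory and inputDirectory stand for the trailing command-line arguments):

var sparkContext = new SparkContext(new SparkConf().SetAppName("HdfsWordCount"));
var ssc = new StreamingContext(sparkContext, 30000L); // assumed 30-second batch interval
ssc.Checkpoint(checkpointDirectory);                  // e.g. C:\Users\dacai\Mobius\cp\

// Spark's file streams only pick up files that appear in the directory
// after the job starts; files already present at startup are ignored.
var lines = ssc.TextFileStream(inputDirectory);       // e.g. C:\Users\dacai\Mobius\dacaiTemp
var words = lines.FlatMap(line => line.Split(' '));
var wordCounts = words.Map(w => new KeyValuePair<string, int>(w, 1))
                      .ReduceByKey((x, y) => x + y);

If the sample relies on that default new-files-only behavior, output only appears when files are copied into the input directory while the job is running.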

But with the batch example (Mobius\examples\Batch\WordCount), I can see the output in the console.

Do you know why, or could you give me a clue on how to dig into this issue?

Thanks @skaarthik


hebinhuang commented on May 10, 2024

@kasuosuo
I do see the results in the console. Please make sure all pre-requisites are set correctly, the input directory (c:\users\dacai\Mobius\dacaiTemp) has files, and the checkpoint directory (c:\users\dacai\Mobius\cp) has not been created before you start the run.
(screenshot of the console output)
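A quick way to verify both preconditions before submitting is a check along these lines (a hypothetical helper, not part of the Mobius samples):

using System;
using System.IO;
using System.Linq;

static class PreFlight
{
    // Hypothetical helper mirroring the advice above: the input directory
    // must exist and contain files, and the checkpoint directory must not
    // be left over from an earlier run.
    public static void Check(string inputDir, string checkpointDir)
    {
        if (!Directory.Exists(inputDir) || !Directory.EnumerateFiles(inputDir).Any())
            Console.WriteLine("Input directory is missing or empty: " + inputDir);

        if (Directory.Exists(checkpointDir))
            Console.WriteLine("Stale checkpoint directory; delete it before a fresh run: " + checkpointDir);
    }
}

Usage would be PreFlight.Check(@"c:\users\dacai\Mobius\dacaiTemp", @"c:\users\dacai\Mobius\cp"); before calling sparkclr-submit.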


DaweiCai commented on May 10, 2024

Thanks for your comments, @hebinhuang.

Yes, I have checked:
1. c:\users\dacai\Mobius\dacaiTemp has files.
2. c:\users\dacai\Mobius\cp had not been created before starting the local job.
I still cannot see the output, and the console is stuck here:
(screenshot of the stuck console)

For the pre-requisites you mentioned, do you have a list?

Thanks


DaweiCai commented on May 10, 2024

And intermittently, I encountered the following error:
(screenshot of the error)


hebinhuang commented on May 10, 2024

Pre-Requisites:
https://github.com/Microsoft/Mobius/blob/master/notes/running-mobius-app.md#wordcount-example-batch

Can you pipe out the console text?


DaweiCai commented on May 10, 2024

Sure. I downloaded the release version and tested the streaming example again, and hit the same error.

PS C:\Windows\system32> sparkclr-submit.cmd --master local[4] --conf spark.local.dir=E:\sparktemp\wordcount\ --exe SparkClrHdfsWordCount.exe E:\Java\OpenSourceSoftware\spark-clr_2.10-1.6.100\examples\Streaming\HdfsWordCount E:\sparktemp\cp C:\Users\dacai\Mobius\dacaiTemp\ >E:\sparktemp\log.txt

log4j:WARN No appenders could be found for logger (io.netty.util.internal.logging.InternalLoggerFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/05/13 14:48:58 INFO SparkContext: Running Spark version 1.6.1
16/05/13 14:48:58 WARN SparkConf: In Spark 1.0 and later spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone and LOCAL_DIRS in YARN).
16/05/13 14:48:58 INFO SecurityManager: Changing view acls to: dacai
16/05/13 14:48:58 INFO SecurityManager: Changing modify acls to: dacai
16/05/13 14:48:58 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(dacai); users with modify permissions: Set(dacai)
16/05/13 14:48:59 INFO Utils: Successfully started service 'sparkDriver' on port 44636.
16/05/13 14:49:00 INFO Slf4jLogger: Slf4jLogger started
16/05/13 14:49:00 INFO Remoting: Starting remoting
16/05/13 14:49:00 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@10.168.187.17:44649]
16/05/13 14:49:00 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 44649.
16/05/13 14:49:00 INFO SparkEnv: Registering MapOutputTracker
16/05/13 14:49:00 INFO SparkEnv: Registering BlockManagerMaster
16/05/13 14:49:01 INFO DiskBlockManager: Created local directory at E:\sparktemp\wordcount\blockmgr-49d2e80b-4de8-466c-97d4-db7077aee6bd
16/05/13 14:49:01 INFO MemoryStore: MemoryStore started with capacity 511.1 MB
16/05/13 14:49:01 INFO SparkEnv: Registering OutputCommitCoordinator
16/05/13 14:49:01 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/05/13 14:49:01 INFO SparkUI: Started SparkUI at http://10.168.187.17:4040
16/05/13 14:49:01 INFO HttpFileServer: HTTP File server directory is E:\sparktemp\wordcount\spark-67034994-c974-4e95-8e5e-ce8b7a503aa8\httpd-eb5a5791-bfe5-428c-bf92-2398cd0c00a9
16/05/13 14:49:01 INFO HttpServer: Starting HTTP Server
16/05/13 14:49:01 INFO Utils: Successfully started service 'HTTP file server' on port 44656.
16/05/13 14:49:01 INFO SparkContext: Added JAR file:/E:/Java/OpenSourceSoftware/spark-clr_2.10-1.6.100/runtime/lib/spark-clr_2.10-1.6.100.jar at http://10.168.187.17:44656/jars/spark-clr_2.10-1.6.100.jar with timestamp 1463122141515
16/05/13 14:49:01 INFO Executor: Starting executor ID driver on host localhost
16/05/13 14:49:01 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 44673.
16/05/13 14:49:01 INFO NettyBlockTransferService: Server created on 44673
16/05/13 14:49:01 INFO BlockManagerMaster: Trying to register BlockManager
16/05/13 14:49:01 INFO BlockManagerMasterEndpoint: Registering block manager localhost:44673 with 511.1 MB RAM, BlockManagerId(driver, localhost, 44673)
16/05/13 14:49:01 INFO BlockManagerMaster: Registered BlockManager
16/05/13 14:49:02 INFO FileInputDStream: Duration for remembering RDDs set to 60000 ms for org.apache.spark.streaming.dstream.FileInputDStream@76bba21e
16/05/13 14:49:02 INFO ForEachDStream: metadataCleanupDelay = -1
16/05/13 14:49:02 INFO CSharpDStream: metadataCleanupDelay = -1
16/05/13 14:49:02 INFO MappedDStream: metadataCleanupDelay = -1
16/05/13 14:49:02 INFO FileInputDStream: metadataCleanupDelay = -1
16/05/13 14:49:02 INFO FileInputDStream: Slide time = 30000 ms
16/05/13 14:49:02 INFO FileInputDStream: Storage level = StorageLevel(false, false, false, false, 1)
16/05/13 14:49:02 INFO FileInputDStream: Checkpoint interval = null
16/05/13 14:49:02 INFO FileInputDStream: Remember duration = 60000 ms
16/05/13 14:49:02 INFO FileInputDStream: Initialized and validated org.apache.spark.streaming.dstream.FileInputDStream@76bba21e
16/05/13 14:49:02 INFO MappedDStream: Slide time = 30000 ms
16/05/13 14:49:02 INFO MappedDStream: Storage level = StorageLevel(false, false, false, false, 1)
16/05/13 14:49:02 INFO MappedDStream: Checkpoint interval = null
16/05/13 14:49:02 INFO MappedDStream: Remember duration = 30000 ms
16/05/13 14:49:02 INFO MappedDStream: Initialized and validated org.apache.spark.streaming.dstream.MappedDStream@54c59c38
16/05/13 14:49:02 INFO CSharpDStream: Slide time = 30000 ms
16/05/13 14:49:02 INFO CSharpDStream: Storage level = StorageLevel(false, false, false, false, 1)
16/05/13 14:49:02 INFO CSharpDStream: Checkpoint interval = null
16/05/13 14:49:02 INFO CSharpDStream: Remember duration = 30000 ms
16/05/13 14:49:02 INFO CSharpDStream: Initialized and validated org.apache.spark.streaming.api.csharp.CSharpDStream@650c8b7c
16/05/13 14:49:02 INFO ForEachDStream: Slide time = 30000 ms
16/05/13 14:49:02 INFO ForEachDStream: Storage level = StorageLevel(false, false, false, false, 1)
16/05/13 14:49:02 INFO ForEachDStream: Checkpoint interval = null
16/05/13 14:49:02 INFO ForEachDStream: Remember duration = 30000 ms
16/05/13 14:49:02 INFO ForEachDStream: Initialized and validated org.apache.spark.streaming.dstream.ForEachDStream@74690abf
16/05/13 14:49:03 INFO RecurringTimer: Started timer for JobGenerator at time 1463122170000
16/05/13 14:49:03 INFO JobGenerator: Started JobGenerator at 1463122170000 ms
16/05/13 14:49:03 INFO JobScheduler: Started JobScheduler
16/05/13 14:49:03 INFO StreamingContext: StreamingContext started
16/05/13 14:49:30 INFO FileInputDStream: Finding new files took 11 ms
16/05/13 14:49:30 INFO FileInputDStream: New files at time 1463122170000 ms:

CSharp transform callback failed with java.net.SocketException: Connection reset
java.net.SocketException: Connection reset
at java.net.SocketInputStream.read(SocketInputStream.java:189)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at java.net.SocketInputStream.read(SocketInputStream.java:203)
at java.io.DataInputStream.readByte(DataInputStream.java:265)
at org.apache.spark.api.csharp.SerDe$.readObjectType(SerDe.scala:20)
at org.apache.spark.api.csharp.SerDe$.readObject(SerDe.scala:24)
at org.apache.spark.streaming.api.csharp.CSharpDStream$.callCSharpTransform(CSharpDStream.scala:59)
at org.apache.spark.streaming.api.csharp.CSharpDStream.compute(CSharpDStream.scala:105)
at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:352)
at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:352)
at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:351)
at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:351)
at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:426)
at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:346)
at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:344)
at scala.Option.orElse(Option.scala:257)
at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:341)
at org.apache.spark.streaming.dstream.ForEachDStream.generateJob(ForEachDStream.scala:47)
at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:115)
at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:114)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
at org.apache.spark.streaming.DStreamGraph.generateJobs(DStreamGraph.scala:114)
at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$3.apply(JobGenerator.scala:248)
at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$3.apply(JobGenerator.scala:246)
at scala.util.Try$.apply(Try.scala:161)
at org.apache.spark.streaming.scheduler.JobGenerator.generateJobs(JobGenerator.scala:246)
at org.apache.spark.streaming.scheduler.JobGenerator.org$apache$spark$streaming$scheduler$JobGenerator$$processEvent(JobGenerator.scala:181)
at org.apache.spark.streaming.scheduler.JobGenerator$$anon$1.onReceive(JobGenerator.scala:87)
at org.apache.spark.streaming.scheduler.JobGenerator$$anon$1.onReceive(JobGenerator.scala:86)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
16/05/13 14:49:31 INFO StreamingContext: Invoking stop(stopGracefully=false) from shutdown hook
16/05/13 14:49:31 INFO JobGenerator: Stopping JobGenerator immediately
16/05/13 14:49:31 INFO RecurringTimer: Stopped timer for JobGenerator after time 1463122170000
16/05/13 14:49:31 INFO JobScheduler: No jobs added for time 1463122170000 ms
16/05/13 14:49:31 INFO CheckpointWriter: CheckpointWriter executor terminated ? true, waited for 0 ms.
16/05/13 14:49:31 INFO JobGenerator: Stopped JobGenerator
16/05/13 14:49:31 INFO JobScheduler: Stopped JobScheduler
16/05/13 14:49:31 INFO StreamingContext: StreamingContext stopped successfully
16/05/13 14:49:31 INFO SparkContext: Invoking stop() from shutdown hook
16/05/13 14:49:31 INFO SparkUI: Stopped Spark web UI at http://10.168.187.17:4040
16/05/13 14:49:31 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/05/13 14:49:31 INFO MemoryStore: MemoryStore cleared
16/05/13 14:49:31 INFO BlockManager: BlockManager stopped
16/05/13 14:49:31 INFO BlockManagerMaster: BlockManagerMaster stopped
16/05/13 14:49:31 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/05/13 14:49:31 INFO SparkContext: Successfully stopped SparkContext
16/05/13 14:49:31 INFO ShutdownHookManager: Shutdown hook called
16/05/13 14:49:31 INFO ShutdownHookManager: Deleting directory E:\sparktemp\wordcount\spark-67034994-c974-4e95-8e5e-ce8b7a503aa8\httpd-eb5a5791-bfe5-428c-bf92-2398cd0c00a9
16/05/13 14:49:31 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/05/13 14:49:31 INFO ShutdownHookManager: Deleting directory E:\sparktemp\wordcount\spark-67034994-c974-4e95-8e5e-ce8b7a503aa8
16/05/13 14:49:31 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.

The log file E:\sparktemp\log.txt shows:
SPARKCLR_JAR=spark-clr_2.10-1.6.100.jar
[sparkclr-submit.cmd] Command to run --master local[4] --conf spark.local.dir=E:\sparktemp\wordcount\ --name SparkClrHdfsWordCount --class org.apache.spark.deploy.csharp.CSharpRunner E:\Java\OpenSourceSoftware\spark-clr_2.10-1.6.100\runtime\lib\spark-clr_2.10-1.6.100.jar E:\Java\OpenSourceSoftware\spark-clr_2.10-1.6.100\examples\Streaming\HdfsWordCount E:\Java\OpenSourceSoftware\spark-clr_2.10-1.6.100\examples\Streaming\HdfsWordCount\SparkClrHdfsWordCount.exe E:\sparktemp\cp\ C:\Users\dacai\Mobius\dacaiTemp
[CSharpRunner.main] Starting CSharpBackend!
[CSharpRunner.main] Port number used by CSharpBackend is 44600
[CSharpRunner.main] adding key=spark.jars and value=file:/E:/Java/OpenSourceSoftware/spark-clr_2.10-1.6.100/runtime/lib/spark-clr_2.10-1.6.100.jar to environment
[CSharpRunner.main] adding key=spark.app.name and value=SparkClrHdfsWordCount to environment
[CSharpRunner.main] adding key=spark.submit.deployMode and value=client to environment
[CSharpRunner.main] adding key=spark.master and value=local[4] to environment
[CSharpRunner.main] adding key=spark.local.dir and value=E:\sparktemp\wordcount\ to environment
[2016-05-13T06:48:54.9589864Z] [DAWEICAI-Z420] [Info] [ConfigurationService] ConfigurationService runMode is LOCAL
[2016-05-13T06:48:54.9609888Z] [DAWEICAI-Z420] [Info] [SparkCLRConfiguration] CSharpBackend successfully read from environment variable CSHARPBACKEND_PORT
[2016-05-13T06:48:54.9609888Z] [DAWEICAI-Z420] [Info] [SparkCLRIpcProxy] CSharpBackend port number to be used in JvMBridge is 44600
[2016-05-13T06:48:58.6821475Z] [DAWEICAI-Z420] [Info] [SparkConf] Spark app name set to HdfsWordCount
[2016-05-13T06:49:02.6525224Z] [DAWEICAI-Z420] [Info] [StreamingContextIpcProxy] Callback server port number is 44675
[CSharpBackendHandler] Connecting to a callback server at port 44675
[2016-05-13T06:49:30.0754966Z] [DAWEICAI-Z420] [Debug] [StreamingContextIpcProxy] New thread (id=5) created to process callback request
[2016-05-13T06:49:30.0924992Z] [DAWEICAI-Z420] [Error] [StreamingContextIpcProxy] Exception processing call back request. Thread id 5
[2016-05-13T06:49:30.0995051Z] [DAWEICAI-Z420] [Exception] [StreamingContextIpcProxy] Exception of type 'System.OutOfMemoryException' was thrown.
at Microsoft.Spark.CSharp.Interop.Ipc.SerDe.ReadBytes(Stream s, Int32 length) in c:\projects\sparkclr\csharp\Adapter\Microsoft.Spark.CSharp\Interop\Ipc\SerDe.cs:line 212
at Microsoft.Spark.CSharp.Interop.Ipc.SerDe.ReadBytes(Stream s) in c:\projects\sparkclr\csharp\Adapter\Microsoft.Spark.CSharp\Interop\Ipc\SerDe.cs:line 252
at Microsoft.Spark.CSharp.Proxy.Ipc.StreamingContextIpcProxy.ProcessCallbackRequest(Object socket) in c:\projects\sparkclr\csharp\Adapter\Microsoft.Spark.CSharp\Proxy\Ipc\StreamingContextIpcProxy.cs:line 284
[2016-05-13T06:49:30.0995051Z] [DAWEICAI-Z420] [Error] [StreamingContextIpcProxy] ProcessCallbackRequest fail, will exit ...
[CSharpRunner.main] closing CSharpBackend
Requesting to close all call back sockets.
[CSharpRunner.main] Return CSharpBackend code 1
Utils.exit() with status: 1, maxDelayMillis: 1000


hebinhuang commented on May 10, 2024

@kasuosuo Please try the command below:
sparkclr-submit.cmd --master local[*] --exe SparkClrHdfsWordCount.exe E:\Java\OpenSourceSoftware\spark-clr_2.10-1.6.100\examples\Streaming\HdfsWordCount E:\sparktemp\cp C:\Users\dacai\Mobius\dacaiTemp 1> 1.txt 2>2.txt
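With that redirection, 1.txt captures standard output (anything the ForeachRDD callback prints via Console.WriteLine) and 2.txt captures standard error (the Spark and log4j logging), which makes it easier to tell whether the C# side is producing any output at all.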

