amplab / docker-scripts
Dockerfiles and scripts for Spark and Shark Docker images
Line 61 of start_spark_cluster.sh will fail if the host machine has the http_proxy environment variable set, because the proxy server cannot connect to MASTER_IP, which is a private IP.
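A possible workaround sketch (not part of the repo scripts): clear the proxy variables for the deploy session so requests to the private MASTER_IP are made directly. The no_proxy value below is illustrative; adjust it to your setup.

```shell
# Clear the proxy for this shell so the deploy scripts reach the private
# MASTER_IP directly instead of going through the proxy server.
unset http_proxy https_proxy
# Alternatively, keep the proxy but exempt the cluster hostnames
# (hostnames here are illustrative):
export no_proxy="master,localhost"
echo "http_proxy=${http_proxy:-unset} no_proxy=$no_proxy"
```

With the variables cleared, line 61's connection attempt goes straight to the container's address.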
Is there any plan or idea for when the scripts for dockerizing Spark 1.1 will become available?
Hi all,
I'm trying to deploy a Spark 0.9.0 cluster, but the script is blocked at the stage waiting for master.
The logs from the spark-master container are as follows:
SPARK_HOME=/opt/spark-0.9.0
HOSTNAME=master
SCALA_VERSION=2.10.3
PATH=/opt/spark-0.9.0:/opt/scala-2.10.3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
SPARK_VERSION=0.9.0
PWD=/
JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
SHLVL=1
HOME=/
SCALA_HOME=/opt/scala-2.10.3
container=lxc
_=/usr/bin/env
MASTER_IP=172.17.0.6
preparing Spark
starting Hadoop Namenode
starting sshd
starting Spark Master
And docker ps -a shows:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
8c19bdb67f2b amplab/spark-master:0.9.0 /root/spark_master_f 4 minutes ago Exit 1 boring_wright
Is the Exit 1 status correct?
I have a Vagrant setup running the docker scripts with Docker 0.9. I also have a simple Maven project that tries to replicate your shell example. I keep getting failures on submission.
Java Main is:
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class SparkMain {
    protected static String master = "spark://master:7077"; // change to your master URL
    protected static String sparkHome = "/opt/spark-0.9.0";

    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(master, "Test App",
                sparkHome, JavaSparkContext.jarOfClass(SparkMain.class));
        JavaRDD<String> file = sc.textFile("hdfs://master:9000/user/hdfs/test.txt");
        // JavaRDD<String> file = sc.textFile("README.md");
        System.out.println(file.count());
        sc.stop();
    }
}
When running the test with "README.md", I see an error that it cannot find "/vagrant/README.md". In that case I don't understand why Spark thinks the file is relative to the Vagrant VM and not the Docker containers.
When I use the hdfs url, then I just get a lot of these:
14/05/09 00:05:50 INFO scheduler.DAGScheduler: Missing parents: List()
14/05/09 00:05:50 INFO scheduler.DAGScheduler: Submitting Stage 0 (MappedRDD[1] at textFile at SparkMain.java:19), which has no missing parents
14/05/09 00:05:50 INFO scheduler.DAGScheduler: Submitting 2 missing tasks from Stage 0 (MappedRDD[1] at textFile at SparkMain.java:19)
14/05/09 00:05:50 INFO scheduler.TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
14/05/09 00:05:55 INFO client.AppClient$ClientActor: Executor updated: app-20140509000548-0012/0 is now FAILED (Command exited with code 1)
14/05/09 00:05:55 INFO cluster.SparkDeploySchedulerBackend: Executor app-20140509000548-0012/0 removed: Command exited with code 1
14/05/09 00:05:55 INFO client.AppClient$ClientActor: Executor added: app-20140509000548-0012/3 on worker-20140508215925-worker3-43556 (worker3:43556) with 1 cores
14/05/09 00:05:55 INFO cluster.SparkDeploySchedulerBackend: Granted executor ID app-20140509000548-0012/3 on hostPort worker3:43556 with 1 cores, 512.0 MB RAM
I've tried several things:
The 1.0.0 boot2docker VM doesn't have bash installed! So one must use tce-ab to install bash.
This issue was previously raised and closed by dbaba (#39), but the suggested solution is not working for me.
Following the example explained in https://amplab.cs.berkeley.edu/2013/10/23/got-a-minute-spin-up-a-spark-cluster-on-your-laptop-with-docker/, here is what I've tried so far:
sudo ./docker-scripts/deploy/deploy.sh -i amplab/spark:1.0.0 -c
sudo ./docker-scripts/deploy/deploy.sh -i amplab/spark:0.8.0 -c
sudo ./docker-scripts/deploy/deploy.sh -i amplab/spark:1.0.0
sudo ./docker-scripts/deploy/deploy.sh -i amplab/spark:0.8.0
sudo ./docker-scripts/deploy/deploy.sh -i amplab/spark:1.0.0 -w 3
My output is
[vagrant@docker ~]$ sudo ./docker-scripts/deploy/deploy.sh -i amplab/spark:1.0.0 -w 3
*** Starting Spark 1.0.0 ***
starting nameserver container
started nameserver container: ee09901077c4c1a61e3cdb4d79b30060de777abae4c9fd0580b2176a2aa4f58a
DNS host->IP file mapped: /tmp/dnsdir_28254/0hosts
NAMESERVER_IP: 172.17.0.20
waiting for nameserver to come up
starting master container
started master container: 723fafc1b39de3fe6a28130968b4d65e8adf78ea131176621203f362a689ab8e
MASTER_IP: 172.17.0.21
waiting for master ................
waiting for nameserver to find master
starting worker container
started worker container: 3a061e3091fe435f18673c29c586e58e024011620f7864fd55191cbad23001d5
starting worker container
started worker container: 1c2be21cc224942f4d26efacc9b0d09ed900affaee1bdafff0bf9a2ddf54b4c3
starting worker container
started worker container: 65bc96a8777e1e68dcece7bf73dfcf09b4a2d64bb2372a9ce0916ed99fc52692
waiting for workers to register ................................................ (the dots continue indefinitely)
Is there a way to check the installation log of these containers?
I noted that in the spark-base Dockerfile, PATH is set as follows:
ENV PATH $SPARK_HOME:$SCALA_HOME/bin:$PATH
It is probably harmless (the file has always been like this), but it looks like /bin is missing after $SPARK_HOME.
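The observation can be checked with a small sketch (paths mirror the Dockerfile values; this just expands the ENV line and looks for the bin directory):

```shell
# Reproduce what the Dockerfile's ENV line expands to -- note there is no
# /bin after $SPARK_HOME, so $SPARK_HOME/bin never lands on PATH.
export SPARK_HOME=/opt/spark-0.9.0
export SCALA_HOME=/opt/scala-2.10.3
PATH_AS_BUILT="$SPARK_HOME:$SCALA_HOME/bin:$PATH"
case ":$PATH_AS_BUILT:" in
  *":$SPARK_HOME/bin:"*) echo "spark bin dir on PATH" ;;
  *)                     echo "spark bin dir missing from PATH" ;;
esac
```

Since the launcher scripts (spark-shell and friends) live under $SPARK_HOME/bin, the corrected line would presumably be ENV PATH $SPARK_HOME/bin:$SCALA_HOME/bin:$PATH.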
I'm getting this message and it runs forever:
sudo ./deploy/deploy.sh -i amplab/spark:0.9.0
*** Starting Spark 0.9.0 ***
starting nameserver container
WARNING: WARNING: Local (127.0.0.1) DNS resolver found in resolv.conf and containers can't use it. Using default external servers : [8.8.8.8 8.8.4.4]
started nameserver container: 4a6ba6682fc59b1ea99fc82644c16fd8c6b5aeffa158b3143076e74422640564
DNS host->IP file mapped: /tmp/dnsdir_17059/0hosts
NAMESERVER_IP: 172.17.0.5
waiting for nameserver to come up ............
I am a Spark and Docker newbie, and this is actually a question rather than an issue.
I followed your instructions and was able to set up the cluster and run the example. This is my cluster status:
vagrant@packer-virtualbox-iso:/vagrant/sparkling$ sudo docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
8f5d44eefa65 amplab/spark-worker:0.9.0 /root/spark_worker_f About an hour ago Up About an hour 8888/tcp prickly_lumiere
33c48ef9d17e amplab/spark-worker:0.9.0 /root/spark_worker_f About an hour ago Up About an hour 8888/tcp stoic_feynman
d91e47ed0b90 amplab/spark-worker:0.9.0 /root/spark_worker_f About an hour ago Up About an hour 8888/tcp ecstatic_babbage
e173ecd4f4c0 amplab/spark-master:0.9.0 /root/spark_master_f About an hour ago Up About an hour 7077/tcp, 8080/tcp berserk_nobel
d67f979d70fe amplab/dnsmasq-precise:latest /root/dnsmasq_files/ About an hour ago Up About an hour
I have written a Spark program for linear regression which runs perfectly in local mode. It is a very small program and is on github here
Now I want to run this program on my Spark cluster. The instructions in the Spark programming guide leave me scratching my head about what to do next. I'd like your help to know the right way to run the application.
If this has been explained elsewhere, please point me to it; I could not find any example of how to run an application program on a Spark cluster.
Thank you very much.
I am using the Spark 1.0.0 docker images. It appears to me that the default_cmd script in spark-worker is not working as it should. This script calls prepare_spark $1 from /root/spark_files/configure_spark.sh. I have debugged it a lot; I have even called configure_spark.sh from the spark-base image using docker run -it.
The problem is that these scripts do not replace the __MASTER__ tag in core-site.xml under /root/hadoop_files/ with the argument provided. Instead, the worker expects the master to be named master; that is, the name is static.
Please, can someone help me out with this? I need it to create clusters on different machines. If I cannot specify the master like this, I cannot create a cluster across machines, as the worker nodes will not know about the master. It does work on a single machine, but only because I have installed the docker-dns service.
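A minimal sketch of the substitution the worker setup is expected to perform (the placeholder name and master address follow the issue text; the target file here is a stand-in so the sketch is self-contained):

```shell
# Substitute the __MASTER__ placeholder with the address passed on the
# command line, as prepare_spark is expected to do for core-site.xml.
MASTER_HOSTNAME="10.132.232.22"   # e.g. the IP given to `docker run`
printf '<value>hdfs://__MASTER__:9000</value>\n' > /tmp/core-site-snippet.xml
sed -i "s/__MASTER__/${MASTER_HOSTNAME}/g" /tmp/core-site-snippet.xml
cat /tmp/core-site-snippet.xml
```

If the worker's core-site.xml still contains the literal name master after startup, this substitution step is the place to look.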
I repeatedly get the following error:
started nameserver container: 23fbb2b99f1a3de88ca310ab992f9ec93eb2fe201860509bcc98324e43532535
DNS host->IP file mapped: /tmp/dnsdir_16657/0hosts
NAMESERVER_IP:
waiting for nameserver to come up Usage: grep [OPTION]... PATTERN [FILE]...
Try 'grep --help' for more information.
dig: couldn't get address for '': not found
.Usage: grep [OPTION]... PATTERN [FILE]...
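A plausible reading of these errors (a reconstruction, not the actual deploy script): NAMESERVER_IP came back empty, so the wait loop effectively ran dig with an empty address and grep with no pattern, producing exactly the usage messages above. A guard like this would fail fast instead:

```shell
# Simulate the failed IP lookup seen in the log and guard against it
# before entering the wait loop.
NAMESERVER_IP=""
msg=""
if [ -z "$NAMESERVER_IP" ]; then
  msg="error: nameserver container has no IP address; inspect the container"
  echo "$msg"
fi
```

The underlying cause is usually that the container exited immediately or that docker inspect returned nothing, so checking the container's state first is the quickest diagnostic.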
It looks like this project was started prior to the release of docker's host networking feature.
The use of dnsmasq here is for reverse DNS (since the container and host don't share the same hostname), right? Now that one can do docker run --net host -P ... to make the container and host share the same network interface, a custom reverse DNS solution is no longer necessary. This approach does work: I've gotten Spark and HDFS running in Docker 1.1.2 without custom DNS. You might consider removing the dnsmasq dependency, or perhaps clarifying to users which Docker versions require custom DNS.
Thanks for posting these scripts, though! They've had significant educational value 8)
I was trying to run Spark 1.0.0 on docker with this project scripts.
After installing python and bash with tce-ab, I ran the following command in the boot2docker VM:
sudo ./deploy/deploy.sh -i amplab/spark:1.0.0 -c
then the command never finished.
Did I miss something?
The terminal showed the message waiting for workers to register followed by dots. The command didn't seem stuck, as the dots kept appearing, but it never finished even after a couple of hours.
Here is the entire log until ctrl + c:
*** Starting Spark 1.0.0 ***
starting nameserver container
Unable to find image 'amplab/dnsmasq-precise' locally
Pulling repository amplab/dnsmasq-precise
started nameserver container: 8a6c93484ff992538fb7d706cf7d348477920a90f41a6ce7068120c2afb4d04f
DNS host->IP file mapped: /tmp/dnsdir_25925/0hosts
NAMESERVER_IP: 172.17.0.2
waiting for nameserver to come up
starting master container
Unable to find image 'amplab/spark-master:1.0.0' locally
Pulling repository amplab/spark-master
started master container: 009ca3cb6a3bb034af5aaac7f7898d37d7f898d692a1e27cfc28b205982c6575
MASTER_IP: 172.17.0.3
waiting for master ............
waiting for nameserver to find master
starting worker container
Unable to find image 'amplab/spark-worker:1.0.0' locally
Pulling repository amplab/spark-worker
started worker container: ed2ddaefd11e2934cd7ac0d5717f244a64e6997d55791a2b272400642918ffe9
starting worker container
started worker container: 1ffec6ce05937bf37cb96d9ecd3f0cdfd6c66a75c098f964608af144fa3d9f3b
waiting for workers to register ................................................
(the dots continue until interrupted)
Any suggestion is appreciated.
I tried to follow the Spark example (spark:0.8.0 image) but I get errors because no service is running on port 9000:
$ sudo docker attach 27550fe348c3410c50ff7a7a395a7444f79945fbc980dc78b401a96b75a54a3d
sudo: unable to resolve host ip-10-244-4-249
14/02/08 16:34:56 INFO ipc.Client: Retrying connect to server: master/172.17.0.3:9000. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/08 16:34:57 INFO ipc.Client: Retrying connect to server: master/172.17.0.3:9000. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/08 16:34:58 INFO ipc.Client: Retrying connect to server: master/172.17.0.3:9000. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/08 16:34:59 INFO ipc.Client: Retrying connect to server: master/172.17.0.3:9000. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/08 16:35:00 INFO ipc.Client: Retrying connect to server: master/172.17.0.3:9000. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/08 16:35:01 INFO ipc.Client: Retrying connect to server: master/172.17.0.3:9000. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
14/02/08 16:35:02 INFO ipc.Client: Retrying connect to server: master/172.17.0.3:9000. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
put: Call to master/172.17.0.3:9000 failed on connection exception: java.net.ConnectException: Connection refused
starting Spark Shell
Of course the sample will fail:
scala> val textFile = sc.textFile("hdfs://master:9000/user/hdfs/test.txt")
14/02/08 16:43:21 INFO MemoryStore: ensureFreeSpace(36192) called with curMem=0, maxMem=530593873
14/02/08 16:43:21 INFO MemoryStore: Block broadcast_0 stored as values to memory (estimated size 35.3 KB, free 506.0 MB)
textFile: org.apache.spark.rdd.RDD[String] = MappedRDD[1] at textFile at <console>:12
scala> textFile.count()
14/02/08 16:43:29 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/02/08 16:43:29 WARN LoadSnappy: Snappy native library not loaded
14/02/08 16:43:30 INFO Client: Retrying connect to server: master/172.17.0.3:9000. Already tried 0 time(s).
14/02/08 16:43:31 INFO Client: Retrying connect to server: master/172.17.0.3:9000. Already tried 1 time(s).
14/02/08 16:43:32 INFO Client: Retrying connect to server: master/172.17.0.3:9000. Already tried 2 time(s).
14/02/08 16:43:33 INFO Client: Retrying connect to server: master/172.17.0.3:9000. Already tried 3 time(s).
14/02/08 16:43:34 INFO Client: Retrying connect to server: master/172.17.0.3:9000. Already tried 4 time(s).
14/02/08 16:43:35 INFO Client: Retrying connect to server: master/172.17.0.3:9000. Already tried 5 time(s).
14/02/08 16:43:36 INFO Client: Retrying connect to server: master/172.17.0.3:9000. Already tried 6 time(s).
14/02/08 16:43:37 INFO Client: Retrying connect to server: master/172.17.0.3:9000. Already tried 7 time(s).
14/02/08 16:43:38 INFO Client: Retrying connect to server: master/172.17.0.3:9000. Already tried 8 time(s).
14/02/08 16:43:39 INFO Client: Retrying connect to server: master/172.17.0.3:9000. Already tried 9 time(s).
java.net.ConnectException: Call to master/172.17.0.3:9000 failed on connection exception: java.net.ConnectException: Connection refused
Connecting on the master and checking the open ports, I get:
# lsof -n|grep LIST
sshd 131 root 3u IPv4 36387 0t0 TCP *:ssh (LISTEN)
sshd 131 root 4u IPv6 36389 0t0 TCP *:ssh (LISTEN)
java 172 hdfs 12u IPv6 36486 0t0 TCP 172.17.0.3:7077 (LISTEN)
java 172 hdfs 17u IPv6 36490 0t0 TCP *:http-alt (LISTEN)
Running Docker version 0.8.0, build cc3a8c8
$ docker -v
Docker version 0.7.0, build 0d078b6
$ sudo docker pull -t="0.8.0" amplab/shark-master
Pulling repository amplab/shark-master
2013/12/03 10:58:32 Server error: 404 trying to fetch remote history for amplab/shark-master
The same command works well for the amplab/spark-master image.
While running spark cluster with docker 0.7, I am getting this error:
13/12/03 19:04:38 ERROR StandaloneExecutorBackend: error while creating actor
java.net.UnknownHostException: 1a183a2affd5: Name or service not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:866)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1258)
at java.net.InetAddress.getAllByName0(InetAddress.java:1211)
at java.net.InetAddress.getAllByName(InetAddress.java:1127)
at java.net.InetAddress.getAllByName(InetAddress.java:1063)
at java.net.InetAddress.getByName(InetAddress.java:1013)
at akka.remote.netty.ActiveRemoteClient$$anonfun$connect$1.apply$mcV$sp(Client.scala:170)
at akka.util.Switch.liftedTree1$1(LockUtil.scala:33)
at akka.util.Switch.transcend(LockUtil.scala:32)
at akka.util.Switch.switchOn(LockUtil.scala:55)
at akka.remote.netty.ActiveRemoteClient.connect(Client.scala:158)
at akka.remote.netty.NettyRemoteTransport.send(NettyRemoteSupport.scala:153)
at akka.remote.RemoteActorRef.$bang(RemoteActorRefProvider.scala:247)
at org.apache.spark.executor.StandaloneExecutorBackend.preStart(StandaloneExecutorBackend.scala:48)
at akka.actor.ActorCell.create$1(ActorCell.scala:508)
at akka.actor.ActorCell.systemInvoke(ActorCell.scala:600)
at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:209)
at akka.dispatch.Mailbox.run(Mailbox.scala:178)
at akka.dispatch.ForkJoinExecutorConfigurator$MailboxExecutionTask.exec(AbstractDispatcher.scala:516)
at akka.jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:259)
at akka.jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:975)
at akka.jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479)
at akka.jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
Versions of Ubuntu and Docker:
Ubuntu 12.04.3 LTS , Release: 12.04, Codename: precise
Docker version 0.7.0, build 0d078b6
The docs don't give the default password or a way to set it while creating the containers.
The PySpark shell is currently not supported very well by the Python installation in the Spark base images. Although the changes are small, they require rebuilding the base images. See http://www.rankfocus.com/run-berkeley-sparks-pyspark-using-docker-couple-minutes/
qianyuxiang@qianyuxiangdeMacBook-Pro:~$sudo docker-scripts/deploy/deploy.sh -i amplab/spark:1.0.0 -w 2 -c
*** Starting Spark 1.0.0 ***
starting nameserver container
time="2015-12-07T21:24:26+08:00" level=fatal msg="Post http:///var/run/docker.sock/v1.18/containers/create: dial unix /var/run/docker.sock: no such file or directory. Are you trying to connect to a TLS-enabled daemon without TLS?"
error: could not start nameserver container from image amplab/dnsmasq-precise
I just cloned the repository from GitHub and ran the script. It does work on my virtual CentOS machine, though.
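A likely reading of this error (an assumption, not a confirmed diagnosis): the docker client fell back to the local unix socket and no daemon was listening there. On OS X, the client has to be pointed at the VM running the daemon; the address below is the boot2docker-era default and may differ on your machine (boot2docker shellinit prints the real values):

```shell
# Point the docker client at the daemon running inside the VM instead of
# the nonexistent local /var/run/docker.sock (address is illustrative).
export DOCKER_HOST=tcp://192.168.59.103:2376
echo "docker client will talk to: $DOCKER_HOST"
```

On the CentOS VM this works because the daemon really is listening on the local socket there.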
When I follow the README instructions on OS X with boot2docker v1.0.0, I get the following:
docker@boot2docker:~$ git clone https://github.com/amplab/docker-scripts.git
Cloning into 'docker-scripts'...
remote: Reusing existing pack: 1011, done.
remote: Counting objects: 50, done.
remote: Compressing objects: 100% (39/39), done.
remote: Total 1061 (delta 18), reused 31 (delta 10)
Receiving objects: 100% (1061/1061), 144.43 KiB, done.
Resolving deltas: 100% (429/429), done.
docker@boot2docker:~$ cd docker-scripts/
docker@boot2docker:~/docker-scripts$ sudo ./deploy/deploy.sh
sudo: unable to execute ./deploy/deploy.sh: No such file or directory
This is a general question about Spark on Docker. Let me know if there is a better place to ask this. I asked a similar question on the Spark dev list.
I am having trouble building Spark and running all the unit tests within a Docker container. The JVM complains that there isn't enough memory, though I believe I've set the appropriate JAVA_OPTS
and granted the Docker container plenty of memory.
Do you folks have some instructions on how to build Spark from source and run all the unit tests within a Docker container? I took a look through the scripts here but couldn't find anything.
For the record, I'm trying to build and test Spark as follows:
# start the container like this
# docker run -m 4g -t -i centos bash
export JAVA_OPTS="-Xms512m -Xmx1024m -XX:PermSize=64m -XX:MaxPermSize=128m -Xss512k"
# build
sbt/sbt -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl -Phive -Phive-thriftserver package assembly/assembly
# Scala unit tests
sbt/sbt -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl -Phive -Phive-thriftserver catalyst/test sql/test hive/test mllib/test
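One thing worth checking (an assumption, not a confirmed fix for this build): the sbt launcher reads SBT_OPTS rather than JAVA_OPTS, so heap and permgen settings for the build itself may need to go there. Values below are illustrative:

```shell
# Give the sbt JVM its memory settings via SBT_OPTS; JAVA_OPTS set in the
# shell is not necessarily picked up by the sbt launcher script.
export SBT_OPTS="-Xmx2g -XX:MaxPermSize=512m -Xss2m"
echo "SBT_OPTS=$SBT_OPTS"
```

The `docker run -m 4g` limit caps the container, but the JVM still only takes what its own flags allow, so both layers need to agree.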
Hi!
I'm trying to start a Spark cluster, but I have a problem with the nameserver container. The deploy script is stuck at waiting for nameserver to come up. When I check the nameserver container, I see the following logs:
dnsmasq: cannot access directory /etc/dnsmasq.d: Permission denied
(the same line repeats continuously)
The command I used was: sudo ./deploy/deploy.sh -i amplab/spark:1.0.0 -w 3
Am I forgetting some param? I tried passing the -v flag and it did not work.
I am using the Spark 1.0.0 docker images. When I start the master node with a hostname other than "master", it simply fails. Moreover, the worker nodes try to contact the master node using the name master instead of the IP provided as the command-line argument to docker run. The argument does change /etc/hadoop/core-site.xml, but why does the worker still contact the master node by the name "master"? The logs of the master and worker follow, respectively:
1- Master log with hostname other than master:
core@coreos-2 ~ $ docker run -itP -h master spark-master:1.0.0
core@coreos-2 ~ $ docker run -itP spark-master:1.0.0
SPARK_HOME=/opt/spark-1.0.0
HOSTNAME=ad28c0356f17
TERM=xterm
SCALA_VERSION=2.10.3
PATH=/opt/spark-1.0.0:/opt/scala-2.10.3/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
SPARK_VERSION=1.0.0
PWD=/
JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
SHLVL=1
HOME=/root
SCALA_HOME=/opt/scala-2.10.3
_=/usr/bin/env
MASTER_IP=172.17.0.2
preparing Spark
starting Hadoop Namenode
starting sshd
starting Spark Master
starting org.apache.spark.deploy.master.Master, logging to /opt/spark-1.0.0-bin-hadoop1/sbin/../logs/spark-hdfs-org.apache.spark.deploy.master.Master-1-ad28c0356f17.out
Warning: SPARK_MEM is deprecated, please use a more specific config option
(e.g., spark.executor.memory or SPARK_DRIVER_MEMORY).
Spark Command: /usr/lib/jvm/java-7-openjdk-amd64/bin/java -cp ::/opt/spark-1.0.0-bin-hadoop1/conf:/opt/spark-1.0.0-bin-hadoop1/lib/spark-assembly-1.0.0-hadoop1.0.4.jar -XX:MaxPermSize=128m -Dspark.akka.logLifecycleEvents=true -Xms800m -Xmx800m org.apache.spark.deploy.master.Master --ip master --port 7077 --webui-port 8080
========================================
14/09/30 09:19:19 INFO SecurityManager: Changing view acls to: hdfs
14/09/30 09:19:19 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hdfs)
14/09/30 09:19:20 INFO Slf4jLogger: Slf4jLogger started
14/09/30 09:19:20 INFO Remoting: Starting remoting
Exception in thread "main" java.net.UnknownHostException: master: Name or service not known
at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901)
at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293)
at java.net.InetAddress.getAllByName0(InetAddress.java:1246)
at java.net.InetAddress.getAllByName(InetAddress.java:1162)
at java.net.InetAddress.getAllByName(InetAddress.java:1098)
at java.net.InetAddress.getByName(InetAddress.java:1048)
at akka.remote.transport.netty.NettyTransport$$anonfun$addressToSocketAddress$1.apply(NettyTransport.scala:382)
at akka.remote.transport.netty.NettyTransport$$anonfun$addressToSocketAddress$1.apply(NettyTransport.scala:382)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:42)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2- Worker log:
core@coreos-1 ~/docker-scripts/spark-1.0.0/spark-worker $ docker run -P -h worker spark-worker:1.0.0 10.132.232.22
WORKER_IP=172.17.0.54
preparing Spark
starting Hadoop Datanode
* Starting Apache Hadoop Data Node server hadoop-datanode
starting datanode, logging to /var/log/hadoop//hadoop--datanode-worker.out
...done.
starting sshd
starting Spark Worker
Warning: SPARK_MEM is deprecated, please use a more specific config option
(e.g., spark.executor.memory or SPARK_DRIVER_MEMORY).
14/09/30 09:33:38 INFO SecurityManager: Changing view acls to: hdfs
14/09/30 09:33:38 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hdfs)
14/09/30 09:33:39 INFO Slf4jLogger: Slf4jLogger started
14/09/30 09:33:40 INFO Remoting: Starting remoting
14/09/30 09:33:40 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkWorker@worker:48571]
14/09/30 09:33:40 INFO Worker: Starting Spark worker worker:48571 with 1 cores, 1500.0 MB RAM
14/09/30 09:33:40 INFO Worker: Spark home: /opt/spark-1.0.0
14/09/30 09:33:41 INFO WorkerWebUI: Started WorkerWebUI at http://worker:8081
14/09/30 09:33:41 INFO Worker: Connecting to master spark://master:7077...
14/09/30 09:33:41 WARN Remoting: Tried to associate with unreachable remote address [akka.tcp://sparkMaster@master:7077]. Address is now gated for 60000 ms, all messages to this address will be delivered to dead letters.
14/09/30 09:33:41 INFO RemoteActorRefProvider$RemoteDeadLetterActorRef: Message [org.apache.spark.deploy.DeployMessages$RegisterWorker] from Actor[akka://sparkWorker/user/Worker#-1054615506] to Actor[akka://sparkWorker/deadLetters] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
14/09/30 09:34:01 INFO Worker: Connecting to master spark://master:7077...
14/09/30 09:34:01 INFO RemoteActorRefProvider$RemoteDeadLetterActorRef: Message [org.apache.spark.deploy.DeployMessages$RegisterWorker] from Actor[akka://sparkWorker/user/Worker#-1054615506] to Actor[akka://sparkWorker/deadLetters] was not delivered. [2] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
14/09/30 09:34:21 INFO Worker: Connecting to master spark://master:7077...
14/09/30 09:34:21 INFO RemoteActorRefProvider$RemoteDeadLetterActorRef: Message [org.apache.spark.deploy.DeployMessages$RegisterWorker] from Actor[akka://sparkWorker/user/Worker#-1054615506] to Actor[akka://sparkWorker/deadLetters] was not delivered. [3] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
14/09/30 09:34:41 ERROR Worker: All masters are unresponsive! Giving up.
P.S: The worker container is present on different machine (coreos-1). Therefore, it cannot connect to master, as there is no global discovery service.
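One way to make the static name "master" resolve on a worker running on another machine, without a discovery service (a sketch assuming Docker >= 1.3, which added --add-host; the IP is the one from the logs above):

```shell
# Hypothetical invocation -- --add-host injects an /etc/hosts entry into
# the container so the hardcoded name "master" resolves:
#   docker run -P -h worker --add-host master:10.132.232.22 spark-worker:1.0.0
# The injected entry is equivalent to this hosts line:
printf '10.132.232.22\tmaster\n' > /tmp/extra-hosts
cat /tmp/extra-hosts
```

This sidesteps the static-name problem without patching the images, at the cost of wiring the master's address into every worker's run command.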