
ibmcb / cbtool

77 stars · 77 watchers · 49 forks · 46.69 MB

Cloud Rapid Experimentation and Analysis Toolkit

License: Apache License 2.0

Python 71.76% Java 0.27% Ruby 0.24% C++ 0.27% HTML 5.93% CSS 0.03% JavaScript 0.39% Shell 18.17% Makefile 0.01% R 2.42% Batchfile 0.01% Filebench WML 0.35% Go 0.17%
azure benchmark-framework benchmarking cloud digitalocean docker ec2 gce kubernetes libvirt lxc openstack python27 stress-testing workload-generator

cbtool's People

Contributors

bowers, cpschult, jdesfossez, jtaleric, maugustosilva, mraygalaxy, mraygalaxy2, pawelzell, pdmazz, psuriset, rayx, salmanbaset, suryaprabhakar, taowu-andy, tli16, tta0114


cbtool's Issues

unable to match netcat dependency

Using the latest code from https://github.com/ibmcb/cbtool.git and
running ~/cbtool/install (the result is the same when running the steps manually), I get:
(16) Checking "netcat" version by executing the command "netcat -v -w 1 localhost -z 22"...
RESULT: NOT OK.
ACTION: Please install/configure "netcat" by issuing the following command: "sudo rpm -Uvh ftp://ftp.pbone.net/mirror/ftp5.gwdg.de/pub/opensuse/repositories/home:/rusjako/Fedora_19/x86_64/netcat-openbsd-1.89-119.1.x86_64.rpm;"

Running
sudo rpm -Uvh ftp://ftp.pbone.net/mirror/ftp5.gwdg.de/pub/opensuse/repositories/home:/rusjako/Fedora_19/x86_64/netcat-openbsd-1.89-119.1.x86_64.rpm;
Retrieving ftp://ftp.pbone.net/mirror/ftp5.gwdg.de/pub/opensuse/repositories/home:/rusjako/Fedora_19/x86_64/netcat-openbsd-1.89-119.1.x86_64.rpm
warning: /var/tmp/rpm-tmp.f4IDt5: Header V3 DSA/SHA1 Signature, key ID 0ae6233b: NOKEY
error: Failed dependencies:
libc.so.6(GLIBC_2.14)(64bit) is needed by netcat-openbsd-1.89-119.1.x86_64
libc.so.6(GLIBC_2.15)(64bit) is needed by netcat-openbsd-1.89-119.1.x86_64
libc.so.6(GLIBC_2.16)(64bit) is needed by netcat-openbsd-1.89-119.1.x86_64

and this unfortunately happens with both Red Hat 6.4 and 6.5.
Note that I am registered to an internal IBM repository and use ibm-yum.sh to install packages.

Jump-box FIP attach failure

Apr 19 16:32:39 host.com [2016-04-19 16:32:39,845] [WARNING] osk_cloud_ops.py/OskCmds.floating_ip_attach TEST_stack - (While getting instance(s) through API call "floating ip attach") Invalid input for field/attribute address. Value: False. False is valid under each of {'format': 'ipv6'}, {'format': 'ipv4'} (HTTP 400) (Request-ID: req-b726f240-2051-4297-ab7d-e5a907401778)

Error while executing the command line "wget -N -P ~ URL;tar -xzf hadoop*.gz;"

Reporting this for tracking purposes; I have a workaround, but the code still hits it.

./install --wks hadoop

(52) Checking "hadoop" version by executing the command ". .bashrc; ~/hadoop-1.2.1/bin/hadoop version | head -n 1 | cut -d ' ' -f 2"...
RESULT: NOT OK.
ACTION: Please install/configure "hadoop" by issuing the following command: "wget -N -P ~ URL;tar -xzf hadoop*.gz;"

(52) Installing "hadoop" by executing the command "wget -N -P ~ URL;tar -xzf hadoop*.gz;"...
RESULT: NOT OK. There was an error while installing "hadoop".: Error while executing the command line "wget -N -P ~ URL;tar -xzf hadoop*.gz;" (returncode = 3316) :--2014-04-07 05:29:18-- http://url/
Resolving url... failed: Name or service not known.
wget: unable to resolve host address 'url'

tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now

Issue with novaclient when running against OpenStack Ocata

I recently ran cbtool against OpenStack Ocata, and in order to get cbtool to connect to my cloud I had to downgrade novaclient to version 6.0.0. Should the version of novaclient be pinned, to avoid pulling a new version that might behave slightly differently?

I will attempt to recreate the issue and post the problem I had with the newer version of novaclient as time permits.
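
One way to guard against that drift (illustrative; this is not cbtool's actual dependency list) is a hard pin in a pip requirements file:

# requirements.txt (illustrative pin)
python-novaclient==6.0.0

Installing with "pip install -r requirements.txt" would then always pull the version known to work against Ocata.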

New API call, "aiprepare", to automatically build the workloads (AI types)

Main idea:
aiprepare <image name> <username> [password]

CBTOOL would perform the following sequence of steps:

  1. boot an instance using the aforementioned image name on the new cloud (e.g., "blank centos-7").
  2. connect to the instance using the supplied username (and, if needed, the password with the "sshpass" utility)
  3. transfer (via rsync) a copy of the CB code
  4. inject as many pubkeys as needed
  5. execute the automated installer there (i.e., ~/cbtool/install -r workload --wks <ai_type>)
  6. capture the image, using the default names specified in the [VM_TEMPLATES] object
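
A minimal Python sketch of steps 2, 3 and 5, driving plain sshpass/rsync/ssh invocations; boot_instance() and capture_instance() are hypothetical stand-ins for the cloud-specific CBTOOL calls in steps 1 and 6:

# Sketch only: boot_instance() and capture_instance() are hypothetical
# placeholders for the cloud-specific boot (step 1) and capture (step 6).
import os
import subprocess

def aiprepare(image_name, username, ai_type, password=None):
    ip = boot_instance(image_name)                        # step 1 (hypothetical)
    target = "%s@%s" % (username, ip)
    prefix = ["sshpass", "-p", password] if password else []
    # steps 2 and 3: connect and transfer a copy of the CB code tree
    subprocess.check_call(prefix + ["rsync", "-az",
                          os.path.expanduser("~/cbtool") + "/",
                          target + ":cbtool/"])
    # step 5: run the automated installer for the requested AI type
    subprocess.check_call(prefix + ["ssh", target,
                          "~/cbtool/install -r workload --wks " + ai_type])
    capture_instance(ip)                                  # step 6 (hypothetical)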

Requirements:
a) Cloud needs to support the CBTOOL "capture" API call
___a.1) Virtually all supported clouds do have a proper implementation for the aforementioned operation
b) Every workload (AI type) needs an automated installer:
___b.1) Some AI types (e.g., open_daytrader,HPCC) currently do not have an automated installer
___b.2) Some AI types, such as Coremark, might not be "automatable" (coremark requires one to login on their website)

Note: This idea is a compilation of several discussions with Joe Talerico ([email protected]), Michael Hines ([email protected]) and Salman Baset ([email protected])

issues solved for running hadoop workload

I had to apply the following workarounds to get a vApp running Hadoop working correctly:

  1. On the master, [cbuser@SC-172-18-16-22 ~]$ ~/cb_start_hadoop_cluster.sh was failing with the following:

<175> - /home/cbuser/cb_start_hadoop_cluster.sh: ...Formating Namenode...
Error: JAVA_HOME is not set.
<175> - /home/cbuser/cb_start_hadoop_cluster.sh: Error when formatting namenode as user cbuser - NOK

To solve this I had to set
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.45.x86_64/jre
in ./hadoop-1.2.1/conf/hadoop-env.sh in the image

  2. There was a missing [ in file cb_config_hadoop_cluster.sh at line 110:
    [cbuser@SC-172-18-16-19 ~]$ cat -n ~/cb_config_hadoop_cluster.sh | grep -i 110
    110 if [ $? -ne 0 ]]

  3. On the base hadoop image I had to do the following for the user in use:
    su - cbuser
    cd .ssh
    echo "StrictHostKeyChecking no" > config

Installing as root, dependencies file not found running configure

I'm trying to install cbtool on RHEL 6.4 as root. When running the configure command I get the following error:

[root@co098090 ~]# ~/cbtool/configure
Checking for Cloud Rapid Experimentation Analysis and Toolkit dependencies on this node.........

Error reading file "/root/cbtool/configs/templates/dependencies.txt":[Errno 2] No such file or directory: '/root/cbtool/configs/templates/dependencies.txt'

The dependencies.txt file is located in the configs folder, not in configs/templates.
Moving the file into the templates folder seems to bypass the problem.

Thanks

"repo" dependency check always fails

Even though I issue the command "sudo mv -f /tmp/*.repo /etc/yum.repos.d; touch /tmp/repoupdated;", each time I run configure I still get this error:

(1) Checking "repo" version by executing the command "ls -la /tmp/repoupdated"...
RESULT: NOT OK.
ACTION: Please install/configure "repo" by issuing the following command: "sudo mv -f /tmp/*.repo /etc/yum.repos.d; touch /tmp/repoupdated;"
...
There are 1 dependencies missing: repo
Please add the missing dependency(ies) and re-run configure again.

AI time not reported - redis dependency

I noticed that the data load time wasn't being reported in the CloudBench GUI for a hadoop job.

The hadoop job launch script takes time stamps from the cloudbench orchestrator node using the Redis TIME command. The TIME command wasn't implemented in Redis until 2.6.0, whereas the dependency check for redis on the orchestrator node only looks for redis-ver = 2.5.0.

A possible resolution (if this is an issue) is to make the requirement 2.6.0 or later on the orchestrator node.
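
The gap is easy to demonstrate with the redis-py client (a quick check, assuming a Redis server on the orchestrator's default port):

# TIME exists only on Redis >= 2.6.0; on an older server this call raises
# redis.exceptions.ResponseError instead of returning a timestamp.
import redis

r = redis.StrictRedis(host="localhost", port=6379, db=0)
seconds, microseconds = r.time()
print("server time: %d.%06d" % (seconds, microseconds))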

connect to OpenStack

I came across the problem below when trying to connect to OpenStack.
Error: API Service says:23: The "osk" cloud named "OPENSTACK" could not be attached to this experiment: VMC "RegionOne" did not pass the connection test." : OpenStack connection failure: The request you have made requires authentication. (HTTP 401) (Request-ID: req-9297799c-6402-40de-b43d-6c6acab4d053)

I am sure my username and password are correct, and I really can access my OpenStack at 10.0.0.11.
My configuration file:
# OpenStack (OSK) requires the following parameters (replace everything between <>, including the signs!)
[USER-DEFINED : CLOUDOPTION_MYOPENSTACK]
OSK_ACCESS = http://10.0.0.11:5000/v2.0/ # Address of controlled node (where nova-api runs)
OSK_CREDENTIALS = admin-admin-admin
OSK_SECURITY_GROUPS = default # Make sure that this group exists first
OSK_INITIAL_VMCS = RegionOne:sut # Change "RegionOne" accordingly
OSK_LOGIN = cbuser # The username that logins on the VMs
OSK_NETNAME = WS

cassandra_ycsb AI result reporting is incorrect

It appears lazy_collection in scripts/cassandra_ycsb/cb_ycsb_common.sh may have a couple of problems.

  1. lazy_collection (the default collection mechanism) differs from eager_collection in that it uses different units for latency measurements. I'm having trouble capturing the output from the YCSB run command to verify the measurement units, but I assume they are the same as those used in the generation phase (which are microseconds for average measurements).
...
[OVERALL], RunTime(ms), 646267.0
[OVERALL], Throughput(ops/sec), 1547.3480774973812
[INSERT], Operations, 1000000
**[INSERT], AverageLatency(us), 641.728206**
[INSERT], MinLatency(us), 252
[INSERT], MaxLatency(us), 129799
[INSERT], 95thPercentileLatency(ms), 0
[INSERT], 99thPercentileLatency(ms), 1
[INSERT], Return=0, 1000000
...

I think the fix is pretty simple. Something like this:

$diff a/scripts/cassandra_ycsb/cb_ycsb_common.sh b/scripts/cassandra_ycsb/cb_ycsb_common.sh
-    latency:$(expr $latency):ms \
+    latency:$(expr $latency):us \
  2. But there appears to be another problem. lazy_collection looks for READ and UPDATE measurements in the output file (this seems OK for YCSB workload 'workloadd'), but as written it reports only one of the values and gives no indication of which one was returned. Should it be written like eager_collection, which reports both read and update values back to the cb orchestrator?

I haven't tested the proposed change below. It could have further-reaching consequences, since 'latency' is no longer returned in the result set; I don't know if that is a problem.

something like:

diff --git a/scripts/cassandra_ycsb/cb_ycsb_common.sh b/scripts/cassandra_ycsb/cb_ycsb_common.sh
index 00db685..46eeb5f 100755
--- a/scripts/cassandra_ycsb/cb_ycsb_common.sh
+++ b/scripts/cassandra_ycsb/cb_ycsb_common.sh


@@ -170,20 +174,20 @@ function lazy_collection {
         then
             if [[ ${array[1]} == *AverageLatency* ]]
             then
-                latency=${array[2]}
+                update_avg_latency=${array[2]}
             fi
         fi
         if [[ ${array[0]} == *READ* ]]
         then
             if [[ ${array[1]} == *AverageLatency* ]]
             then
-                latency=${array[2]}
+                read_avg_latency=${array[2]}
             fi
         fi

@@ -221,7 +225,8 @@ function lazy_collection {
     load_profile:${LOAD_PROFILE}:name \
     load_duration:${LOAD_DURATION}:sec \
     throughput:$(expr $ops):tps \
-    latency:$(expr $latency):us \
+    update_avg_latency:$(expr $update_avg_latency):us \
+    read_avg_latency:$(expr $read_avg_latency):us \
     datagen_time:${datagentime}:sec \
     datagen_size:${datagensize}:records \
     ${SLA_RUNTIME_TARGETS}

Simulated cloud issue with open_daytrader

~/cbtool/cb --hard_reset
aiattach ibm_daytrader works fine in the simulated cloud named MYSIMCLOUD, but when I try open_daytrader I get:
(MYSIMCLOUD) aiattach open_daytrader
status: Waiting for vm_4 (cloud-assigned uuid C8C5B19A-0210-5555-A0ED-54F91C38D906) to start...
status: Trying to establish network connectivity to vm_4 (cloud-assigned uuid C8C5B19A-0210-5555-A0ED-54F91C38D906), on IP address 214.32.106.21...
status: Bypassing the sending of a copy of the code tree to vm_4 (214.32.106.21)...
status: Sending a termination request for vm_4 (cloud-assigned uuid C8C5B19A-0210-5555-A0ED-54F91C38D906)....
AI object 605C9152-5FCF-560B-AEF5-F6AE1E5EB672 (named "ai_3") could not be attached to this experiment: AI pre-attachment operations failure: Parallel object operation failure: VM object D45263AA-A87A-588B-BD5B-0B822BD67C71 (named "vm_6") could not be attached to this experiment: VM pre-attachment operations failure: Unknown VM role: 'mysql' A rollback might be needed (only for VMs).
(MYSIMCLOUD)

Cassandra nodes can't join cluster concurrently

I have started testing with 6 node cassandra clusters and have hit the problem described below. Typically 1 or 2 nodes in the AI fail to join the cassandra cluster because of the error.

I am trying to add a delay in the initialization on each 'cassandra' node but don't have code that fixes the problem yet. If I find a solution I will share it.

The cassandra change is fairly new and appears to have been integrated into Cassandra in Sep. 2014 [1].

Error Information logged to /var/log/cassandra/system.log on deployed VM:

ERROR [main] 2015-01-16 14:19:36,301 CassandraDaemon.java:465 - Exception encountered during startup
java.lang.UnsupportedOperationException: Other bootstrapping/leaving/moving nodes detected, cannot bootstrap while cassandra.consistent.rangemovement is true
        at org.apache.cassandra.service.StorageService.joinTokenRing(StorageService.java:797) ~[apache-cassandra-2.1.2.jar:2.1.2]
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:652) ~[apache-cassandra-2.1.2.jar:2.1.2]
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:535) ~[apache-cassandra-2.1.2.jar:2.1.2]
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:324) [apache-cassandra-2.1.2.jar:2.1.2]
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:448) [apache-cassandra-2.1.2.jar:2.1.2]
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:537) [apache-cassandra-2.1.2.jar:2.1.2]

Cisco uses cassandra in one of its products and provides this guidance. They recommend two MINUTES between adding nodes.
http://www.cisco.com/c/en/us/td/docs/video/videoscape/cloud_object_store/2-1-1/release-notes/COS_RelNotes_2_1_1.html

CSCur99109  Bootstrapping issue when starting up the Cassandra service.
    When two or more nodes are added to a cluster in quick succession, an issue may arise in which the Cassandra service fails to initialize on the new node, preventing it from being added to the Cassandra cluster.
    This failure is indicated by the following error message in the Cassandra event log (/var/log/cassandra/cassandra.log):
    Exception encountered during startup: Other bootstrapping/leaving/moving nodes detected, cannot bootstrap while cassandra.consistent.rangemovement is true. 
      
    The root cause of this issue is that the node is being added while the Cassandra cluster topology is still in flux, so that initialization of the new node cannot proceed.
    To avoid this possible issue, be sure to provide (or have a script provide) a time delay of at least two minutes between adding two nodes to the cluster in sequence.

[1] http://mail-archives.apache.org/mod_mbox/cassandra-commits/201409.mbox/%[email protected]%3E

cbtool support for cassandra 2.2.x?

Is there a plan in place for cbtool to support cassandra 2.2.x? In my initial testing with cassandra 2.2.1 I've found two problems that prevent cbtool from interacting with cassandra 2.2.x.

  1. cassandra-cli was deprecated in cassandra 2.1 and removed in cassandra 2.2 (contrary to their guidance, which claimed it would not be removed until 3.0 [1]). The alternative is to use cqlsh, which appears to be a drop-in replacement for cassandra-cli. I'll try a test soon to confirm.

  2. Cassandra 2.2.x appears to disable remote JMX by default. cbtool currently relies on remote JMX access to determine the cluster status. I've changed the configuration in cassandra-env.sh to allow remote JMX and disable authentication to work around this problem (see the sketch below); cbtool may need to perform this configuration automatically.

[1] http://wiki.apache.org/cassandra/CassandraCli
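
The JMX change mentioned in item 2 looks roughly like this in cassandra-env.sh (a sketch under the assumption of a stock Cassandra 2.x script; option names may vary between versions):

# cassandra-env.sh (sketch)
LOCAL_JMX=no                      # allow remote (non-local) JMX connections
JVM_OPTS="$JVM_OPTS -Dcom.sun.management.jmxremote.authenticate=false"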

Key `app_datagen_size` missing during beginning of AI

During the very beginning of a SPEC run, the first call to get_latest_app_data returns the following set of keys:

[u'app_completion_time', u'app_errors', u'expid', u'time_cbtool', u'app_iterations', u'app_load_duration', u'app_load_level', u'app_load_profile', u'time_cbtool_h', u'app_quiescent_time', u'time', u'_id', u'time_h', u'app_datagen_time', u'app_load_id', u'uuid']

You can see that the key is missing; this is with the latest master. Something changed recently, because this did not happen before.
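
Until the regression is located, callers can guard for the missing key (a defensive sketch against the API call named above; api, cloud_name and ai_uuid are placeholders for whatever the caller already has):

_sample = api.get_latest_app_data(cloud_name, ai_uuid)
if _sample and "app_datagen_size" in _sample:
    datagen_size = _sample["app_datagen_size"]
else:
    datagen_size = None   # key absent at the beginning of the run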

Run the CloudBench orchestrator inside of the cloud

In order to run cbtool, we need access to both the management network of the cloud and the VMs inside the cloud.
I installed cbtool on a physical server that has access to the management network of OpenStack, but I don't know how to connect to the VMs inside the cloud.
I found a solution: Option 0: sidestep the problem by running the CB Orchestrator directly in the cloud.
My question: if we run the CB Orchestrator directly in the cloud, how can we access the management network of OpenStack?

configure after install dependency issue

If I run ~/cbtool/configure after ~/cbtool/install, I get:
There are 1 dependencies missing: softlayer
This did not appear at the end of the install command, which automatically installs every dependency, so there is some issue with configure when it is run right afterwards.

monextract for vm_runtime_os data fails on keys time_cbtool and time_cbtool_h

I was not able to extract VM_RUNTIME_OS data from VMs after setting mon_defaults.collect_from_guest to True.

Running monextract manually returned an error:
(PVC2A) monextract PVC2A VM os
Monitor extraction failure: 'time_cbtool'

It appears the default vm_rumtime_os_metrics_header value includes 'time_cbtool' and 'time_cbtool_h'. Neither of those values are present in the runtime_os data. ( confirmed by looking at data in mongo )

I removed those two metrics from the mon_default value for key vm_rumtime_os_metrics and successfully extracted the VM OS data (CPU, MEM, etc.).

I can see where 'time_cbtool' is generated when load managers report their application results. ( scripts/common/cb_common.py.report_app_metrics() ) But I don't see a routine to report OS metrics in that file.

Recommended fixes:
-- Remove time_cbtool and time_cbtool_h from the list of default columns to extract for vm_runtime_os data. (Looks like they may have been added through the use of TSTAMP value in REPORTED_RUNTIME_OS_VM_METRIC_NAMES. configs/templates/internal_options.txt )
-- Add time_cbtool and time_cbtool_h to vm_rumtime_os data generation. ( in gmetad.py? )

Floating IPs

Need to integrate floating IPs into the osk client. OpenStack Neutron does not currently auto-assign floating IPs, so we need to add to the cbtool client a method of allocating a floating IP, associating it with an instance, and disassociating it.
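
A sketch of the allocate/associate/disassociate cycle with the python-novaclient API of that era (these calls were later removed in favor of Neutron, so treat them as version-dependent; 'nova' is an authenticated client and 'server' a Server object):

fip = nova.floating_ips.create(pool="public")  # allocate from a pool
server.add_floating_ip(fip.ip)                 # associate with the instance
server.remove_floating_ip(fip.ip)              # disassociate
fip.delete()                                   # release back to the pool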

Documentation Issues

It seems that some commands are still undocumented:

(CSFSM2PVC) help airun
Traceback (most recent call last):
File "./cb", line 143, in
main()
File "./cb", line 137, in main
_cmd_processor.cmdloop()
File "/usr/lib64/python2.6/cmd.py", line 142, in cmdloop
stop = self.onecmd(line)
File "/usr/lib64/python2.6/cmd.py", line 219, in onecmd
return func(arg)
File "/root/cloudbench/lib/auxiliary/cli.py", line 1011, in do_help
Cmd.do_help(self, args)
File "/usr/lib64/python2.6/cmd.py", line 313, in do_help
func()
TypeError: help() takes exactly 1 argument (0 given)

Failed dependency - sshd

I get this dependency failure when installing CloudBench:

(52) Checking "sshd" version by executing the command "sudo cat /etc/ssh/sshd_config | grep -v ^# | grep UseDNS | grep no"...
RESULT: NOT OK.
ACTION: Please install/configure "sshd" by issuing the following command: "sed -i 's/.UseDNS./UseDNS no/g' /etc/ssh/sshd_config; sed -i 's/.GSSAPIAuthentication./GSSAPIAuthentication no/g' /etc/ssh/sshd_config;"

Making the above changes to /etc/ssh/sshd_config does not resolve the dependency check.

This appears to be because the PUBLIC_dependencies.txt file is missing a version for the sshd_config check.

The dependency check will pass if you add into PUBLIC_dependencies.txt
sshd-ver = ANY
right before
### END - Dependency versions ###

aiattach daytrader failure in 'Running the tool for the first time' exercise

Running the tool for the first time in the emulated cloud, I am not able to deploy the Daytrader virtual application:

(MYSIMCLOUD) aiattach daytrader
AI object 110FEBFC-6498-5EF1-A603-177EB7E84B90 (named "ai_1") could not be attached to this experiment: AI pre-attachment operations failure: VM list creation failure: 'sut'

The Hadoop one, by contrast, deploys successfully.

Thanks

vmattach failure with "unknown VM role" error message

I have an image defined in Openstack (CbtoolTemplate) that can be deployed without errors from Openstack (nova boot).
When I try to deploy it from cbtool I get this error:

(MYOPENSTACK) vmattach CLEANVM
VM object 3EF8EC13-58E1-5509-BFCE-9813C849297E (named "vm_1") could not be attached to this experiment: VM pre-attachment operations failure: Unknown VM role: 'CLEANVM'

I have tried several combinations for the definition of the role for this VM, modifying the following files:
/home/cbuser/cbtool/configs/cbuser_cloud_definitions.txt
/home/cbuser/cbtool/configs/templates/_openstack.txt

Even the attempt of changing one of the default roles (MONGODB) fails with the same message:

(MYOPENSTACK) vmattach MONGODB
VM object 9DAC3CA0-C657-5C8E-BCEF-791607FBF8C0 (named "vm_1") could not be attached to this experiment: VM pre-attachment operations failure: Unknown VM role: 'MONGODB'

Here are some examples of the changes I made to the two config files trying to define a role for the VM:

in /home/cbuser/cbtool/configs/templates/_openstack.txt

MONGODB = size:m1.medium, imageid1:CbtoolTemplate

MONGODB = size:m1.medium, imageid1:mongo_instance

CLEANVM = size:m1.medium, imageid1:CbtoolTemplate

in /home/cbuser/cbtool/configs/cbuser_cloud_definitions.txt

[VM_TEMPLATES : OSK_CLOUDCONFIG]

CLEANVM = size:m1.medium, imageid1:CbtoolTemplate

MONGODB = size:m1.medium, imageid1:CbtoolTemplate

image name prefix causes issues in attaches

Assume the following two image names loaded in openstack cloud:

cb_speccloud_cassandra_2111_centos
cb_speccloud_cassandra_2111

The image prefix causes issues. For instance, when the latter image is selected, the centos image still gets picked. We need to fix how image names are handled when they share a common prefix.

load_level for AI cassandra_ycsb not loaded from virtual_application.txt

Changing the value of LOAD_LEVEL ( which maps to YCSB -threads parameter) in scripts/cassandra_ycsb/virtual_application.txt does not change the number of threads used by YCSB.

Details

This can be seen on the deployed ycsb VM:

cbuser@cs-root-csfsm2pvc-vm25-ycsb:~$ grep LOAD_LEVEL virtual_application.txt
LOAD_LEVEL = 4
DESCRIPTION +=- LOAD_LEVEL meaning: number of threads on YCSB (parameter -threads).\n
cbuser@cs-root-csfsm2pvc-vm25-ycsb:~$ source cb_common.sh
cbuser@cs-root-csfsm2pvc-vm25-ycsb:~$ get_my_ai_attribute load_level
1
cbuser@cs-root-csfsm2pvc-vm25-ycsb:~$ 

Recreate Steps:

  1. modify LOAD_LEVEL in scripts/cassandra_ycsb/virtual_application.txt on cbtool orchestrator node
  2. deploy cassandra_ycsb AI.
  3. SSH to the ycsb node and repeat the commands shown above.

Other Information:

  • Guest VMs are Ubuntu ppc64
  • ycsb, seed, and cassandra roles are satisfied by the same OpenStack glance image

EC2 volume deletion

When running with EBS-backed instances on EC2, when I detach a VM I get an error related to deleting the volume:

ec2_cloud_ops.py/Ec2Cmds.vmdestroy TEST_user - VM XXX could not be destroyed on Elastic Compute Cloud "EC2CLOUD" : Volume previously attached to
the vm_3 (cloud-assigned uuid i-xxxxxxxx) could not be destroyed on Elastic Compute Cloud "EC2CLOUD" : local variable '_volume' referenced before assignment

ec2_cloud_ops.py code includes a comment that implies there's a known issue around deleting volumes.

I'm working around the issue by commenting out the _volume.delete() line in vvdestroy(), and by commenting out the line unattachedvol.delete() in vmdestroy().
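
The "referenced before assignment" pattern usually means _volume is only bound inside a conditional branch. A minimal guard, sketched with boto-style objects rather than taken from ec2_cloud_ops.py, would avoid both the crash and the need to comment the deletes out:

# Sketch: the volume lookup is illustrative, not the actual cbtool code;
# ec2_connection and instance_cloud_uuid are placeholders.
_volume = None
for _vol in ec2_connection.get_all_volumes():
    if _vol.attach_data.instance_id == instance_cloud_uuid:
        _volume = _vol
        break
if _volume is not None:
    _volume.delete()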

VM OS resource usage collection

I was not able to extract VM OS metrics from VMs after setting [MON_DEFAULTS] COLLECT_FROM_GUEST = True.
The problem looks like this:
(MYOPENSTACK) monextract VM os
status: No samples of runtime_os metrics for all VMs were found (the file /home/ubuntu/cbtool/lib/auxiliary//../../data/EXP-09-20-2017-01-22-42-AM-UTC/VM_runtime_os_EXP-09-20-2017-01-22-42-AM-UTC.csv will be empty).
Monitor extraction success. VM runtime OS performance data samples were written to the file /home/ubuntu/cbtool/lib/auxiliary//../../data/EXP-09-20-2017-01-22-42-AM-UTC/VM_runtime_os_EXP-09-20-2017-01-22-42-AM-UTC.csv.
Data is available at the url "rsync://192.168.100.6:873/ubuntu_cb/data/EXP-09-20-2017-01-22-42-AM-UTC".

How to enable the VM OS resource usage collection function?
Thank you!!!!

Preparing Workload Images

Sorry to bother you!
I have tested CBTOOL against a "Simulated Cloud".
Workload images are needed before I test my OpenStack.
Is there any convenient way to get workload images?
If I prepare the images myself, are there any specific version requirements, for example for ycsb?
Hoping for your reply!!

install error

After installing, the next step is "Running the tool for the first time". However, I get the following error:

 ./cb --soft_reset
Cbtool version is "e8db5c7"
Parsing "cloud definitions" file..... "/home/murong/cbtool/lib/auxiliary//../..//configs/root_cloud_definitions.txt" opened and parsed successfully.

Checking "Object Store".....An Object Store of the kind "Redis" (shared) on node 127.0.0.1, TCP port 6379, database id "0" seems to be running.
Checking "Log Store"..... Error while executing the command line "sudo ps aux | tr '[:upper:]' '[:lower:]' | grep "rsyslogd -f /home/murong/cbtool/stores/root_rsyslog.conf -i /home/murong/cbtool/stores/rsyslog.pid" | grep -i "" |  grep "root" | grep -v grep" (returncode = 22503) :su: invalid option -- 'f'
Usage: su [options] [LOGIN]

Options:
  -c, --command COMMAND         pass COMMAND to the invoked shell
  -h, --help                    display this help message and exit
  -, -l, --login                make the shell a login shell
  -m, -p,
  --preserve-environment        do not reset environment variables, and
                                keep the same shell
  -s, --shell SHELL             use SHELL instead of the default in passwd


9

I need your help. Please help me, thank you my friend.

Both `app_datagen_size` and `app_datagen_time` are negative

This is also very easy to reproduce. Simply create a hadoop AI, wait for a few results to be reported.

You can graph the value like in the GUI. You'll see that the 1st value is normal, then the remaining values are all negative.

Using the commit from #151 , we get:

Example: [screenshots "negative" and "negative2" showing the negative values in the graphs]

Issues encountered using Cloudbench with ubuntu ppc64el

Below are listed the changes I made to get cloudbench/cbtool to install/run correctly with ubuntu 14.10 on ppc64el.

I'm using an IBM JVM with cassandra 2.1.2.
I was installing cassandra_ycsb and the null workload.
$ ./install --role workload --wks nullworkload,cassandra_ycsb
#1

The command to change options UseDNS and GSSAPIAuthentication should be prefixed with 'sudo'.

Error text from install:

(52) Checking "sshd" version by executing the command "sudo cat /etc/ssh/sshd_config | grep -v ^# | grep UseDNS | grep no"...
RESULT:  NOT OK.
ACTION:  Please install/configure "sshd" by issuing the following command: "sed -i 's/.*UseDNS.*/UseDNS no/g' /etc/ssh/sshd_config; sed -i 's/.*GSSAPIAuthentication.*/GSSAPIAuthentication no/g' /etc/ssh/sshd_config;"

(52) Installing "sshd" by executing the command "sed -i 's/.*UseDNS.*/UseDNS no/g' /etc/ssh/sshd_config; sed -i 's/.*GSSAPIAuthentication.*/GSSAPIAuthentication no/g' /etc/ssh/sshd_config;"...
RESULT: NOT OK. There was an error while installing "sshd".: Error while executing the command line "sed -i 's/.*UseDNS.*/UseDNS no/g' /etc/ssh/sshd_config; sed -i 's/.*GSSAPIAuthentication.*/GSSAPIAuthentication no/g' /etc/ssh/sshd_config;" (returncode = 12990) :sed: couldn't open temporary file /etc/ssh/sedTyg5Ug: Permission denied
sed: couldn't open temporary file /etc/ssh/sedSIE0Pg: Permission denied

#2

The default ubuntu sshd_config does not contain options 'UseDNS' or 'GSSAPIAuthentication'. The cloudbench installer fails the sshd dependency from PUBLIC_dependencies. I'm not sure the best approach to fix this problem. Perhaps check if the option is present ('grep -c ^UseDNS /etc/ssh/sshd_config') and append to the end of the file if not.
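
One way to express that check-and-append idea in Python (a sketch; the installer might equally do this in shell):

# Append 'UseDNS no' only when no UseDNS line exists at all; the sed
# commands in the dependency ACTION only rewrite lines that already exist.
import subprocess

CONF = "/etc/ssh/sshd_config"
with open(CONF) as f:
    present = any(line.strip().startswith("UseDNS") for line in f)
if not present:
    subprocess.check_call(["sudo", "sh", "-c",
                           "echo 'UseDNS no' >> %s" % CONF])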
#3

Comment out 'chef-client-order' from PUBLIC_dependencies.txt because chef client is not available for linux ppc64el.
#4

The softlayer vpn is only available as an amd64 package. I've commented softlayer-vpn-order from IBM_dependencies.txt as I don't need it. Also, should the VPN client be required on the 'workload' system? Shouldn't that only be required on the CB orchestrator system? ( Since the workload already resides on the private network? Anyway, not an issue if the host is x86_64. )
#5

ubuntu initially failed to install the 'repo' requirement.

(1) Installing "repo" by executing the command "sudo mv -f /tmp/*.list /etc/apt/sources.list.d/; sudo apt-get update; touch /tmp/repoupdated; source /home/cbuser/cloudbench/scripts//common/cb_bootstrap.sh; service_stop_disable iptables; service_stop_disable ipfw;"...
RESULT: NOT OK. There was an error while installing "repo".: Error while executing the command line "sudo mv -f /tmp/*.list /etc/apt/sources.list.d/; sudo apt-get update; touch /tmp/repoupdated; source /home/cbuser/cloudbench/scripts//common/cb_bootstrap.sh; service_stop_disable iptables; service_stop_disable ipfw;" (returncode = 3904) :mv: cannot stat ‘/tmp/*.list’: No such file or directory
/bin/sh: 1: source: not found
/bin/sh: 1: service_stop_disable: not found
/bin/sh: 1: service_stop_disable: not found

Reading python documentation suggests /bin/sh is the default executable unless another is specified. Apparently /bin/sh on ubuntu (ie. /bin/dash) doesn't provide support for sourcing files. ( http://stackoverflow.com/questions/13702425/source-command-not-found-in-sh-shell )

I was able to specify "/bin/bash" on Popen. (https://docs.python.org/2/library/subprocess.html)
With this change in place the repo requirement is completed without issue.

$ diff -C 3 lib/remote/process_management.py.orig lib/remote/process_management.py
*** lib/remote/process_management.py.orig       2014-12-01 11:47:47.930023003 -0600
--- lib/remote/process_management.py    2014-12-01 11:45:41.654023009 -0600
***************
*** 115,121 ****
          if str(really_execute).lower() == "true" :
              _msg = "running os command: " + _cmd
              cbdebug(_msg);
!             _proc_h = Popen(_cmd, shell=True, stdout=PIPE, stderr=PIPE)

              if _proc_h.pid :
                  if not cmdline.count("--debug_host=localhost") :
--- 115,121 ----
          if str(really_execute).lower() == "true" :
              _msg = "running os command: " + _cmd
              cbdebug(_msg);
!             _proc_h = Popen(_cmd, executable="/bin/bash", shell=True, stdout=PIPE, stderr=PIPE)

              if _proc_h.pid :
                  if not cmdline.count("--debug_host=localhost") :

#6

-- I downloaded a cassandra 2.1.2 package from http://debian.datastax.com/community/pool/ and http://www.apache.org/dist/cassandra/debian/pool/main/c/cassandra/. Both of these packages put cassandra configuration files in /etc/cassandra/ instead of /etc/cassandra/conf. This may be a 'feature' of the debian packages.

I hardcoded my copy of cbtool to use this path for cassandra configuration files. It should be possible to perform this test automatically (perhaps in cb_ycsb_common.sh?), though perhaps it would be better as a configuration parameter? I think this configuration path is unique to ubuntu systems.

Set the configuration path in cb_ycsb_common.sh

diff --git a/scripts/cassandra_ycsb/cb_ycsb_common.sh b/scripts/cassandra_ycsb/cb_ycsb_common.sh
index 00db685..084d836 100755
--- a/scripts/cassandra_ycsb/cb_ycsb_common.sh
+++ b/scripts/cassandra_ycsb/cb_ycsb_common.sh
@@ -54,6 +54,7 @@ BACKEND_TYPE=$(get_my_ai_attribute type | sed 's/_ycsb//g')

 if [[ $BACKEND_TYPE == "cassandra" ]]
 then
+    export CASSANDRA_CONFIG_DIR=/etc/cassandra/
     CASSANDRA_DATA_DIR=$(get_my_ai_attribute_with_default cassandra_data_dir /dbstore)
     eval CASSANDRA_DATA_DIR=${CASSANDRA_DATA_DIR}

Changes were made at each location where cassandra.yaml is referenced in the following files:
scripts/cassandra_ycsb/cb_modify_node.sh
scripts/cassandra_ycsb/cb_restart_node.sh
scripts/cassandra_ycsb/cb_restart_seed.sh

Issues encountered using Cloudbench with cassandra_ycsb

Below are listed the changes I made to get cbtool to deploy the AI cassandra_ycsb successfully with ubuntu ppc64el guests.

I'm using an IBM JVM with cassandra 2.1.2.
#1

-- I changed the default cassandra IP configuration applied by cbtool. By default cbtool was setting rpc_address in cassandra.yaml to 0.0.0.0. That is a problem for broadcast_rpc_address because it cannot be 0.0.0.0. Instead I've modified scripts/cassandra_ycsb/cb_modify_node.sh to set listen_address and rpc_address to empty strings. broadcast_rpc_address is left commented ( and is then set to the same value as rpc_address ). This has worked well during my testing.

diff --git a/scripts/cassandra_ycsb/cb_modify_node.sh b/scripts/cassandra_ycsb/cb_modify_node.sh
index 00e6d3e..d30cad7 100755
--- a/scripts/cassandra_ycsb/cb_modify_node.sh
+++ b/scripts/cassandra_ycsb/cb_modify_node.sh
@@ -62,18 +62,18 @@ then
 #
 # Update Cassandra Config
 #
-    sudo sed -i "s/initial_token:$/initial_token: ${my_token//[[:blank:]]/}/g" /etc/cassandra/conf/cassandra.yaml
-    sudo sed -i "s/- seeds:.*$/- seeds: $seeds_ips_csv/g" /etc/cassandra/conf/cassandra.yaml
-    sudo sed -i "s/listen_address:.*$/listen_address: ${MY_IP}/g" /etc/cassandra/conf/cassandra.yaml
-    sudo sed -i 's/rpc_address:.*$/rpc_address: 0\.0\.0\.0/g' /etc/cassandra/conf/cassandra.yaml
+    sudo sed -i "s/initial_token:$/initial_token: ${my_token//[[:blank:]]/}/g" $CASSANDRA_CONFIG_DIR/cassandra.yaml
+    sudo sed -i "s/- seeds:.*$/- seeds: $seeds_ips_csv/g" $CASSANDRA_CONFIG_DIR/cassandra.yaml
+    sudo sed -i "s/listen_address:.*$/listen_address: ${MY_IP}/g" $CASSANDRA_CONFIG_DIR/cassandra.yaml
+    sudo sed -i 's/rpc_address:.*$/rpc_address: 0\.0\.0\.0/g' $CASSANDRA_CONFIG_DIR/cassandra.yaml

That change allows cassandra to select addresses based on getLocalHost(). This is OK because cloudbench does a good job of setting the VM's hostname and populating the local hosts table.

Comments from cassandra.yaml that confirm the broadcast_rpc_address is set from rpc_address. Also, rpc_address and listen_address are set to the value returned from getLocalHost().

# Leaving [listen_address] blank leaves it up to InetAddress.getLocalHost(). This
# will always do the Right Thing _if_ the node is properly configured
# (hostname, name resolution, etc), and the Right Thing is to use the
# address associated with the hostname (it might not be).
...
# Leaving rpc_address blank has the same effect as on listen_address
# (i.e. it will be based on the configured hostname of the node).
...
# RPC address to broadcast to drivers and other Cassandra nodes. This cannot
# be set to 0.0.0.0. If left blank, this will be set to the value of
# rpc_address. If rpc_address is set to 0.0.0.0, broadcast_rpc_address must
# be set.
# broadcast_rpc_address: 1.2.3.4

#2

-- I also changed the cassandra node configuration applied by cbtool. Instead of manually generating tokens I enabled Murmur3 partitioner and virtual node support. This precludes the need to specify 'initial_token' and 'partitioner' as the default values work well. Virtual node support is enabled by leaving initial_token unconfigured and leaving num_tokens to its default value ( 256 ).

This change removes the requirement for cassandra-tools package to generate cassandra tokens during node startup for parameter initial_token. ( At least I found that extra package was required on ubuntu to generate initial_token values in cb_ycsb_common.sh with command 'token-generator'. ) The code that generates tokens ('token-generator') for the cassandra nodes could also be removed from cb_ycsb_common.sh.
#3

-- I downloaded a cassandra 2.1.2 package from http://debian.datastax.com/community/pool/ and http://www.apache.org/dist/cassandra/debian/pool/main/c/cassandra/. Both of these packages put cassandra configuration files in /etc/cassandra/ instead of /etc/cassandra/conf. This may be a 'feature' of the debian packages.

I hardcoded my copy of cbtool to use this path for cassandra configuration files. It should be possible to perform this test automatically (perhaps in cb_ycsb_common.sh?), though perhaps it would be better as a configuration parameter? I think this configuration path is unique to ubuntu systems.

Set the configuration path in cb_ycsb_common.sh

diff --git a/scripts/cassandra_ycsb/cb_ycsb_common.sh b/scripts/cassandra_ycsb/cb_ycsb_common.sh
index 00db685..084d836 100755
--- a/scripts/cassandra_ycsb/cb_ycsb_common.sh
+++ b/scripts/cassandra_ycsb/cb_ycsb_common.sh
@@ -54,6 +54,7 @@ BACKEND_TYPE=$(get_my_ai_attribute type | sed 's/_ycsb//g')

 if [[ $BACKEND_TYPE == "cassandra" ]]
 then
+    export CASSANDRA_CONFIG_DIR=/etc/cassandra/
     CASSANDRA_DATA_DIR=$(get_my_ai_attribute_with_default cassandra_data_dir /dbstore)
     eval CASSANDRA_DATA_DIR=${CASSANDRA_DATA_DIR}

Changes were made at each location where cassandra.yaml is referenced in the following files:
scripts/cassandra_ycsb/cb_modify_node.sh
scripts/cassandra_ycsb/cb_restart_node.sh
scripts/cassandra_ycsb/cb_restart_seed.sh
#4

I copied the block device formatting code from cb_restart_seed.sh to cb_restart_node.sh. It looked like it was an oversight instead of an intentional decision.

diff --git a/scripts/cassandra_ycsb/cb_restart_node.sh b/scripts/cassandra_ycsb/cb_restart_node.sh
index c7734de..1fa4f55 100755
--- a/scripts/cassandra_ycsb/cb_restart_node.sh
+++ b/scripts/cassandra_ycsb/cb_restart_node.sh
@@ -24,6 +24,27 @@ START=`provision_application_start`

 SHORT_HOSTNAME=$(uname -n| cut -d "." -f 1)

+CINDER=true
+#
+# Check if CBTool attached the Block storage volume.
+# !! Script assumes /dev/vdb !!
+#
+# Use /dev/sda for PowerVC. ( boot device is /dev/vda )
+ATTACHED_DEV_NAME=/dev/sda
+sudo mkfs.ext4 ${ATTACHED_DEV_NAME}
+if [ $? -ne 0 ] ; then
+  syslog_netcat "Cinder did not attach the volume, or the guest does not see it."
+  CINDER=false
+fi
+
+sudo mkdir -p ${CASSANDRA_DATA_DIR}
+
+if $CINDER ; then
+  sudo mount ${ATTACHED_DEV_NAME} ${CASSANDRA_DATA_DIR}
+fi

#5

PowerVC (based on openstack) attaches volumes to the guest with iSCSI through the host. The block device is a software scsi device and receives device name 'sda' instead of 'vdb' as cloudbench expects. Perhaps some smart code could be added to find a block device without a filesystem that cbtool can use?
I hardcoded the device string to the value that worked for my environment for now.
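
The "find a block device without a filesystem" idea could look like this (a sketch; the candidate device names are just the ones mentioned above):

# blkid exits non-zero when it finds no filesystem signature on a device,
# which marks it as safe to mkfs and mount over CASSANDRA_DATA_DIR.
import subprocess

def find_blank_device(candidates=("/dev/vdb", "/dev/sda", "/dev/sdb")):
    for dev in candidates:
        if subprocess.call(["sudo", "blkid", dev]) != 0:
            return dev
    return None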

VM OS metrics does not seem collected in default configuration

Hello, it seems OS metrics are not collected for a deployed VM. Below are the steps I executed:
./cb --hard_reset

monlist VM (does not report anything)

vmattach idlevm (this deploy a nullworkload VM from an image we prepared)

(MYOPENSTACK) monlist VM
The following VMs reported management metrics:
Name |Age (seconds) |Experiment id |Number of samples
vm_1 |238 |EXP-03-07-2014-06-09-14-AM-EST |1

The following VMs reported runtime (OS) metrics:
Name |Age (seconds) |Experiment id |Number of samples

The following VMs reported runtime (Application) metrics:
Name |Age (seconds) |Experiment id |Number of samples

The issue here is that OS metrics do not seem to be reported. Then I ran:

cldalter vm_defaults run_generic_scripts=False

vmattach idlevm (again)

(MYOPENSTACK) monlist VM
The following VMs reported management metrics:
Name |Age (seconds) |Experiment id |Number of samples
vm_1 |744 |EXP-03-07-2014-06-09-14-AM-EST |1
vm_2 |388 |EXP-03-07-2014-06-09-14-AM-EST |1

The following VMs reported runtime (OS) metrics:
Name |Age (seconds) |Experiment id |Number of samples

The following VMs reported runtime (Application) metrics:
Name |Age (seconds) |Experiment id |Number of samples

Remotely run ./cb_post_boot.sh

(MYOPENSTACK) monlist VM
The following VMs reported management metrics:
Name |Age (seconds) |Experiment id |Number of samples
vm_1 |852 |EXP-03-07-2014-06-09-14-AM-EST |1
vm_2 |496 |EXP-03-07-2014-06-09-14-AM-EST |1

The following VMs reported runtime (OS) metrics:
Name |Age (seconds) |Experiment id |Number of samples
vm_2 |4 |EXP-03-07-2014-06-09-14-AM-EST |1

The following VMs reported runtime (Application) metrics:
Name |Age (seconds) |Experiment id |Number of samples

So for this VM, OS metrics are collected only after deploying and then remotely running the script cb_post_boot.sh.

I have logs available for the orchestrator and the two deployed VMs.

Thanks

get_latest_management_data api returns pymongo cursor instead of dictionary

SPEC Cloud baseline scripts use 'get_latest_management_data' to retrieve deploy time measurements from each VM in an AI. Recently the API has been returning a pymongo.cursor.Cursor instead of a dictionary containing the expected data. It looks like the default value of variables 'samples' is forcing this behavior in 'lib/api/api_service_client.py'.

Is the cursor expected to be returned? The baseline scripts currently expect to receive a dictionary, not the cursor.

It is relatively easy to code around this. The following seems to work well enough:

_mgt_metric_list = self.api.get_latest_management_data(self.cloud_name, uuid)
# _mgt_metric_list is returned as a Cursor instead of a list. WHY?
import pymongo
if isinstance(_mgt_metric_list, pymongo.cursor.Cursor):
    if _mgt_metric_list.count() >= 1:
        _mgt_metric_list = _mgt_metric_list[0]

generated gmond configuration file incorrect

In my (limited) testing I have found cbtool only reports VM utilization data for one VM in each AI. I've been testing with AI cassandra_ycsb with 1 ycsb, 1 seed and 1 cassandra node. Only the ycsb node was reporting CPU usage, etc, on the CB GUI.

Upon further inspection, there appear to be two problems in cb_create_gmond_config_file.sh:

  1. Retrieving the COLLECTOR_UNICAST_IP should not specify ${my_ai_uuid} as a parameter to get_my_ai_attribute_with_default on line 26.
    For example:
cbuser@cb-root-csfsm2pvc-vm27-cassandra:~$ get_my_ai_attribute_with_default ${my_ai_uuid} metric_aggregator_ip none
metric_aggregator_ip
cbuser@cb-root-csfsm2pvc-vm27-cassandra:~$ get_my_ai_attribute_with_default metric_aggregator_ip none
192.168.148.43
cbuser@cb-root-csfsm2pvc-vm27-cassandra:~$

Removing ${my_ai_uuid} allows the script to pick up the real metric_aggregator_ip.

  2. If I understand correctly, one VM in an AI should run gmetad to collect gmond data from the other VMs in that deployed AI.
    In my testing the ycsb node was starting gmetad and gmond. (good)
    The seed and cassandra nodes only started gmond. (again, I think that is correct)

The problem on the cassandra and seed node is in the generated gmond configuration (gmond-vms.conf); COLLECTOR_UNICAST_IP is set to an IP address local to the VM ( instead of the ycsb node where gmetad is running).

I think this section of code at the top of cb_create_gmond_config_file.sh is the culprit. This code will always set COLLECTOR_UNICAST_IP to my_ip_addr. (Except on the ycsb node, where the grep in the conditional returns 1 and my_ip_addr is already equal to metric_aggregator_ip (and COLLECTOR_UNICAST_IP).)

if [[ $(sudo ifconfig -a | grep -c $COLLECTOR_UNICAST_IP) -eq 0 ]]
then
       COLLECTOR_UNICAST_IP=${my_ip_addr}
fi

I commented out the conditional block, and now the seed and cassandra nodes generate gmond configuration files that point at gmetad on the ycsb node. Ultimately the gmond data from all AI VMs is reported back to cbtool (and the CB GUI).

vCloud Director (VCD) and mongo password authentication

When using the vCloud Director driver, cb attempts to connect to the mongo metrics datastore using the password used to connect to the vCloud Director API.

I think this is due to poor naming for the VMC_DEFAULTS and VM_DEFAULTS parameters in the vCloud Director template (configs/templates/_vcd.txt). Both sections have a parameter named simply PASSWORD. The same parameter name is used as the optional password parameter in stores.txt. The default value for PASSWORD in stores.txt ("$False") is likely being replaced by the value in _vcd.txt.

A work around until I fix the VCD code is to set a password for the mongo user to the same value as your VCD API password.
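
Illustrative only (the parameter name below is hypothetical, not an existing cbtool option): giving the vCD credential its own name in _vcd.txt would stop it from shadowing the optional mongo PASSWORD in stores.txt:

[VMC_DEFAULTS]
VCD_API_PASSWORD = <vcd password>   # hypothetical name, avoids reusing PASSWORD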

problem with python-novaclient (4.0.0)

cb --hard_reset ends up with the error "'VolumeManager' object has no attribute 'list'".
novaclient v2 provides a different implementation of the VolumeManager class in volumes.py compared to novaclient v1.1 and does not have the list function.
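
A defensive probe on the client side (sketch; 'nova' is an authenticated novaclient instance) would make the incompatibility explicit instead of failing mid-reset:

# novaclient 4.0.0's VolumeManager dropped list(), so guard before calling.
if hasattr(nova.volumes, "list"):
    _volumes = nova.volumes.list()
else:
    raise RuntimeError("this python-novaclient has no VolumeManager.list(); "
                       "use a pre-4.0.0 release")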

Scaling UP Error

I run the cbtool orchestrator node on Amazon EC2. I am facing an issue scaling up the hadoop workload with the airesize command. The issue persists with the kmeans and terasort load profiles. I can scale down successfully. The error details are below:

(MYAMAZON) airesize MYAMAZON ai_1 hadoopslave +1
ubuntu@ip-172-31-54-218:~$ tail -f /var/log/cloudbench/root_operations.log
Jan 23 09:30:58 ip-172-31-54-218.ec2.internal [2017-01-23 09:30:58,652] [DEBUG] process_management.py/ProcessManagement.retriable_run_os_command - Command "~/cb_restart_hadoop_cluster.sh" failed to execute on hostname 172.31.57.69 after attempt 1. Will try 2 more times.
Jan 23 09:31:03 ip-172-31-54-218.ec2.internal [2017-01-23 09:31:03,840] [DEBUG] process_management.py/ProcessManagement.run_os_command - Error while executing the command line "~/cb_restart_hadoop_cluster.sh" (returncode = 3700) :Warning: Permanently added '172.31.57.69' (ECDSA) to the list of known hosts. bash: /home/cbuser/cb_restart_hadoop_cluster.sh: No such file or directory
Jan 23 09:31:03 ip-172-31-54-218.ec2.internal [2017-01-23 09:31:03,840] [DEBUG] process_management.py/ProcessManagement.retriable_run_os_command - Command "~/cb_restart_hadoop_cluster.sh" failed to execute on hostname 172.31.57.69 after attempt 2. Will try 1 more times.

clddetach hanging in 'running the tool for the 1st time' exercise

I have recently installed cbtool and then tried https://github.com/ibmcb/cbtool/wiki/HOWTO:-Running-the-tool-for-the-first-time. I followed this sequence of commands:
~/cbtool/cb --hard_reset
(MYSIMCLOUD) cldlist
(MYSIMCLOUD) vmclist
(MYSIMCLOUD) hostlist
(MYSIMCLOUD) vmattach db2
(MYSIMCLOUD) vmattach was
(MYSIMCLOUD) aiattach daytrader
(MYSIMCLOUD) aiattach hadoop
(MYSIMCLOUD) ailist
(MYSIMCLOUD) vmlist
(MYSIMCLOUD) clddetach
status: Waiting for all active AIDRS daemons to finish gracefully....
status: All AIDRS (daemons and objects were removed).
status: Removing all VMCRS objects attached to this experiment.
status: Removing all FIRS objects attached to this experiment.
status: Removing all AI objects attached to this experiment.
status: Sending a termination request for vm_3 (cloud-assigned uuid 693DA49E-4A85-56C5-B253-2C6FA0315155)....
status: Sending a termination request for vm_5 (cloud-assigned uuid 80D2D8AF-65A1-5469-9BF3-7458AEAAC861)....
status: Sending a termination request for vm_4 (cloud-assigned uuid 6E5BBC95-2265-5428-B814-E607C632F704)....
status: Sending a termination request for vm_8 (cloud-assigned uuid 66CA23D8-74B6-5EDA-A1D5-38D1ECE64ADD)....
status: Sending a termination request for vm_6 (cloud-assigned uuid 78F009D4-1361-5B75-AC25-11A6E5BFA40F)....
status: Sending a termination request for vm_9 (cloud-assigned uuid 4E9ACAD0-6D67-5F09-967D-81949AFE784B)....
status: Sending a termination request for vm_7 (cloud-assigned uuid 7BE425DE-3145-5A4E-973D-7F4858CE9EF8)....
^C status: Signal children to abort...
^C status: Done
status: Removing all VM objects attached to this experiment.
status: Sending a termination request for vm_1 (cloud-assigned uuid 99CE3336-FE43-582E-B548-2A67DF88B558)....
status: Sending a termination request for vm_2 (cloud-assigned uuid 78E3DE4E-C51B-5B22-BAB4-CA28B3F53ED1)....
status: Done
status: Removing all VMC objects attached to this experiment.
status: VMC 5995943D-688A-5BF9-9A08-E86E2D0E734B was successfully unregistered on SimCloud "MYSIMCLOUD".
status: VMC 911A154C-C493-517B-92D2-35AEBABDAEC3 was successfully unregistered on SimCloud "MYSIMCLOUD".
status: VMC F5B71C2F-E71D-5CAE-A7B7-3C92572B520F was successfully unregistered on SimCloud "MYSIMCLOUD".
status: VMC 7CC9B526-5A97-5B49-925C-3F79414483F9 was successfully unregistered on SimCloud "MYSIMCLOUD".
status: Done
status: Removing all contents from Object Store (GLOBAL objects,VIEWS, etc.)
Disassociating default cloud: MYSIMCLOUD
Cloud MYSIMCLOUD was successfully detached from this experiment.
() exit

As you can see, when trying to detach the cloud with clddetach the prompt hung, and only by hitting ^C twice was I able to make it close.
A similar sequence where I instead attached only the daytrader virtual application detached correctly.

Thanks

Floating IPs not deleted after restart of cbtool with soft_reset

I have run speccloud using cbtool with floating IP addresses on OpenStack Ocata and found that my floating IP subnet became exhausted: after each run I would use ./cb --soft_reset to clean up the instances, but the floating IPs remained. It would be nice if the floating IP addresses were deleted when the instances are deleted.
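
Until that is automated, a cleanup sketch with the novaclient floating-ip API of that era (version-dependent; later releases moved this to Neutron):

# Release any floating IPs that are allocated but no longer associated
# with an instance ('nova' is an authenticated novaclient instance).
for fip in nova.floating_ips.list():
    if fip.instance_id is None:
        fip.delete()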

hadoop automatic installer failing

When installing hadoop there is still a dependency installation failure, as reported in the message below. It is just related to hadoop, and I worked around it with the commands I found in /home/cbuser/cbtool/scripts/hadoop/dependencies.txt:

  1. wget -N -P ~ http://www.carfab.com/apachesoftware/hadoop/common/hadoop-1.2.1/hadoop-1.2.1.tar.gz
  2. tar -xzf hadoop*.gz

Output message for hadoop automatic installer:

git clone https://github.com/ibmcb/cbtool.git
cd cbtool
./install --wks hadoop
...
(52) Checking "hadoop" version by executing the command ". .bashrc; ~/hadoop-1.2.1/bin/hadoop version | head -n 1 | cut -d ' ' -f 2"...
RESULT: NOT OK.
ACTION: Please install/configure "hadoop" by issuing the following command: "wget -N -P ~ URL;tar -xzf hadoop*.gz;"

(52) Installing "hadoop" by executing the command "wget -N -P ~ URL;tar -xzf hadoop*.gz;"...
RESULT: NOT OK. There was an error while installing "hadoop".

(53) Checking "hibench" version by executing the command "ls -la ~/HiBench"...
RESULT: NOT OK.
ACTION: Please install/configure "hibench" by issuing the following command: "cd ~; git clone https://github.com/ibmcb/HiBench.git; cd ~/HiBench; git checkout dev;"

(53) Installing "hibench" by executing the command "cd ~; git clone https://github.com/ibmcb/HiBench.git; cd ~/HiBench; git checkout dev;"...

RESULT: DONE OK.

There are 1 dependencies missing: hadoop

I'm working around this for now.

occasional ssh and rsync exceptions after VM deployment

Sometimes VM deployment hits the following exception, independently of whether the VM is deployed alone or as part of a virtual application:

status: Trying to establish network connectivity to vm_3 (cloud-assigned uuid 8644ff02-3f5e-419e-a48f-67b8668dce78), on IP address 172.18.16.16...
status: Trying to establish network connectivity to vm_1 (cloud-assigned uuid dabbbcff-1f10-44e6-868f-a3a8998da1c6), on IP address 172.18.16.17...
status: Trying to establish network connectivity to vm_2 (cloud-assigned uuid 4ad0054f-921b-406f-8e1a-0a179abfa84b), on IP address 172.18.16.18...
status: Trying to establish network connectivity to vm_4 (cloud-assigned uuid 8b3e1a34-8430-481d-bdc4-86ae24b54168), on IP address 172.18.16.19...
status: Sending a copy of the code tree to vm_1 (172.18.16.17)...
status: Sending a termination request for Instance "vm_1" (cloud-assigned uuid dabbbcff-1f10-44e6-868f-a3a8998da1c6)....
status: Sending a copy of the code tree to vm_4 (172.18.16.19)...
status: Sending a copy of the code tree to vm_3 (172.18.16.16)...
status: Sending a copy of the code tree to vm_2 (172.18.16.18)...
status: Sending a termination request for Instance "vm_2" (cloud-assigned uuid 4ad0054f-921b-406f-8e1a-0a179abfa84b)....
status: Sending a termination request for Instance "vm_3" (cloud-assigned uuid 8644ff02-3f5e-419e-a48f-67b8668dce78)....
status: Sending a termination request for Instance "vm_4" (cloud-assigned uuid 8b3e1a34-8430-481d-bdc4-86ae24b54168)....
AI object A9CEAE4C-2BC8-5F6F-AE1B-735AC607471A (named "ai_1") could not be attached to this experiment: AI pre-attachment operations failure: Parallel object operation failure: VM object 97D32262-EB76-52D3-AD28-95A8F92A40CF (named "vm_1") could not be attached to this experiment: VM post-attachment operations failure: Error while executing the command line "ssh -i /home/cbuser/cbtool/lib/auxiliary//../../credentials/id_rsa -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null cbuser@172.18.16.17 "mkdir -p ~/cbtool;echo '#OSKN-redis' > ~/cb_os_parameters.txt;echo '#OSHN-172.18.144.101' >> ~/cb_os_parameters.txt;echo '#OSPN-6379' >> ~/cb_os_parameters.txt;echo '#OSDN-0' >> ~/cb_os_parameters.txt;echo '#OSTO-240' >> ~/cb_os_parameters.txt;echo '#OSCN-MYOPENSTACK' >> ~/cb_os_parameters.txt;echo '#OSMO-controllable' >> ~/cb_os_parameters.txt;echo '#OSOI-TEST_cbuser:MYOPENSTACK' >> ~/cb_os_parameters.txt;echo '#VMUUID-97D32262-EB76-52D3-AD28-95A8F92A40CF' >> ~/cb_os_parameters.txt;sudo chown -R cbuser /cbtool";rsync -e "ssh -o StrictHostKeyChecking=no -l cbuser -i /home/cbuser/cbtool/lib/auxiliary//../../credentials/id_rsa" --exclude-from '/home/cbuser/cbtool/lib/auxiliary//../../exclude_list.txt' -az --delete --no-o --no-g --inplace -O /home/cbuser/cbtool/lib/auxiliary//../../* 172.18.16.17:/cbtool/" (returncode = 27228) :ssh: connect to host 172.18.16.17 port 22: Connection refused
ssh: connect to host 172.18.16.17 port 22: Connection refused
rsync: connection unexpectedly closed (0 bytes received so far) [sender]
rsync error: unexplained error (code 255) at io.c(600) [sender=3.0.6]
A rollback might be needed (only for VMs).

OpenStack dependency problem during installation on RHEL

I am trying to perform the initial installation as described here (https://github.com/ibmcb/cbtool/wiki/HOWTO:-Initial-Installation), but the installation of the OpenStack dependencies in Step 3 fails. When I try to run setup.py, I get the following output:

$ sudo python ~/cbtool//3rd_party/python-novaclient/setup.py install
Traceback (most recent call last):
File "/home/idcuser/cbtool//3rd_party/python-novaclient/setup.py", line 23, in <module>
d2to1=True
File "/usr/lib64/python2.6/distutils/core.py", line 113, in setup
_setup_distribution = dist = klass(attrs)
File "/usr/lib/python2.6/site-packages/setuptools/dist.py", line 225, in __init__
_Distribution.__init__(self,attrs)
File "/usr/lib64/python2.6/distutils/dist.py", line 270, in __init__
self.finalize_options()
File "/usr/lib/python2.6/site-packages/setuptools/dist.py", line 258, in finalize_options
ep.load()(self, ep.name, value)
File "/home/idcuser/cbtool/3rd_party/python-novaclient/d2to1-0.2.10-py2.6.egg/d2to1/core.py", line 42, in d2to1
'The setup.cfg file %s does not exist.' % path)
distutils.errors.DistutilsFileError: The setup.cfg file /home/idcuser/setup.cfg does not exist.

I use Red Hat Enterprise Linux 6.3 (64-bit) in an IBM SmartCloud Enterpise environment.

Can not create jumphost

I found two issues when attempting to run speccloud's cbtool in order to create a jump host in my cloud.

OpenStack status: Checking if a "Jump Host" (speccloud-cb-jumphost) VM is already present on VMC regionOne....
                   Creating a "Jump Host" (speccloud-cb-jumphost) VM on  VMC regionOne, connected to the networks "browbeat_private", and attaching a floating IP from pool "browbeat_public".
 status: vm_0 (cloud-assigned uuid NA) could not be created on OpenStack Cloud ""'NoneType' object has no attribute 'replace'.

and

 OpenStack status: Checking if a "Jump Host" (speccloud-cb-jumphost) VM is already present on VMC regionOne....
                   Creating a "Jump Host" (speccloud-cb-jumphost) VM on  VMC regionOne, connected to the networks "browbeat_private", and attaching a floating IP from pool "browbeat_public".
 status: Starting an instance on OpenStack, using the imageid "cb_nullworkload" (<Image: cb_nullworkload> ) and size "m1.small" (<Flavor: m1.small>), connected to networks "browbeat_private", on VMC "regionOne", un
der tenant "default" (ssh key is "speccloud_default_cbtool_rsa" and userdata is "none")
 status: Attempting to add a floating IP to vm_0...
 status: vm_0 (cloud-assigned uuid f5594473-f135-4a4e-a927-d0f4500d28e0) could not be created on OpenStack Cloud "" (Host "overcloud-compute-4.localdomain"): 'log_string'.
 (The VM creation will be rolled back)
 status: VMC "regionOne" did not pass the connection test." : Check the previous errors, fix it (using OpenStack's web GUI (horizon) or nova CLI

I was able to resolve this with this patch:

--- lib/clouds/osk_cloud_ops.py 2016-09-20 16:32:19.000000000 -0400
+++ /home/speccloud/osgcloud/cbtool/lib/clouds/osk_cloud_ops.py 2017-10-04 11:24:10.443724927 -0400
@@ -2493,7 +2493,7 @@
         '''
         TBD
         '''
-        
+        obj_attr_list["log_string"] = ""
         try :
 
             # Too many problems with neutronclient. Failures, API calls hanging, etc.
@@ -2601,7 +2601,8 @@
 
             _scheduler_hints = None
 
-            if "userdata" in obj_attr_list and str(obj_attr_list["userdata"]).lower() != "false" :
+            if "userdata" in obj_attr_list and obj_attr_list["userdata"] is not None and str(obj_attr_list["userdata"]).lower() != "false" :
+            #if "userdata" in obj_attr_list and str(obj_attr_list["userdata"]).lower() != "false" :
                 _userdata = obj_attr_list["userdata"].replace("# INSERT OPENVPN COMMAND", \
                                                               "openvpn --config /etc/openvpn/" + obj_attr_list["cloud_name"].upper() + "_client-cb-openvpn.conf --daemon --client")
                 _config_drive = True
