Comments (14)
Did the initial VM creation fail? It might be the case that the recreate code assumes we have successfully created the VM, as it wants to get the stored apply_spec, but that only gets stored when the VM has been successfully created.
from bosh.
Yes and no. The VM is created on OpenStack OK, and probably boots up and all that. What happens is that, after creating x VMs, VM x+1 is placed on a different compute node from the one the director VM is on. For some reason the network is set up such that VMs on different compute nodes can't ping each other. So the VM itself may have been created OK, but I get the error "Timed out pinging to b5d59cdc-a4e0-417d-977d-d5d1ca967cc8 after 600 seconds (00:10:50)" when doing a "bosh deploy". Make sense?
I also get this error, although I have only one compute node. Note that "pinging" here means sending a ping request to the agent on the new stemcell, which should respond with a pong.
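To make the ping/pong timeout concrete, here is a minimal sketch of a poll-until-pong loop with a deadline. It is not the director's actual code: the helper name `wait_for_pong`, its parameters, and the block that stands in for "send a ping to the agent" are all hypothetical.

```ruby
# Hypothetical helper (not BOSH code): call the given block repeatedly until it
# returns "pong" or the timeout (in seconds) has elapsed.
# The clock and sleeper are injectable so the loop can be exercised without waiting.
def wait_for_pong(timeout:, interval: 1.0,
                  clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                  sleeper: ->(seconds) { sleep(seconds) })
  deadline = clock.call + timeout
  loop do
    return true if yield == "pong"          # agent answered
    return false if clock.call >= deadline  # gave up, as in "Timed out pinging ..."
    sleeper.call(interval)
  end
end

# Stand-in for an agent that answers immediately.
puts wait_for_pong(timeout: 0.05, interval: 0.01) { "pong" }
```

With an agent that never answers, the same call returns false once the deadline passes, which corresponds to the "Timed out pinging" error seen during deploy.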
I have tried to investigate further. First I had to get into the freshly created stemcell:
SSH into unresponsive stemcell

Login via console:
    vcap/c1oudc0w
    su -
    c1oudc0w
Install nano:
    apt-get install nano
Enable ssh password login:
    nano /etc/ssh/sshd_config
    ChallengeResponseAuthentication yes
    /etc/init.d/ssh restart
Login to stemcell:
    bosh-bootstrap ssh
    ssh vcap@{private_ip}
    c1oudc0w
    su -
    c1oudc0w
Then I found the following messages in the logs under /var/vcap/bosh/log:
INFO: got user_data: {"registry"=>{"endpoint"=>"http://10.200.7.4:25777"}, "server"=>{"name"=>"vm-be441e38-442c-4b9c-a46e-a2ffd4f8a841"}, "dns"=>{"nameserver"=>["10.200.7.4", "10.200.7.1"]}}
2013-03-15_09:02:21.03547 #[26846] INFO: failed to load infrastructure settings: Cannot read settings for `http://10.200.7.4:25777/servers/vm-be441e38-442c-4b9c-a46e-a2ffd4f8a841/settings' from registry, got HTTP 500
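The URL in the failing log line appears to be assembled from the user_data fields logged just before it. A minimal sketch of that composition follows; the helper name `registry_settings_url` is hypothetical, and the path layout is inferred only from the two log lines above.

```ruby
# Hypothetical helper: rebuild the settings URL the agent requested, using the
# registry endpoint and server name found in the logged user_data.
def registry_settings_url(user_data)
  endpoint = user_data.fetch("registry").fetch("endpoint")
  server_name = user_data.fetch("server").fetch("name")
  "#{endpoint}/servers/#{server_name}/settings"
end

# user_data copied from the log line above.
user_data = {
  "registry" => { "endpoint" => "http://10.200.7.4:25777" },
  "server"   => { "name" => "vm-be441e38-442c-4b9c-a46e-a2ffd4f8a841" },
  "dns"      => { "nameserver" => ["10.200.7.4", "10.200.7.1"] }
}

puts registry_settings_url(user_data)
```

Requesting the resulting URL with any HTTP client from a host that can reach the registry is a quick way to check whether the HTTP 500 comes from the registry itself rather than from the agent.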
When visiting http://10.200.7.4:25777/servers/vm-be441e38-442c-4b9c-a46e-a2ffd4f8a841/settings
using sshuttle I got the following:
Here is the stack trace from the openstack_registry on the micro bosh node:
NoMethodError - undefined method `body' for #<Hash:0x00000002306310>:
/var/vcap/packages/openstack_registry/gem_home/gems/fog-1.9.0/lib/fog/openstack/compute.rb:338:in `rescue in request'
/var/vcap/packages/openstack_registry/gem_home/gems/fog-1.9.0/lib/fog/openstack/compute.rb:326:in `request'
/var/vcap/packages/openstack_registry/gem_home/gems/fog-1.9.0/lib/fog/openstack/requests/compute/list_servers_detail.rb:15:in `list_servers_detail'
/var/vcap/packages/openstack_registry/gem_home/gems/fog-1.9.0/lib/fog/openstack/models/compute/servers.rb:21:in `all'
/var/vcap/packages/openstack_registry/gem_home/gems/fog-1.9.0/lib/fog/core/collection.rb:141:in `lazy_load'
/var/vcap/packages/openstack_registry/gem_home/gems/fog-1.9.0/lib/fog/core/collection.rb:22:in `each'
/var/vcap/packages/openstack_registry/gem_home/gems/bosh_openstack_registry-1.5.0.pre2/lib/openstack_registry/server_manager.rb:67:in `find'
/var/vcap/packages/openstack_registry/gem_home/gems/bosh_openstack_registry-1.5.0.pre2/lib/openstack_registry/server_manager.rb:67:in `server_ips'
/var/vcap/packages/openstack_registry/gem_home/gems/bosh_openstack_registry-1.5.0.pre2/lib/openstack_registry/server_manager.rb:47:in `check_instance_ips'
/var/vcap/packages/openstack_registry/gem_home/gems/bosh_openstack_registry-1.5.0.pre2/lib/openstack_registry/server_manager.rb:34:in `read_settings'
/var/vcap/packages/openstack_registry/gem_home/gems/bosh_openstack_registry-1.5.0.pre2/lib/openstack_registry/api_controller.rb:22:in `block in <class:ApiController>'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/base.rb:1175:in `call'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/base.rb:1175:in `block in compile!'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/base.rb:739:in `instance_eval'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/base.rb:739:in `route_eval'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/base.rb:723:in `block (2 levels) in route!'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/base.rb:773:in `block in process_route'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/base.rb:770:in `catch'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/base.rb:770:in `process_route'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/base.rb:722:in `block in route!'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/base.rb:721:in `each'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/base.rb:721:in `route!'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/base.rb:857:in `dispatch!'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/base.rb:659:in `block in call!'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/base.rb:823:in `block in invoke'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/base.rb:823:in `catch'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/base.rb:823:in `invoke'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/base.rb:659:in `call!'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/base.rb:644:in `call'
/var/vcap/packages/openstack_registry/gem_home/gems/rack-1.5.2/lib/rack/head.rb:11:in `call'
/var/vcap/packages/openstack_registry/gem_home/gems/sinatra-1.2.8/lib/sinatra/showexceptions.rb:21:in `call'
/var/vcap/packages/openstack_registry/gem_home/gems/rack-1.5.2/lib/rack/builder.rb:138:in `call'
/var/vcap/packages/openstack_registry/gem_home/gems/rack-1.5.2/lib/rack/urlmap.rb:65:in `block in call'
/var/vcap/packages/openstack_registry/gem_home/gems/rack-1.5.2/lib/rack/urlmap.rb:50:in `each'
/var/vcap/packages/openstack_registry/gem_home/gems/rack-1.5.2/lib/rack/urlmap.rb:50:in `call'
/var/vcap/packages/openstack_registry/gem_home/gems/thin-1.5.0/lib/thin/connection.rb:81:in `block in pre_process'
/var/vcap/packages/openstack_registry/gem_home/gems/thin-1.5.0/lib/thin/connection.rb:79:in `catch'
/var/vcap/packages/openstack_registry/gem_home/gems/thin-1.5.0/lib/thin/connection.rb:79:in `pre_process'
/var/vcap/packages/openstack_registry/gem_home/gems/thin-1.5.0/lib/thin/connection.rb:54:in `process'
/var/vcap/packages/openstack_registry/gem_home/gems/thin-1.5.0/lib/thin/connection.rb:39:in `receive_data'
/var/vcap/packages/openstack_registry/gem_home/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run_machine'
/var/vcap/packages/openstack_registry/gem_home/gems/eventmachine-0.12.10/lib/eventmachine.rb:256:in `run'
/var/vcap/packages/openstack_registry/gem_home/gems/bosh_openstack_registry-1.5.0.pre2/lib/openstack_registry/runner.rb:23:in `run'
/var/vcap/packages/openstack_registry/gem_home/gems/bosh_openstack_registry-1.5.0.pre2/bin/openstack_registry:28:in `<top (required)>'
/var/vcap/packages/openstack_registry/bin/openstack_registry:23:in `load'
/var/vcap/packages/openstack_registry/bin/openstack_registry:23:in `<main>'
As can be seen in the stack trace, I'm using micro bosh and a stemcell with version 1.5.0.pre2.
@rkoster What version of OpenStack are you using? It seems OpenStack is not returning a servers hash when fog calls list_servers_detail. Can you please use fog to connect to your OpenStack environment and execute a simple openstack.servers call?
I have just redeployed the microbosh with the --update flag (same stemcell), and now the problem has gone away. But I have had this problem before, so I think it will resurface. It looks like the problem develops over time.
As an extreme step, I could remove unresponsive agents that were not getting removed through cloudcheck by deleting the deployment using "--force".
Extreme but common solution to deleting deployments currently :)
Dr Nic Williams
I encountered this problem again. I'm running OpenStack version Essex with micro bosh version 1.5.0.pre2 (release:346bb97d bosh:346bb97d).
NoMethodError - undefined method `body' for #<Hash:0x000000042f8f38>:
/var/vcap/packages/registry/gem_home/gems/fog-1.9.0/lib/fog/openstack/compute.rb:338:in `rescue in request'
/var/vcap/packages/registry/gem_home/gems/fog-1.9.0/lib/fog/openstack/compute.rb:326:in `request'
/var/vcap/packages/registry/gem_home/gems/fog-1.9.0/lib/fog/openstack/requests/compute/list_servers_detail.rb:15:in `list_servers_detail'
/var/vcap/packages/registry/gem_home/gems/fog-1.9.0/lib/fog/openstack/models/compute/servers.rb:21:in `all'
/var/vcap/packages/registry/gem_home/gems/fog-1.9.0/lib/fog/core/collection.rb:141:in `lazy_load'
/var/vcap/packages/registry/gem_home/gems/fog-1.9.0/lib/fog/core/collection.rb:22:in `each'
/var/vcap/packages/registry/gem_home/gems/bosh_registry-1.5.0.pre2/lib/bosh_registry/instance_manager/openstack.rb:39:in `find'
/var/vcap/packages/registry/gem_home/gems/bosh_registry-1.5.0.pre2/lib/bosh_registry/instance_manager/openstack.rb:39:in `instance_ips'
/var/vcap/packages/registry/gem_home/gems/bosh_registry-1.5.0.pre2/lib/bosh_registry/instance_manager.rb:45:in `check_instance_ips'
/var/vcap/packages/registry/gem_home/gems/bosh_registry-1.5.0.pre2/lib/bosh_registry/instance_manager.rb:29:in `read_settings'
/var/vcap/packages/registry/gem_home/gems/bosh_registry-1.5.0.pre2/lib/bosh_registry/api_controller.rb:22:in `block in
To debug, I created a /root/.fog file with the credentials from /var/vcap/jobs/registry/config/registry.yml.
Then I ran:
export PATH=/var/vcap/packages/ruby/bin:$PATH
export GEM_HOME=/var/vcap/packages/registry/gem_home
fog
openstack = Fog::Compute.new(:provider => "OpenStack")
openstack.servers
Which returns:
<Fog::Compute::OpenStack::Servers
filters={}
[
<Fog::Compute::OpenStack::Server
id="67163032-afb2-4d2f-ae11-fa28e4ed60b0",
instance_name=nil,
addresses={"service"=>[{"version"=>4, "addr"=>"10.200.7.5"}]},
flavor={"id"=>"1004", "links"=>[{"href"=>"http://{ip}/7276102beb584c66a11bb6b923a4d4f1/flavors/1004", "rel"=>"bookmark"}]},
host_id="eb4609af2272e22d5c94d87f7f66a0b9a5e960ec702b752aa8ae6aa3",
image={"id"=>"298505df-476a-45d4-a213-18d6d0224cb3", "links"=>[{"href"=>"http://{ip}/7276102beb584c66a11bb6b923a4d4f1/images/298505df-476a-45d4-a213-18d6d0224cb3", "rel"=>"bookmark"}]},
metadata= <Fog::Compute::OpenStack::Metadata
[]
>,
links=[{"href"=>"http://{ip}/v1.1/7276102beb584c66a11bb6b923a4d4f1/servers/67163032-afb2-4d2f-ae11-fa28e4ed60b0", "rel"=>"self"}, {"href"=>"http://{ip}/7276102beb584c66a11bb6b923a4d4f1/servers/67163032-afb2-4d2f-ae11-fa28e4ed60b0", "rel"=>"bookmark"}],
name="vm-5a7b25dc-8691-4101-ab9b-3697215b1f0d",
personality=nil,
progress=0,
accessIPv4="",
accessIPv6="",
availability_zone=nil,
user_data_encoded=nil,
state="ACTIVE",
created=2013-03-20 15:08:36 UTC,
updated=2013-03-20 15:09:46 UTC,
tenant_id="7276102beb584c66a11bb6b923a4d4f1",
user_id="834d9dc2fb7d4ec5b109103e6649d19d",
key_name="microbosh-openstack",
fault=nil,
os_dcf_disk_config="MANUAL",
os_ext_srv_attr_host=nil,
os_ext_srv_attr_hypervisor_hostname=nil,
os_ext_srv_attr_instance_name=nil,
os_ext_sts_power_state=1,
os_ext_sts_task_state=nil,
os_ext_sts_vm_state="active"
>
]
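For readers following the stack trace: the registry fails while looking up instance IPs, and the `addresses` attribute in the output above is where those IPs live. The following is a minimal sketch of pulling them out of that structure with plain hashes. It is not the registry's actual code, and the helper name `ips_from_addresses` is hypothetical.

```ruby
# Hypothetical helper: flatten an OpenStack-style addresses hash
# (network name => list of address entries) into a list of IP strings.
def ips_from_addresses(addresses)
  addresses.values.flatten.map { |entry| entry["addr"] }.compact
end

# addresses value copied from the fog output above.
addresses = { "service" => [{ "version" => 4, "addr" => "10.200.7.5" }] }

puts ips_from_addresses(addresses).inspect
```

Since fog returns a well-formed server list here, the data itself looks fine when queried from a fresh process, which is consistent with the problem being tied to the long-running registry process.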
When running ps aux | grep registry I get:
vcap 2511 0.4 1.0 134732 43292 ? S<l Mar18 12:09 /var/vcap/data/packages/ruby/3/bin/ruby /var/vcap/packages/registry/bin/bosh_registry -c /var/vcap/jobs/registry/config/registry.yml
When I kill this process all works fine again.
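As a small illustration of that workaround, here is a sketch that extracts the PID from the ps line above so the stuck process can be signalled. The helper name `pid_from_ps_line` is hypothetical, and the kill call is left commented out; what brings the registry back up afterwards is not stated in this thread.

```ruby
# Hypothetical helper: return the PID (second whitespace-separated column)
# from a single line of `ps aux` output.
def pid_from_ps_line(line)
  Integer(line.split.fetch(1))
end

# Line copied from the ps output above.
ps_line = "vcap 2511 0.4 1.0 134732 43292 ? S<l Mar18 12:09 " \
          "/var/vcap/data/packages/ruby/3/bin/ruby " \
          "/var/vcap/packages/registry/bin/bosh_registry " \
          "-c /var/vcap/jobs/registry/config/registry.yml"

pid = pid_from_ps_line(ps_line)
puts pid

# Uncomment to actually signal the process (assumes you are on the micro bosh
# node and have permission to do so):
# Process.kill("TERM", pid)
```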
Are you still having similar issues with newer versions of BOSH?
The problem of the registry getting stuck over time is still an issue in newer versions of BOSH.
But reading back through this whole issue, it looks like this was not the problem originally described.
I have no experience with running a multi-node OpenStack, so on second thought I think these should be two separate issues. The debugging information I posted above seems to be more related to #96. Sorry for hijacking this issue.
Regarding the original issue, it has been fixed. The BOSH cck command now allows you to delete the VM reference, so you can recreate it later. I'm closing this issue; if it happens again, feel free to reopen it.
@rkoster Regarding the registry issue, we'll follow it in #96.