Comments (19)
@cdutra can you take a look at this?
from bosh-vsphere-cpi-release.
@ronakbanka did you figure out if something is killing disks from iaas side?
@cppforlife I didn't find anything that is killing disks, but this started happening after our IaaS was upgraded from 5.5 to 6. I have attached a few logs from the vSphere side.
@ronakbanka how frequently does this happen? Almost immediately and consistently? We run lifecycle tests against a vSphere 6.0 env, and they seem to spin up multiple VMs per stemcell.
Lifecycle specs run the following tags: https://github.com/cloudfoundry-incubator/bosh-vsphere-cpi-release/blob/e51dcb3d77f28397a7529991228e166acab650e5/ci/pipeline.yml#L64
In the Spec Helper for integration tests we upload a stemcell once: https://github.com/cloudfoundry-incubator/bosh-vsphere-cpi-release/blob/e51dcb3d77f28397a7529991228e166acab650e5/src/vsphere_cpi/spec/integration/spec_helper.rb#L17-L21
All create_vm calls from the CPI will use that stemcell for the rest of the RSpec run. This should exercise creating multiple VMs from a single stemcell image.
@cppforlife this happens very frequently (every time a VM is deleted/recreated) in our setup. Do you guys test against multiple clusters?
Edit: apparently it also happens when targeting a single cluster.
@zaksoup The issue is not during the create_vm stage but when a VM is being deleted: vSphere deletes the .vmdk from the base stemcell's datastore directory, so the next create_vm fails because the base virtual disk no longer exists.
We are experiencing this in a vSphere 6 env but not in vSphere 5.5.
@ronakbanka @CAFxX just created a story to try this manually: https://www.pivotaltracker.com/story/show/143189927.
@cppforlife maybe underline in the story that this happens with a create-delete-create cycle (see the screenshots and repro instructions in #22 (comment))
Hey @ronakbanka @CAFxX @liuweichu, I am working with @cunnie to reproduce your issue but we're having trouble. Can you please provide us with a copy of your (redacted) manifest and cloud-config?
Here's what we've done so far:
- Deploy a bosh director using the latest bosh2 CLI to a vSphere 6.0 with this manifest (generated via bosh-deployment)
- Set the following cloud-config:

    azs:
    - name: z1
      cloud_properties:
        datacenters:
        - clusters: [scarlet-1: {}]

    vm_types:
    - name: default
      az: z1
      cloud_properties:
        cpu: 2
        ram: 1024
        disk: 3240

    disk_types:
    - name: default
      disk_size: 3000

    networks:
    - name: default
      type: manual
      subnets:
      - range: 10.xx.yy.0/24
        gateway: 10.xx.yy.1
        reserved: [10.xx.yy.2-10.xx.yy.249]
        az: z1
        dns: [8.8.8.8]
        cloud_properties:
          name: scarlet

    compilation:
      workers: 5
      reuse_compilation_vms: true
      az: z1
      vm_type: default
      network: default

- Upload the latest vSphere Ubuntu Stemcell
- Deploy the following manifest

    ---
    name: repro-issue

    releases: []

    stemcells:
    - alias: ubuntu
      os: ubuntu-trusty
      version: latest

    instance_groups:
    - name: dummy
      azs: [z1]
      instances: 2
      vm_type: default
      persistent_disk_type: default
      stemcell: ubuntu
      networks:
      - name: default
        default:
        - dns
        - gateway
      # - name: transformers
      jobs: []

    update:
      canaries: 1
      max_in_flight: 6
      serial: false
      canary_watch_time: 1000-60000
      update_watch_time: 1000-60000

- Delete the deployment
- Redeploy.
We were hoping that deleting the deployment and redeploying would exercise the create-delete-recreate cycle, but the second deployment completes successfully.
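The cycle described in the steps above can be written down as an explicit command sequence. The following is a minimal sketch in Python that only builds the command lines and does not run anything. The environment alias `vsphere`, the file names, and the helper name `repro_commands` are assumptions made for illustration; the subcommands are the usual bosh CLI v2 ones (`update-cloud-config`, `upload-stemcell`, `deploy`, `delete-deployment`).

```python
import shlex
from typing import List


def repro_commands(
    env: str,
    deployment: str,
    cloud_config: str,
    stemcell: str,
    manifest: str,
    binary: str = "bosh",
) -> List[List[str]]:
    """Build the create-delete-recreate command sequence (nothing is executed)."""
    # -n: non-interactive, -e: environment alias, -d: deployment name
    base = [binary, "-n", "-e", env]
    scoped = base + ["-d", deployment]
    return [
        base + ["update-cloud-config", cloud_config],
        base + ["upload-stemcell", stemcell],
        scoped + ["deploy", manifest],      # create
        scoped + ["delete-deployment"],     # delete
        scoped + ["deploy", manifest],      # recreate from the same stemcell
    ]


if __name__ == "__main__":
    # All argument values below are hypothetical placeholders.
    for cmd in repro_commands(
        "vsphere", "repro-issue", "cloud-config.yml", "stemcell.tgz", "manifest.yml"
    ):
        print(shlex.join(cmd))
```

Printing the commands rather than running them keeps the sketch safe to paste; the output can be reviewed and copied into a shell once the placeholders are replaced with real values.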
Here is what the stemcell looks like in our vSphere:
Thanks!
Zak + Brian
@zaksoup @cunnie I am using bosh create-env with an init configuration and the issue is still there, but I noticed one spec difference when I asked our IaaS guys for hostd logs from the two vSphere envs.
There is a difference in deviceChange.backing.parent: in vSphere 5.5 it is null, whereas in vSphere 6 it points to the stemcell vmdk, and the vmdk delete occurs in vSphere 6, where this property is set in the config.
The one below is from vSphere 5.5:
2017-04-05T06:58:39.746Z [3ED40B70 verbose 'Vcsvc.LLPM' opID=276a9a3b-16 user=vpxuser] LLPM reconfig (vim.vm.ConfigSpec) {
--> dynamicType = <unset>,
--> changeVersion = <unset>,
--> name = <unset>,
--> version = <unset>,
--> uuid = <unset>,
--> instanceUuid = <unset>,
--> npivWorldWideNameType = <unset>,
--> npivDesiredNodeWwns = <unset>,
--> npivDesiredPortWwns = <unset>,
--> npivTemporaryDisabled = <unset>,
--> npivOnNonRdmDisks = <unset>,
--> npivWorldWideNameOp = <unset>,
--> locationId = <unset>,
--> guestId = <unset>,
--> alternateGuestName = <unset>,
--> annotation = <unset>,
--> files = (vim.vm.FileInfo) {
--> dynamicType = <unset>,
--> vmPathName = "[]/vmfs/volumes/53e466e2-887e4d38-b813-6c3be5b6d3f0/vm-2e860675-de64-4607-935a-d92022c1455c/vm-2e860675-de64-4607-935a-d92022c1455c.vmx",
--> snapshotDirectory = <unset>,
--> suspendDirectory = <unset>,
--> logDirectory = <unset>,
--> ftMetadataDirectory = <unset>,
--> },
--> tools = (vim.vm.ToolsConfigInfo) null,
--> flags = (vim.vm.FlagInfo) null,
--> consolePreferences = (vim.vm.ConsolePreferences) null,
--> powerOpInfo = (vim.vm.DefaultPowerOpInfo) null,
--> numCPUs = <unset>,
--> numCoresPerSocket = <unset>,
--> memoryMB = <unset>,
--> memoryHotAddEnabled = <unset>,
--> cpuHotAddEnabled = <unset>,
--> cpuHotRemoveEnabled = <unset>,
--> virtualICH7MPresent = <unset>,
--> virtualSMCPresent = <unset>,
--> deviceChange = (vim.vm.device.VirtualDeviceSpec) [
--> (vim.vm.device.VirtualDeviceSpec) {
--> dynamicType = <unset>,
--> operation = "remove",
--> fileOperation = "destroy",
--> device = (vim.vm.device.VirtualDisk) {
--> dynamicType = <unset>,
--> key = 2000,
--> deviceInfo = (vim.Description) {
--> dynamicType = <unset>,
--> label = "Hard disk 1",
--> summary = "3,145,728 KB",
--> },
--> backing = (vim.vm.device.VirtualDisk.FlatVer2BackingInfo) {
--> dynamicType = <unset>,
--> fileName = "[]/vmfs/volumes/53e466e2-887e4d38-b813-6c3be5b6d3f0/vm-2e860675-de64-4607-935a-d92022c1455c/vm-2e860675-de64-4607-935a-d92022c1455c_1.vmdk",
--> datastore = 'vim.Datastore:datastore-11786',
--> backingObjectId = "111-2000-0",
--> diskMode = "persistent",
--> split = false,
--> writeThrough = false,
--> thinProvisioned = false,
--> eagerlyScrub = <unset>,
--> uuid = "6000C294-5167-070d-b38d-bfa7cadbb5e2",
--> contentId = "166b589dceb61a8900586e0dd12f5239",
--> changeId = <unset>,
--> parent = (vim.vm.device.VirtualDisk.FlatVer2BackingInfo) null,
--> deltaDiskFormat = "redoLogFormat",
--> digestEnabled = false,
--> deltaGrainSize = <unset>,
--> },
--> connectable = (vim.vm.device.VirtualDevice.ConnectInfo) null,
--> slotInfo = (vim.vm.device.VirtualDevice.BusSlotInfo) null,
--> controllerKey = 1000,
--> unitNumber = 0,
--> capacityInKB = 3145728,
--> capacityInBytes = 3221225472,
--> shares = (vim.SharesInfo) {
--> dynamicType = <unset>,
--> shares = 1000,
--> level = "normal",
--> },
--> storageIOAllocation = (vim.StorageResourceManager.IOAllocationInfo) {
--> dynamicType = <unset>,
--> limit = 500,
--> shares = (vim.SharesInfo) {
--> dynamicType = <unset>,
--> shares = 1000,
--> level = "normal",
--> },
--> reservation = 0,
--> },
--> diskObjectId = "111-2000",
--> vFlashCacheConfigInfo = (vim.vm.device.VirtualDisk.VFlashCacheConfigInfo) null,
--> },
--> },
The one below is from vSphere 6:
2017-03-10T07:16:21.716Z [2BF80B70 verbose 'Vcsvc.LLPM' opID=4155057b-bd user=vpxuser:VSPHERE.LOCAL\rpaas] LLPM reconfig (vim.vm.ConfigSpec) {
--> dynamicType = <unset>,
--> changeVersion = <unset>,
--> name = <unset>,
--> version = <unset>,
--> uuid = <unset>,
--> instanceUuid = <unset>,
--> npivWorldWideNameType = <unset>,
--> npivDesiredNodeWwns = <unset>,
--> npivDesiredPortWwns = <unset>,
--> npivTemporaryDisabled = <unset>,
--> npivOnNonRdmDisks = <unset>,
--> npivWorldWideNameOp = <unset>,
--> locationId = <unset>,
--> guestId = <unset>,
--> alternateGuestName = <unset>,
--> annotation = <unset>,
--> files = (vim.vm.FileInfo) {
--> dynamicType = <unset>,
--> vmPathName = "[]/vmfs/volumes/5507b83b-7346c089-d2ed-0090fa8aa610/vm-12344265-b3ee-4eb9-923c-8a455d975240/vm-12344265-b3ee-4eb9-923c-8a455d975240.vmx",
--> snapshotDirectory = <unset>,
--> suspendDirectory = <unset>,
--> logDirectory = <unset>,
--> ftMetadataDirectory = <unset>,
--> },
--> tools = (vim.vm.ToolsConfigInfo) null,
--> flags = (vim.vm.FlagInfo) null,
--> consolePreferences = (vim.vm.ConsolePreferences) null,
--> powerOpInfo = (vim.vm.DefaultPowerOpInfo) null,
--> numCPUs = <unset>,
--> numCoresPerSocket = <unset>,
--> memoryMB = <unset>,
--> memoryHotAddEnabled = <unset>,
--> cpuHotAddEnabled = <unset>,
--> cpuHotRemoveEnabled = <unset>,
--> virtualICH7MPresent = <unset>,
--> virtualSMCPresent = <unset>,
--> deviceChange = (vim.vm.device.VirtualDeviceSpec) [
--> (vim.vm.device.VirtualDeviceSpec) {
--> dynamicType = <unset>,
--> operation = "remove",
--> fileOperation = "destroy",
--> device = (vim.vm.device.VirtualDisk) {
--> dynamicType = <unset>,
--> key = 2000,
--> deviceInfo = (vim.Description) {
--> dynamicType = <unset>,
--> label = "Hard disk 1",
--> summary = "3,145,728 KB",
--> },
--> backing = (vim.vm.device.VirtualDisk.FlatVer2BackingInfo) {
--> dynamicType = <unset>,
--> fileName = "[]/vmfs/volumes/5507b83b-7346c089-d2ed-0090fa8aa610/vm-12344265-b3ee-4eb9-923c-8a455d975240/vm-12344265-b3ee-4eb9-923c-8a455d975240.vmdk",
--> datastore = 'vim.Datastore:datastore-1211',
--> backingObjectId = "725-2000-0",
--> diskMode = "persistent",
--> split = false,
--> writeThrough = false,
--> thinProvisioned = false,
--> eagerlyScrub = <unset>,
--> uuid = "6000C29b-233c-e226-9381-34bf9fc78093",
--> contentId = "b0e8b00c8d1601f4a87255eba20e04db",
--> changeId = <unset>,
--> parent = (vim.vm.device.VirtualDisk.FlatVer2BackingInfo) {
--> dynamicType = <unset>,
--> fileName = "[]/vmfs/volumes/5507b83b-7346c089-d2ed-0090fa8aa610/sc-af1c2748-6463-4cc7-8696-ddcb53b2270c/sc-af1c2748-6463-4cc7-8696-ddcb53b2270c.vmdk",
--> datastore = 'vim.Datastore:datastore-1211',
--> backingObjectId = "725-2000-1",
--> diskMode = "persistent",
--> split = <unset>,
--> writeThrough = <unset>,
--> thinProvisioned = false,
--> eagerlyScrub = <unset>,
--> uuid = "6000C29b-233c-e226-9381-34bf9fc78093",
--> contentId = "b0e8b00c8d1601f4a87255eba20e04db",
--> changeId = <unset>,
--> parent = (vim.vm.device.VirtualDisk.FlatVer2BackingInfo) null,
--> deltaDiskFormat = <unset>,
--> digestEnabled = <unset>,
--> deltaGrainSize = <unset>,
--> },
--> deltaDiskFormat = "redoLogFormat",
--> digestEnabled = false,
--> deltaGrainSize = <unset>,
--> },
--> connectable = (vim.vm.device.VirtualDevice.ConnectInfo) null,
--> slotInfo = (vim.vm.device.VirtualDevice.BusSlotInfo) null,
--> controllerKey = 1000,
--> unitNumber = 0,
--> capacityInKB = 3145728,
--> capacityInBytes = 3221225472,
--> shares = (vim.SharesInfo) {
--> dynamicType = <unset>,
--> shares = 1000,
--> level = "normal",
--> },
--> storageIOAllocation = (vim.StorageResourceManager.IOAllocationInfo) {
--> dynamicType = <unset>,
--> limit = -1,
--> shares = (vim.SharesInfo) {
--> dynamicType = <unset>,
--> shares = 1000,
--> level = "normal",
--> },
--> reservation = 0,
--> },
--> diskObjectId = "725-2000",
--> vFlashCacheConfigInfo = (vim.vm.device.VirtualDisk.VFlashCacheConfigInfo) null,
--> },
--> },
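To compare the two dumps without reading them line by line, a small script can pull out each disk's backing file and, where present, its parent. This is a minimal sketch, assuming the hostd output keeps the `key = value` layout shown above; the names `DiskBacking` and `disk_backings` are made up for this example and are not part of the CPI or of vSphere.

```python
import re
from typing import List, NamedTuple, Optional

FILE_RE = re.compile(r'fileName = "([^"]*)"')
BACKING_RE = re.compile(r'\bbacking = \(vim\.vm\.device\.VirtualDisk\.\w+\) \{')
PARENT_RE = re.compile(r'\bparent = \(vim\.vm\.device\.VirtualDisk\.\w+\) (null|\{)')


class DiskBacking(NamedTuple):
    file_name: str
    parent_file_name: Optional[str]  # None when the dump says "parent = ... null"


def disk_backings(log_text: str) -> List[DiskBacking]:
    """Return (disk file, parent file) pairs found in a ConfigSpec dump."""
    results: List[DiskBacking] = []
    state = "idle"
    disk_file = ""
    for line in log_text.splitlines():
        if state == "idle":
            # Wait for the start of a disk backing block.
            if BACKING_RE.search(line):
                state = "want_disk_file"
        elif state == "want_disk_file":
            match = FILE_RE.search(line)
            if match:
                disk_file = match.group(1)
                state = "want_parent"
        elif state == "want_parent":
            match = PARENT_RE.search(line)
            if match and match.group(1) == "null":
                results.append(DiskBacking(disk_file, None))
                state = "idle"
            elif match:
                # Parent block opened; its fileName follows.
                state = "want_parent_file"
        elif state == "want_parent_file":
            match = FILE_RE.search(line)
            if match:
                results.append(DiskBacking(disk_file, match.group(1)))
                state = "idle"
    return results
```

Run against the two dumps above, this should report no parent for the vSphere 5.5 disk and the `sc-…` stemcell vmdk as the parent of the vSphere 6 disk, which is the difference pointed out in this comment.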
@ronakbanka Thank you for the hostd output.
Could you please provide the bash steps you ran and a redacted manifest, state.json file, and cloud-config? Are you doing a bosh create-env followed by a bosh delete-env, and the problem appears when you run bosh create-env again? Or do you deploy a director with bosh create-env and then do a bosh deploy followed by a bosh delete-deployment and then a bosh deploy again?
We are unable to reproduce your problem. If you could provide an example set of manifests and an example bash script to demonstrate that would be incredibly useful. Very explicit and specific steps to reproduce will be helpful.
This problem is interesting and we want to make sure we help you out and fix any bugs that might be causing this quickly.
@ronakbanka Hey, we're still very interested in understanding what's causing your problem! Can you please provide the bash steps you ran and a redacted manifest, state.json file, and cloud-config?
@zaksoup @cunnie We just moved to a new cluster with the same version (6) and we are no longer experiencing this strange issue. There were some DB issues on our previous cluster; I was not able to find out whether they were related to the stemcell issue in any way.
You can close this now; we will see if it happens again.
@ronakbanka sounds good. please re-open this issue if you can reproduce it!
Hi, we're experiencing a similar problem after upgrading from vSphere 5.5 to vSphere 6.0; however, we're not creating a new environment. We're just scaling down some Diego VMs. Scaling up causes no issue, but scaling down throws this error:
Unknown CPI error 'Unknown' with message 'Invalid configuration for device '0'.' in 'delete_vm' CPI method.
Further retries of deploy attempts fail as well.
One workaround we found is to delete the VM via BOSH CCK and then deploy again.
Following is the last SOAP Request/Response in the task logs:
= Request

POST /sdk/vimService HTTP/1.1
SOAPAction: "urn:vim25/5.1"
Accept-Encoding: gzip, deflate
Content-Type: text/xml; charset=UTF-8
Cookie: vmware_soap_session=ca6r4ac1a908638470229bc37ca4a4d59fc055ca
User-Agent: HTTPClient/1.0 (2.7.1, ruby 2.2.4 (2015-12-16))
Accept: /
Date: Fri, 17 Nov 2017 09:16:49 GMT
Content-Length: 726
Host: vSphereHost

<soapenv:Envelope xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/" xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
soapenv:Body<RetrievePropertiesEx xmlns="urn:vim25"><_this type="PropertyCollector">propertyCollector</_this>Taskfalseinfo.progressinfo.stateinfo.resultinfo.error<obj type="Task">task-82346false</soapenv:Body>
</soapenv:Envelope>

= Response

HTTP/1.1 200 OK
Date: Fri, 17 Nov 2017 09:16:49 GMT
Cache-Control: no-cache
Connection: Keep-Alive
Content-Type: text/xml; charset=utf-8
X-Frame-Options: DENY
Content-Length: 853

<soapenv:Envelope xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
 xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
 xmlns:xsd="http://www.w3.org/2001/XMLSchema"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
soapenv:Body
<RetrievePropertiesExResponse xmlns="urn:vim25"><obj type="Task">task-82346info.error<val xsi:type="LocalizedMethodFault"><fault xsi:type="InvalidDeviceSpec">virtualDeviceSpec.operation0Invalid configuration for device '0'.info.state<val xsi:type="TaskInfoState">error
</soapenv:Body>
</soapenv:Envelope>
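When scanning many task logs for this failure, the two useful facts are the CPI method in the error line and the fault type in the SOAP response. A minimal sketch of pulling both out with regular expressions, assuming the error line and the `<fault xsi:type="...">` element look the way they do in this comment; the function names are hypothetical and this is not how the CPI itself reports errors.

```python
import re
from typing import Optional, Tuple

# Matches the director's error line as quoted in this comment.
CPI_ERROR_RE = re.compile(
    r"CPI error '(?P<name>[^']*)' with message '(?P<message>.*)' "
    r"in '(?P<method>\w+)' CPI method"
)
# Matches the fault element in the vSphere SOAP response.
FAULT_RE = re.compile(r'<fault xsi:type="([^"]+)">')


def parse_cpi_error(line: str) -> Optional[Tuple[str, str, str]]:
    """Return (error name, message, CPI method), or None if the line differs."""
    match = CPI_ERROR_RE.search(line)
    if match is None:
        return None
    return match.group("name"), match.group("message"), match.group("method")


def fault_type(response_text: str) -> Optional[str]:
    """Return the xsi:type of the first <fault> element, or None if absent."""
    match = FAULT_RE.search(response_text)
    return match.group(1) if match else None
```

The message group is greedy on purpose so that the quotes nested inside the message (around the device key) do not cut it short.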
This may be related to user permissions (i.e. logging in not as [email protected]).
@cunnie : Did you come across this issue recently? Which vCenter and which CPI version?
@EleanorRigby — I haven't run across this issue in 2 years.