outscale / osc-bsu-csi-driver
License: Apache License 2.0
As a user, I would like my CSI plugin to be stable when API calls are blocked due to throttling (either on the account or the whole platform).
This issue focuses on checking that all API calls are resilient to throttling errors and can retry using an exponential-backoff algorithm.
/kind bug
What happened?
The plugin can handle 39 volumes per node when the devices are not SCSI.
With SCSI volumes, however, we get the following error after 36 volumes per node:
2023-05-24T04:02:26Z : Warning : FailedMount : MountVolume.MountDevice failed for volume "pvc-5b47b495-a962-47cb-9deb-4962fb6f7a3b" : rpc error: code = Internal desc = Failed to find device path /dev/xvdaa. scsi path "/dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_sda" not found
What you expected to happen?
To be able to handle 39 SCSI volumes per node.
What happened?
DeleteVolume and DeleteSnapshot are not idempotent, and the plugin gets stuck in an infinite deletion loop if the resource no longer exists.
What you expected to happen?
All functions should be idempotent.
How to reproduce it (as minimally and precisely as possible)?
Create a disk/snapshot and destroy the disk manually.
Environment
Is your feature request related to a problem?/Why is this needed
osc-sdk-go v1 is used but has some limitations due to nil handling during API queries.
osc-sdk-go v2 provides new facilities around this issue.
/feature
Describe the solution you'd like in detail
Adapt the CSI code to switch to osc-sdk-go v2.
/kind bug
What happened?
When we request a disk with IOPS that exceed Outscale's limit (https://docs.outscale.com/en/userguide/About-Volumes.html#_volume_types_and_iops), currently 300 IOPS per GiB, the plugin does not reduce the IOPS.
What you expected to happen?
The plugin currently reduces the maximum IOPS if it exceeds 13000; it would be good to also check the ratio and reduce the IOPS in that case.
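The requested behavior can be sketched as clamping against both limits. The constants below come from this issue (13000 absolute maximum, 300 IOPS per GiB ratio); the function name is illustrative, not the driver's actual API.

```go
package main

import "fmt"

const (
	maxIOPS       = 13000 // Outscale's absolute io1 maximum (per this issue)
	maxIOPSPerGiB = 300   // Outscale's IOPS/GiB ratio limit (per this issue)
)

// capIOPS clamps the requested IOPS to the ratio limit first, then to the
// absolute maximum, so both constraints requested in this issue are enforced.
func capIOPS(requested, sizeGiB int) int {
	if ratioMax := sizeGiB * maxIOPSPerGiB; requested > ratioMax {
		requested = ratioMax
	}
	if requested > maxIOPS {
		requested = maxIOPS
	}
	return requested
}

func main() {
	// A 100 GiB volume at iopsPerGB 300 asks for 30000 IOPS; the absolute
	// maximum applies. A small 5 GiB volume is bound by the ratio instead.
	fmt.Println(capIOPS(30000, 100)) // 13000
	fmt.Println(capIOPS(2000, 5))    // 1500
}
```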
/kind bug
What happened?
During CreateSnapshot, CSI will return OK after calling CreateSnapshot (IaaS).
Once CreateSnapshot (CSI) returns OK, the CO considers that the Snapshot is "cut" in the CSI-specification sense (meaning the Snapshot's content cannot be altered by future writes).
Once the "cut" is done, the CO may "thaw" the application, which may continue writing to the Volume.
However, unlike EC2, where "the point-in-time snapshot is created immediately", Outscale's Snapshot is only cut once the "completed" state is reached on the IaaS side:
The data contained in a snapshot is considered cut when the snapshot is in the completed state.
This behavior could lead CO to prematurely resume writes on Volume and alter Snapshot content.
What you expected to happen?
As described in CSI spec:
CreateSnapshot is a synchronous call and it MUST block until the snapshot is cut
In the current Outscale API version, CreateSnapshot (CSI) should block until Snapshot (IaaS) state reached "completed".
How to reproduce it (as minimally and precisely as possible)?
Check the creation_time of the Snapshot.
Anything else we need to know?:
Note that ready_to_use still switches to true once a Snapshot (IaaS) moves to the "completed" state, as Outscale has no post-processing effort (unlike EC2).
🔥IMPLEMENTATION RISK🔥
Waiting for the state to reach "completed" could easily make CSI calls time out, which is fine, as the CO will call CreateSnapshot again and again.
However, if each pending call is not stopped once the timeout is reached, each call may keep performing ReadSnapshots (IaaS) in an infinite loop and cause those issues.
The fix should consider exiting with an error instead of calling ReadSnapshots (IaaS) forever (after a fixed allocated time, after the first read, ...).
Environment
Currently, the plugin is tested against sanity_test v2.2.0, which is meant for CSI drivers that satisfy the v1.1.0 CSI spec.
As of version v0.15.0, the Outscale driver satisfies the v1.5.0 CSI spec, so it would be a good idea to upgrade the sanity test package to v4.3.0.
/kind bug
What happened?
-k8s-tag-cluster-id doesn't exist in ebs-plugin, but it is possible to set it in the Helm values (/osc-bsu-csi-driver/values.yaml):
k8sTagClusterId: ""
kubectl logs -f ebs-csi-controller-85f44d455c-4fbjz -n kube-system -c ebs-plugin
flag provided but not defined: -k8s-tag-cluster-id
Usage of aws-ebs-csi-driver:
-add_dir_header
If true, adds the file directory to the header of the log messages
What you expected to happen?
I don't know if the feature has been removed or changed.
How to reproduce it (as minimally and precisely as possible)?
helm install osc-bsu-csi-driver ./osc-bsu-csi-driver --namespace kube-system --set enableVolumeScheduling=true --set enableVolumeResizing=true --set enableVolumeSnapshot=true --set region=$OSC_REGION --set image.repository=$IMAGE_NAME --set image.tag=$IMAGE_TAG --set k8sTagClusterId="test"
Anything else we need to know?:
Environment
Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.1", GitCommit:"632ed300f2c34f6d6d15ca4cef3d3c7073412212", GitTreeState:"clean", BuildDate:"2021-08-19T15:45:37Z", GoVersion:"go1.16.7", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.5", GitCommit:"5c99e2ac2ff9a3c549d9ca665e7bc05a3e18f07e", GitTreeState:"clean", BuildDate:"2021-12-16T08:32:32Z", GoVersion:"go1.16.12", Compiler:"gc", Platform:"linux/amd64"}
Tracking issue for:
Is your feature request related to a problem?/Why is this needed
Some deployments need to use a specific client certificate in order to make API calls.
For instance, users may want to configure their API Access Rules with certificates.
/feature
The feature would consist of passing an optional client certificate to the CSI driver through environment variables.
In a Kubernetes deployment, this certificate should be stored inside a Secret.
Two variables can be configured; either both should be set, or neither. When both are set, the CSI driver will use client-certificate-data and client-key-data to establish the connection to the Outscale API.
In order not to miss new arguments, it would be interesting to integrate helm-docs in the release process.
Right now we need to specify topology.bsu.outscale.com/zone to get the right AZ.
Could we support the standard topology.kubernetes.io/zone?
/kind bug
What happened?
The driver's maximum IOPS (20000) is higher than the 13000 allowed by Outscale.
As a result, the IOPS are trimmed down to 20000 and the request fails. The API access logs show the request created with iopsPerGB: 300 and a PVC of 100Gi:
"Logs": [
{
"ResponseStatusCode": 400,
"ResponseSize": 143,
"QueryPayloadRaw": "{\"Iops\":20000,\"Size\":100,\"SubregionName\":\"cloudgouv-eu-west-1a\",\"VolumeType\":\"io1\"}\n",
"AccountId": "XXX",
"QueryUserAgent": "osc-bsu-csi-driver/v1.0.0",
"CallDuration": 34,
"RequestId": "0b5b0926-ad14-4809-8afa-e18350e43de5",
"QueryApiVersion": "1.22",
"QueryIpAddress": "1.2.3.4",
"QueryApiName": "oapi",
"QueryPayloadSize": 84,
"QueryCallName": "CreateVolume",
"QueryAccessKey": "XXX",
"QueryHeaderSize": 351,
"QueryDate": "2022-10-04T10:11:55.587546Z",
"QueryHeaderRaw": "Host: api.cloudgouv-eu-west-1.outscale.com\\nAccept: application/json\\nConnection: close\\nUser-Agent: osc-bsu-csi-driver/v1.0.0\\nX-Amz-Date: 20221004T101155Z\\nX-SSL-CERT: -----BEGIN CERTIFICATE----------END CERTIFICATE-----\\nContent-Type: application/json\\nAuthorization: *****\\nContent-Length: 84\\nAccept-Encoding: gzip\\nX-Forwarded-For: 1.2.3.4"
},
What you expected to happen?
Controller to scale down IOPS to maximum allowed value and create a volume
How to reproduce it (as minimally and precisely as possible)?
allowVolumeExpansion: true
allowedTopologies:
- matchLabelExpressions:
- key: topology.bsu.csi.outscale.com/zone
values:
- cloudgouv-eu-west-1a
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: osc-io1-big
parameters:
iopsPerGB: "300"
type: io1
provisioner: bsu.csi.outscale.com
reclaimPolicy: Delete
volumeBindingMode: Immediate
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: block-claim
spec:
accessModes:
- ReadWriteOnce
volumeMode: Block
storageClassName: osc-io1-big
resources:
requests:
storage: 100Gi
Anything else we need to know?:
Environment
Kubernetes version (use kubectl version):
Server Version: v1.23.10+rke2r1
Driver version:
v1.0.0
/kind bug
What happened?
The CSI specification imposes that all functions must be idempotent.
Issue #130 shows that NodePublishVolume was not idempotent.
What to do?
We need to check the idempotency of all functions.
Is your feature request related to a problem?/Why is this needed
Some labels are set with osc.com; they should be changed to outscale.com, as osc.com does not exist.
/feature
Describe the solution you'd like in detail
Adapt the label-migration branch to reflect this change.
Hello,
We cannot create the io1 type of storage from our OpenShift cluster. We have followed your template to write our YAML file; here it is:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: slow
provisioner: bsu.csi.outscale.com
parameters:
type: io1
iopsPerGB: "10"
fsType: ext4
and here is the error we are receiving:
I0912 12:49:52.488384 1 connection.go:187] GRPC error: rpc error: code = Internal desc = Could not create volume "pvc-36278801-f9a4-436c-a36a-a54f298ccc3a": could not create volume in Outscale: 400 Bad Request
I0912 12:49:52.488419 1 controller.go:767] CreateVolume failed, supports topology = true, node selected false => may reschedule = false => state = Finished: rpc error: code = Internal desc = Could not create volume "pvc-36278801-f9a4-436c-a36a-a54f298ccc3a": could not create volume in Outscale: 400 Bad Request
I0912 12:49:52.488452 1 controller.go:1074] Final error received, removing PVC 36278801-f9a4-436c-a36a-a54f298ccc3a from claims in progress
W0912 12:49:52.488461 1 controller.go:933] Retrying syncing claim "36278801-f9a4-436c-a36a-a54f298ccc3a", failure 3
E0912 12:49:52.488475 1 controller.go:956] error syncing claim "36278801-f9a4-436c-a36a-a54f298ccc3a": failed to provision volume with StorageClass "outscale-bsu-io1": rpc error: code = Internal desc = Could not create volume "pvc-36278801-f9a4-436c-a36a-a54f298ccc3a": could not create volume in Outscale: 400 Bad Request
I0912 12:49:52.488495 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"kube-system", Name:"test", UID:"36278801-f9a4-436c-a36a-a54f298ccc3a", APIVersion:"v1", ResourceVersion:"424882", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "outscale-bsu-io1": rpc error: code = Internal desc = Could not create volume "pvc-36278801-f9a4-436c-a36a-a54f298ccc3a": could not create volume in Outscale: 400 Bad Request
I0912 12:49:52.488673 1 request.go:1181] Request Body: {"count":4,"lastTimestamp":"2023-09-12T12:49:52Z","message":"failed to provision volume with StorageClass "outscale-bsu-io1": rpc error: code = Internal desc = Could not create volume "pvc-36278801-f9a4-436c-a36a-a54f298ccc3a": could not create volume in Outscale: 400 Bad Request"}
I0912 12:49:52.488746 1 round_trippers.go:435] curl -v -XPATCH -H "Accept: application/json, */*" -H "Content-Type: application/strategic-merge-patch+json" -H "User-Agent: csi-provisioner/v0.0.0 (linux/amd64) kubernetes/$Format" -H "Authorization: Bearer " 'https://172.30.0.1:443/api/v1/namespaces/kube-system/events/test.178427ad569f843a'
I0912 12:49:52.496620 1 round_trippers.go:454] PATCH https://172.30.0.1:443/api/v1/namespaces/kube-system/events/test.178427ad569f843a 200 OK in 7 milliseconds
We also get an error in the graphical interface.
Can you please help us find what the problem is? I am ready to take a call whenever you want.
Sincerely,
Jordan
/kind bug
What happened?
It seems that when we use xfs as the fstype, allowVolumeExpansion cannot work, as the Docker image does not contain the xfs_growfs binary.
I encounter this error:
MountVolume.Setup failed while expanding volume for volume "pvc-xxxxxxxxxxxxxxxxx" : Expander.NodeExpand failed to expand the volume : rpc error: code = Internal desc = Could not resize volume "vol-xxxxxxxx" ("/dev/xvdf"): resize of device /var/lib/kubelet/pods/xxxxxxxxxxxxx/volumes/kubernetes.io~csi/pvc-xxxxxxxxxxxxxxxxx/mount failed: executable file not found in $PATH. xfs_growfs output:
Checking the Dockerfile here (https://github.com/outscale/osc-bsu-csi-driver/blob/v1.2.3/Dockerfile#L26) and according to the Alpine packages, you also need xfsprogs-extra to be able to use xfs_growfs:
https://alpine.pkgs.org/3.16/alpine-main-x86_64/xfsprogs-extra-5.16.0-r1.apk.html
Maybe you could also update the Alpine release while you are at it :-) https://github.com/outscale/osc-bsu-csi-driver/blob/v1.2.3/Dockerfile#L22
Environment
Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.7", GitCommit:"84e1fc493a47446df2e155e70fca768d2653a398", GitTreeState:"clean", BuildDate:"2023-07-19T12:23:27Z", GoVersion:"go1.20.6", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.7+rke2r1", GitCommit:"84e1fc493a47446df2e155e70fca768d2653a398", GitTreeState:"clean", BuildDate:"2023-07-19T20:19:16Z", GoVersion:"go1.20.6 X:boringcrypto", Compiler:"gc", Platform:"linux/amd64"}
Thanks a lot for your help
/kind bug
What happened?
During the installation of the driver, the ebs-csi-controller is in CrashLoopBackOff on Kubernetes v1.23.5.
I checked the pod logs and found this trace:
I0707 15:24:13.882407 1 driver.go:63] Driver: ebs.csi.aws.com Version: v0.0.15
panic: could not get metadata from OSC: EC2 instance metadata is not available
goroutine 1 [running]:
github.com/outscale-dev/osc-bsu-csi-driver/pkg/driver.newControllerService(0xc0001de960)
/build/pkg/driver/controller.go:83 +0x85
github.com/outscale-dev/osc-bsu-csi-driver/pkg/driver.NewDriver({0xc000517f58, 0x3, 0x40cef4})
/build/pkg/driver/driver.go:83 +0x579
main.main()
/build/cmd/main.go:31 +0x18f
I've checked my env vars and all seem to be OK (the osc-csi-bsu secret is available in the namespace):
[...]
env:
- name: CSI_ENDPOINT
value: 'unix:///var/lib/csi/sockets/pluginproxy/csi.sock'
- name: OSC_ACCESS_KEY
valueFrom:
secretKeyRef:
name: osc-csi-bsu
key: access_key
optional: true
- name: OSC_SECRET_KEY
valueFrom:
secretKeyRef:
name: osc-csi-bsu
key: secret_key
optional: true
- name: AWS_REGION
value: eu-west-2
[...]
I do not use any Network Policies in my namespace (I've seen some threads saying that can be related).
My cluster has access to the internet.
What you expected to happen?
EBS-CSI works and allows creating PVs/PVCs.
How to reproduce it (as minimally and precisely as possible)?
On fresh Kubernetes, follow steps of this documentation:
https://github.com/outscale-dev/osc-bsu-csi-driver/blob/OSC-MIGRATION/docs/deploy.md
Anything else we need to know?:
Environment
Tracking issue for:
Why is this needed
The sidecar csi-snapshotter uses deprecated objects, so multiple warnings are thrown:
W0119 10:55:42.256252 1 warnings.go:67] snapshot.storage.k8s.io/v1beta1 VolumeSnapshotClass is deprecated; use snapshot.storage.k8s.io/v1 VolumeSnapshotClass
W0119 11:01:41.256348 1 warnings.go:67] snapshot.storage.k8s.io/v1beta1 VolumeSnapshotContent is deprecated; use snapshot.storage.k8s.io/v1 VolumeSnapshotContent
/feature
Describe the solution you'd like in detail
Update all sidecars to get latest features
Is your feature request related to a problem?/Why is this needed
As a user concerned by data storage security, I would like to be able to encrypt my data (either through block or mount) to prevent some cases of unauthorized reads from BSU volumes.
/feature
Describe the solution you'd like in detail
Be able to provide a secret at StorageClass declaration, similarly to what Portworx has done.
The secret can be passed to the CSI driver thanks to secrets requirements.
This issue focuses on encryption at the StorageClass level; a future enhancement would be to allow a specific secret definition at the PersistentVolumeClaim level.
What would you like to be added:
Publish deployment using Kustomize
Why is this needed:
Native deployment engine
/kind feature
/kind bug
What happened?
We experienced, one time, a pod stuck in the Terminating state; the volume was not unmounted properly.
Idea
Implement a stress test that mounts and unmounts a volume multiple times to see if we can reproduce it.
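The stress test idea could be sketched as a loop over an abstract mount/unmount pair that stops at the first failure. The `mounter` interface is a stand-in for the driver's NodeStageVolume/NodeUnstageVolume path; the fake implementation below is only for demonstrating the loop.

```go
package main

import "fmt"

// mounter abstracts the mount/unmount pair exercised by the stress test
// (a stand-in for the driver's node staging path).
type mounter interface {
	Mount(device, target string) error
	Unmount(target string) error
}

// stress mounts and unmounts the same volume n times, stopping at the first
// failure so the failing iteration can be inspected.
func stress(m mounter, device, target string, n int) error {
	for i := 0; i < n; i++ {
		if err := m.Mount(device, target); err != nil {
			return fmt.Errorf("mount failed at iteration %d: %w", i, err)
		}
		if err := m.Unmount(target); err != nil {
			return fmt.Errorf("unmount failed at iteration %d: %w", i, err)
		}
	}
	return nil
}

// fakeMounter counts calls; a real test would wire the CSI node service here.
type fakeMounter struct{ mounts, unmounts int }

func (f *fakeMounter) Mount(device, target string) error { f.mounts++; return nil }
func (f *fakeMounter) Unmount(target string) error       { f.unmounts++; return nil }

func main() {
	f := &fakeMounter{}
	err := stress(f, "/dev/xvdb", "/mnt/test", 1000)
	fmt.Println(err, f.mounts, f.unmounts) // <nil> 1000 1000
}
```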
Hello,
Using the latest version available (v1.2.0), I'd like to be sure that I can't expand a PVC while I have a Pod running on it.
If that is correct, is it something that could be implemented someday?
Thanks for your help
The Helm chart and the plugin might sometimes need to live separately.
We can take the example of AWS EBS plugin: https://github.com/kubernetes-sigs/aws-ebs-csi-driver/releases
Is your feature request related to a problem?/Why is this needed
As it is not a best practice to use root AK/SK with this tool, we should provide a sample Policy document that could be attached to a less-privileged EIM user.
/feature
Describe the solution you'd like in detail
A sample EIM policy should be present in the README, along with the recommendation to bind this policy to an EIM user dedicated for the BSU driver.
Describe alternatives you've considered
Precisely listing the privileges required for managing BSUs, but that may be over-complicated for newcomers. A sample EIM policy is a good starting point.
Additional context
Sample policy that could be given to users:
{
"Statement": [
{
"Action": [
"api:ReadVms",
"api:ReadVmsState",
"api:ReadSnapshots",
"api:CreateSnapshot",
"api:DeleteSnapshot",
"api:CreateVolume",
"api:ReadVolumes",
"api:LinkVolume",
"api:UnlinkVolume",
"api:UpdateVolume",
"api:DeleteVolume"
],
"Resource": [
"*"
],
"Effect": "Allow"
}
]
}
Why is this needed
CRDs and the snapshot controller are the cluster's responsibility to maintain, as specified in kubernetes-csi.
/feature
Describe the solution you'd like in detail
Remove it from the helm chart
/kind bug
What happened?
The disk is not allocated.
What you expected to happen?
The disk should be allocated.
How to reproduce it (as minimally and precisely as possible)?
try to create a disk on a cluster in cloudgouv-eu-west-1 region
Anything else we need to know?:
The driver gets a "not Authorized" error with a reference to the eu-west-2 region.
Environment
Kubernetes version (use kubectl version): 1.23.14
Is your feature request related to a problem?/Why is this needed
The main branch currently uses "ebs"/"aws" labels in the provider description/definition.
The label-migration branch adapts this to the more accurate "bsu"/"outscale" labels, but introduces a breaking change.
A solution would be to explain how to migrate from the old CSI plugin to the new one.
/documentation
Describe the solution you'd like in detail
A safe migration would be to:
What would you like to be added:
We would probably need to update to go 1.17 as mentioned in:
This would also be a good opportunity to update go dependencies.
Why is this needed:
Keep up-to-date with ecosystem dependencies
/kind feature
Tracking issue for:
/kind bug
What happened?
I encountered a bug following a permission problem. I had a wrongly configured EIM profile for OAPI (following an upgrade from the FCU API, I was missing the actual api: permissions). The BSU CSI driver was then unable to see volumes for a time.
However, it tried to manage instances and volume attachments during this time, and was expectedly unable to do so correctly.
When re-establishing correct credentials, I had a buggy VolumeAttachment resource, which was in an inconsistent state:
status:
attachError:
message: 'rpc error: code = NotFound desc = Instance "i-xxxxxxxx" not found'
time: "2021-02-23T17:37:57Z"
attached: false
detachError:
message: 'rpc error: code = Internal desc = Could not detach volume "vol-xxxxxxxx"
from node "i-xxxxxxxx": could not detach volume "vol-xxxxxxxx" from node "i-xxxxxxxx":
409 Conflict'
time: "2021-02-23T18:19:28Z"
At IaaS level, the BSU was then detached.
From the logs, the BSU CSI driver tried to forcibly detach the device, presumably to make certain that it was not attached in error before re-creating the attachment.
The BSU driver then goes along this code path: https://github.com/outscale-dev/osc-bsu-csi-driver/blob/OSC-MIGRATION/pkg/cloud/cloud.go#L525 and tries to detach an already detached volume. It fails, and the driver then enters a loop of failed Detach.
What you expected to happen?
Even if the initial situation occurred because of a configuration error, the driver should converge successfully after the correct configuration is re-established.
How to reproduce it (as minimally and precisely as possible)?
To be honest, I do not have a clear scenario to make this happen. Perhaps manually detaching a volume at the IaaS level could trigger the behavior.
Anything else we need to know?:
I suggest that when the IaaS returns that the disk is not attached to any instance, we should not try the UnlinkVolume call and should consider the detach already done. I think it could be done by adding a successful return after https://github.com/outscale-dev/osc-bsu-csi-driver/blob/27ea8b5107143776b0cca0479e861a45d5ac8564/pkg/cloud/cloud.go#L526.
Environment
Kubernetes version (use kubectl version): 1.18.16
Hello,
We are creating storage classes (io1, gp2, and standard) from our OpenShift cluster.
Unfortunately, right now we are only able to create them through the public endpoint.
Can you provide us with a private endpoint that allows creating those storage classes without going through the internet?
Sincerely,
Jordan
What would you like to be added:
Arrange the documentation to split it between topics:
docs/README.md: general explanation and pointers to other documentation (internal or external)
docs/deploying.md: how to install, configure, and remove the CCM
docs/testing.md: how to test the CCM
docs/contributing.md: general information around contributions, like how to release
Why is this needed:
Make life easier for new users/developers.
/kind documentation
Why is this needed
In order to distinguish API calls from multiple services, the User-Agent header is used to detect the caller.
/feature
Describe the solution you'd like in detail
Set the User-Agent header in the CSI driver's API calls.
Is your feature request related to a problem?/Why is this needed
Various containers are currently set to verbosity level 10. That may be practical for debugging; however, it is very verbose on the one hand, and on the other the debug output contains potentially sensitive information that should not be enabled except for actual troubleshooting.
/feature
Describe the solution you'd like in detail
Verbosity level should be configurable at deployment time, and not be 10 by default.
Describe alternatives you've considered
Additional context
Info about Kubernetes debug levels:
https://kubernetes.io/docs/reference/kubectl/cheatsheet/#kubectl-output-verbosity-and-debugging
Outscale documentation has moved to docs.outscale.com, and a lot of links still point to wiki.outscale.net.
This needs to be fixed.
/kind bug
Hello,
using version v1.1.0, the symlink from https://github.com/outscale-dev/osc-bsu-csi-driver/blob/v1.1.0/osc-bsu-csi-driver/CHANGELOG.md to https://github.com/outscale-dev/osc-bsu-csi-driver/blob/v1.1.0/CHANGELOG-1.X.md does not seem to be supported by Helm.
Here is the output using simple Helm commands:
gbellongervais@me:~/outscale/osc-k8s-rke-cluster$ helm plugin install https://github.com/aslafy-z/helm-git --version 0.14.0
Installed plugin: helm-git
gbellongervais@me:~/outscale/osc-k8s-rke-cluster$ helm repo add osc git+https://www.github.com/outscale-dev/osc-bsu-csi-driver/@osc-bsu-csi-driver?ref=v1.1.0
Error: error evaluating symlink /tmp/helm-git.M9JJkZ/osc-bsu-csi-driver/CHANGELOG.md: lstat /tmp/helm-git.M9JJkZ/CHANGELOG-1.X.md: no such file or directory
Error: looks like "git+https://www.github.com/outscale-dev/osc-bsu-csi-driver/@osc-bsu-csi-driver?ref=v1.1.0" is not a valid chart repository or cannot be reached: plugin "helm-git" exited with error
I have the same issue using https://github.com/outscale-dev/osc-k8s-rke-cluster with updated addon versions:
gbellongervais@me:~/outscale/osc-k8s-rke-cluster$ ANSIBLE_CONFIG=ansible.cfg ansible-playbook addons/csi/playbook.yaml
PLAY [Setup OSC-CSI] ***********************************************************************************************************************************************************************************************************************
TASK [Gathering Facts] *********************************************************************************************************************************************************************************************************************
ok: [localhost]
TASK [download helm] ***********************************************************************************************************************************************************************************************************************
changed: [localhost]
TASK [Install Helm-git] ********************************************************************************************************************************************************************************************************************
changed: [localhost]
TASK [Add Outscale repository] *************************************************************************************************************************************************************************************************************
fatal: [localhost]: FAILED! => {"changed": true, "cmd": ["$HELM_BIN", "repo", "add", "osc", "git+https://www.github.com/outscale-dev/osc-bsu-csi-driver/@osc-bsu-csi-driver?ref=v1.1.0"], "delta": "0:00:01.579029", "end": "2022-12-15 18:15:26.963329", "msg": "non-zero return code", "rc": 1, "start": "2022-12-15 18:15:25.384300", "stderr": "Error: error evaluating symlink /tmp/helm-git.PJGIwI/osc-bsu-csi-driver/CHANGELOG.md: lstat /tmp/helm-git.PJGIwI/CHANGELOG-1.X.md: no such file or directory\nError: looks like \"git+https://www.github.com/outscale-dev/osc-bsu-csi-driver/@osc-bsu-csi-driver?ref=v1.1.0\" is not a valid chart repository or cannot be reached: plugin \"helm-git\" exited with error", "stderr_lines": ["Error: error evaluating symlink /tmp/helm-git.PJGIwI/osc-bsu-csi-driver/CHANGELOG.md: lstat /tmp/helm-git.PJGIwI/CHANGELOG-1.X.md: no such file or directory", "Error: looks like \"git+https://www.github.com/outscale-dev/osc-bsu-csi-driver/@osc-bsu-csi-driver?ref=v1.1.0\" is not a valid chart repository or cannot be reached: plugin \"helm-git\" exited with error"], "stdout": "", "stdout_lines": []}
PLAY RECAP *********************************************************************************************************************************************************************************************************************************
localhost : ok=3 changed=2 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
I'm on Ubuntu 22.04.1 LTS
(WSL2 version on Windows 11 but not sure it is related)
When the disk is already detached, the plugin is stuck like this:
{"Vms":[{"VmType":"tinav4.c32r64p1","VmInitiatedShutdownBehavior":"stop","State":"running","StateReason":"","RootDeviceType":"ebs","RootDeviceName":"/dev/sda1","IsSourceDestChecked":true,"KeypairName":"thanos","ImageId":"ami-1cda4f98","DeletionProtection":false,"VmId":"i-a7d27e54","ReservationId":"r-fade62d8","Hypervisor":"xen","CreationDate":"2022-03-03T13:42:50.624Z","UserData":"I2Nsb3VkLWNvbmZpZwoKeXVtX3JlcG9zOgogIGVwZWwtcmVsZWFzZToKICAgIGJhc2V1cmw6IGh0dHA6Ly9kb3dubG9hZC5mZWRvcmFwcm9qZWN0Lm9yZy9wdWIvZXBlbC83LyRiYXNlYXJjaAogICAgZW5hYmxlZDogdHJ1ZQogICAgZmFpbG92ZXJtZXRob2Q6IHByaW9yaXR5CiAgICBncGdjaGVjazogdHJ1ZQogICAgZ3Bna2V5OiBodHRwOi8vZG93bmxvYWQuZmVkb3JhcHJvamVjdC5vcmcvcHViL2VwZWwvUlBNLUdQRy1LRVktRVBFTC03CiAgICBuYW1lOiBFeHRyYSBQYWNrYWdlcyBmb3IgRW50ZXJwcmlzZSBMaW51eCA3IC0gUmVsZWFzZQogIHNhbHRzdGFjazoKICAgIGJhc2V1cmw6IGh0dHBzOi8vcmVwby5zYWx0cHJvamVjdC5pby9weTMvcmVkaGF0LzcveDg2XzY0LzMwMDIvCiAgICBlbmFibGVkOiB0cnVlCiAgICBmYWlsb3Zlcm1ldGhvZDogcHJpb3JpdHkKICAgIGdwZ2NoZWNrOiB0cnVlCiAgICBncGdrZXk6IGh0dHBzOi8vcmVwby5zYWx0cHJvamVjdC5pby9weTMvcmVkaGF0LzcveDg2XzY0LzMwMDIvU0FMVFNUQUNLLUdQRy1LRVkucHViCiAgICBuYW1lOiBTYWx0U3RhY2sgUmVwbyAzMDAwLjIKCiAgCnBhY2thZ2VzOgogIC0gaHRvcAogIC0gaW90b3AKICAtIGlmdG9wCiAgLSB2aW0KICAtIHNhbHQtbWluaW9uCgp3cml0ZV9maWxlczoKLSBjb250ZW50OiB8CiAgICBoYXNoX3R5cGU6IHNoYTI1NgogICAgaWQ6ICJwYXIxLWNsb3VkLXByb20td29ya2VyLTEiCiAgICBsb2dfbGV2ZWw6IGluZm8KICAgIG1hc3RlcjogMTAuMjQuMS41CiAgcGF0aDogL2V0Yy9zYWx0L21pbmlvbgotIGNvbnRlbnQ6IHwKICAgICJwYXIxLWNsb3VkLXByb20td29ya2VyLTEiCiAgcGF0aDogL2V0Yy9zYWx0L21pbmlvbl9pZAoKcnVuY21kOgogIC0gc3VkbyBob3N0bmFtZWN0bCBzZXQtaG9zdG5hbWUgInBhcjEtY2xvdWQtcHJvbS13b3JrZXItMSIKICAtIFsgc3lzdGVtY3RsLCBkYWVtb24tcmVsb2FkIF0KICAtIFsgc3lzdGVtY3RsLCBlbmFibGUsIHNhbHQtbWluaW9uIF0KICAtIFsgc3lzdGVtY3RsLCBzdGFydCwgLS1uby1ibG9jaywgc2FsdC1taW5pb24gXQo=","SubnetId":"subnet-a5e344cc","PrivateIp":"10.24.0.8","SecurityGroups":[{"SecurityGroupName":"eu-west-2-common","SecurityGroupId":"sg-6c9713c0"}],"BsuOptimized":false,"BlockDeviceMappings":[{"DeviceName":"/d
ev/sda1","Bsu":{"VolumeId":"vol-b5afd0ce","State":"attached","LinkDate":"2022-05-31T07:44:46.279Z","DeleteOnVmDeletion":false}}],"ProductCodes":["0001"],"Placement":{"Tenancy":"default","SubregionName":"eu-west-2a"},"Architecture":"x86_64","NestedVirtualization":false,"LaunchNumber":0,"NetId":"vpc-96a7ffe2","Nics":[{"SubnetId":"subnet-a5e344cc","AccountId":"542438614293","Description":"Primary network interface","IsSourceDestChecked":true,"PrivateDnsName":"ip-10-24-0-8.eu-west-2.compute.internal","State":"in-use","LinkNic":{"State":"attached","LinkNicId":"eni-attach-1648dc3d","DeviceNumber":0,"DeleteOnVmDeletion":true},"SecurityGroups":[{"SecurityGroupName":"eu-west-2-common","SecurityGroupId":"sg-6c9713c0"}],"MacAddress":"aa:e8:ef:ea:79:42","NetId":"vpc-96a7ffe2","NicId":"eni-b15daae5","PrivateIps":[{"PrivateDnsName":"ip-10-24-0-8.eu-west-2.compute.internal","PrivateIp":"10.24.0.8","IsPrimary":true}]}],"Performance":"highest","Tags":[{"Value":"par1-cloud-prom-worker-1","Key":"Name"},{"Value":"10.24.1.5","Key":"saltmaster"}],"PrivateDnsName":"ip-10-24-0-8.eu-west-2.compute.internal"}],"ResponseContext":{"RequestId":"4878a62e-1ecd-44fa-a5af-b3bd8c868057"}}
I1017 13:27:57.093845 1 cloud.go:995] Debug response DescribeInstances: response({ResponseContext:0xc0006a0250 Vms:0xc0004f5ce0}), err(<nil>), httpRes(&{200 OK 200 HTTP/1.1 1 1 map[Access-Control-Allow-Origin:[*] Connection:[keep-alive] Content-Length:[3173] Content-Type:[application/json] Date:[Mon, 17 Oct 2022 13:27:57 GMT] Referrer-Policy:[same-origin] Server:[nginx] Strict-Transport-Security:[max-age=31536000; includeSubdomains;] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[1; mode=block]] {{"Vms":[{"VmType":"tinav4.c32r64p1","VmInitiatedShutdownBehavior":"stop","State":"running","StateReason":"","RootDeviceType":"ebs","RootDeviceName":"/dev/sda1","IsSourceDestChecked":true,"KeypairName":"thanos","ImageId":"ami-1cda4f98","DeletionProtection":false,"VmId":"i-a7d27e54","ReservationId":"r-fade62d8","Hypervisor":"xen","CreationDate":"2022-03-03T13:42:50.624Z","UserData":"I2Nsb3VkLWNvbmZpZwoKeXVtX3JlcG9zOgogIGVwZWwtcmVsZWFzZToKICAgIGJhc2V1cmw6IGh0dHA6Ly9kb3dubG9hZC5mZWRvcmFwcm9qZWN0Lm9yZy9wdWIvZXBlbC83LyRiYXNlYXJjaAogICAgZW5hYmxlZDogdHJ1ZQogICAgZmFpbG92ZXJtZXRob2Q6IHByaW9yaXR5CiAgICBncGdjaGVjazogdHJ1ZQogICAgZ3Bna2V5OiBodHRwOi8vZG93bmxvYWQuZmVkb3JhcHJvamVjdC5vcmcvcHViL2VwZWwvUlBNLUdQRy1LRVktRVBFTC03CiAgICBuYW1lOiBFeHRyYSBQYWNrYWdlcyBmb3IgRW50ZXJwcmlzZSBMaW51eCA3IC0gUmVsZWFzZQogIHNhbHRzdGFjazoKICAgIGJhc2V1cmw6IGh0dHBzOi8vcmVwby5zYWx0cHJvamVjdC5pby9weTMvcmVkaGF0LzcveDg2XzY0LzMwMDIvCiAgICBlbmFibGVkOiB0cnVlCiAgICBmYWlsb3Zlcm1ldGhvZDogcHJpb3JpdHkKICAgIGdwZ2NoZWNrOiB0cnVlCiAgICBncGdrZXk6IGh0dHBzOi8vcmVwby5zYWx0cHJvamVjdC5pby9weTMvcmVkaGF0LzcveDg2XzY0LzMwMDIvU0FMVFNUQUNLLUdQRy1LRVkucHViCiAgICBuYW1lOiBTYWx0U3RhY2sgUmVwbyAzMDAwLjIKCiAgCnBhY2thZ2VzOgogIC0gaHRvcAogIC0gaW90b3AKICAtIGlmdG9wCiAgLSB2aW0KICAtIHNhbHQtbWluaW9uCgp3cml0ZV9maWxlczoKLSBjb250ZW50OiB8CiAgICBoYXNoX3R5cGU6IHNoYTI1NgogICAgaWQ6ICJwYXIxLWNsb3VkLXByb20td29ya2VyLTEiCiAgICBsb2dfbGV2ZWw6IGluZm8KICAgIG1hc3RlcjogMTAuMjQuMS41CiAgcGF0aDogL2V0Yy9zYWx0L21pbmlvbgotIGNvbnRlbnQ6IHwKICAgICJwY
XIxLWNsb3VkLXByb20td29ya2VyLTEiCiAgcGF0aDogL2V0Yy9zYWx0L21pbmlvbl9pZAoKcnVuY21kOgogIC0gc3VkbyBob3N0bmFtZWN0bCBzZXQtaG9zdG5hbWUgInBhcjEtY2xvdWQtcHJvbS13b3JrZXItMSIKICAtIFsgc3lzdGVtY3RsLCBkYWVtb24tcmVsb2FkIF0KICAtIFsgc3lzdGVtY3RsLCBlbmFibGUsIHNhbHQtbWluaW9uIF0KICAtIFsgc3lzdGVtY3RsLCBzdGFydCwgLS1uby1ibG9jaywgc2FsdC1taW5pb24gXQo=","SubnetId":"subnet-a5e344cc","PrivateIp":"10.24.0.8","SecurityGroups":[{"SecurityGroupName":"eu-west-2-common","SecurityGroupId":"sg-6c9713c0"}],"BsuOptimized":false,"BlockDeviceMappings":[{"DeviceName":"/dev/sda1","Bsu":{"VolumeId":"vol-b5afd0ce","State":"attached","LinkDate":"2022-05-31T07:44:46.279Z","DeleteOnVmDeletion":false}}],"ProductCodes":["0001"],"Placement":{"Tenancy":"default","SubregionName":"eu-west-2a"},"Architecture":"x86_64","NestedVirtualization":false,"LaunchNumber":0,"NetId":"vpc-96a7ffe2","Nics":[{"SubnetId":"subnet-a5e344cc","AccountId":"542438614293","Description":"Primary network interface","IsSourceDestChecked":true,"PrivateDnsName":"ip-10-24-0-8.eu-west-2.compute.internal","State":"in-use","LinkNic":{"State":"attached","LinkNicId":"eni-attach-1648dc3d","DeviceNumber":0,"DeleteOnVmDeletion":true},"SecurityGroups":[{"SecurityGroupName":"eu-west-2-common","SecurityGroupId":"sg-6c9713c0"}],"MacAddress":"aa:e8:ef:ea:79:42","NetId":"vpc-96a7ffe2","NicId":"eni-b15daae5","PrivateIps":[{"PrivateDnsName":"ip-10-24-0-8.eu-west-2.compute.internal","PrivateIp":"10.24.0.8","IsPrimary":true}]}],"Performance":"highest","Tags":[{"Value":"par1-cloud-prom-worker-1","Key":"Name"},{"Value":"10.24.1.5","Key":"saltmaster"}],"PrivateDnsName":"ip-10-24-0-8.eu-west-2.compute.internal"}],"ResponseContext":{"RequestId":"4878a62e-1ecd-44fa-a5af-b3bd8c868057"}}} 3173 [] false false map[] 0xc000838600 0xc00057a630})
W1017 13:27:57.093949 1 cloud.go:570] DetachDisk called on non-attached volume: vol-bd1eea74
2022/10/17 13:27:57
POST /api/v1/UnlinkVolume HTTP/1.1
Host: api.eu-west-2.outscale.com
User-Agent: osc-bsu-csi-driver/
Content-Length: 28
Accept: application/json
Authorization: AWS4-HMAC-SHA256 Credential=/20221017/eu-west-2/oapi/aws4_request, SignedHeaders=accept;content-type;host;x-amz-date, Signature=442d830c639266f4ef2bb83d2d6aaf4ceebcc92e1dbd4e73e6c4df376a80c332
Content-Type: application/json
X-Amz-Date: 20221017T132757Z
Accept-Encoding: gzip
{"VolumeId":"vol-bd1eea74"}
2022/10/17 13:27:57
HTTP/1.1 400 Bad Request
Content-Length: 179
Access-Control-Allow-Origin: *
Connection: keep-alive
Content-Type: application/json
Date: Mon, 17 Oct 2022 13:27:57 GMT
Server: nginx
{"Errors":[{"Type":"InvalidResource","Details":"The VolumeId 'vol-bd1eea74' doesn't exist.","Code":"5064"}],"ResponseContext":{"RequestId":"d7b87e6f-a7b3-4c15-9491-8bbbc3d8e7e7"}}
I1017 13:27:57.134696 1 cloud.go:579] Debug response DetachVolume: response({ResponseContext:<nil>}), err(400 Bad Request) httpRes(&{400 Bad Request 400 HTTP/1.1 1 1 map[Access-Control-Allow-Origin:[*] Connection:[keep-alive] Content-Length:[179] Content-Type:[application/json] Date:[Mon, 17 Oct 2022 13:27:57 GMT] Server:[nginx]] {{"Errors":[{"Type":"InvalidResource","Details":"The VolumeId 'vol-bd1eea74' doesn't exist.","Code":"5064"}],"ResponseContext":{"RequestId":"d7b87e6f-a7b3-4c15-9491-8bbbc3d8e7e7"}}} 179 [] false false map[] 0xc00057fc00 0xc00057a630})
400 Bad Request
E1017 13:27:57.134785 1 driver.go:112] GRPC error: rpc error: code = Internal desc = Could not detach volume "vol-bd1eea74" from node "i-a7d27e54": could not detach volume "vol-bd1eea74" from node "i-a7d27e54": 400 Bad Request / (<nil>)
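The 400 response above shows a detach attempt on a volume that no longer exists; an idempotent DetachVolume would treat Outscale error code 5064 (InvalidResource) as "already detached" instead of failing. A minimal sketch of that check (string matching on the response body is used here for illustration only; the real driver would inspect the typed osc-sdk-go error):

```go
package main

import (
	"fmt"
	"strings"
)

// isVolumeNotFound reports whether an UnlinkVolume error means the volume
// no longer exists (Outscale error code 5064, InvalidResource), in which
// case a detach can be treated as already done. Sketch only: the real
// driver would inspect the typed SDK error, not a raw string.
func isVolumeNotFound(apiErrBody string) bool {
	return strings.Contains(apiErrBody, `"Code":"5064"`)
}

func main() {
	body := `{"Errors":[{"Type":"InvalidResource","Details":"The VolumeId 'vol-bd1eea74' doesn't exist.","Code":"5064"}]}`
	if isVolumeNotFound(body) {
		fmt.Println("volume already detached; returning success")
	}
}
```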
Hello,
We would like to know if it is possible to access our created OMIs in Cockpit v1. We have access through Cockpit v2, but it is unpleasant to switch interfaces often just to see our OMIs.
Sincerely,
Jordan
Is your feature request related to a problem? Please describe.
I would like to use osc-bsu-csi-driver behind an HTTP proxy. A possible way to do this is to pass environment variables like https_proxy to the BSU CSI driver containers. However, this is not supported today, which forces me to work around it by patching the chart.
Describe the solution you'd like in detail
Two enhancements could be done:
- an extraEnv value to set static extra environment variables on the driver containers;
- an extraSecretEnv value for variables containing secrets (see suggestion below).

Additional context
Helm charts typically have knobs like extraEnv in the values.yaml to set static extra environment variables in the Deployments, StatefulSets, etc. Some variables could contain secrets, so an extraSecretEnv is also often offered, which manages a dedicated secret and its presentation.
Example overrides in a chart: https://github.com/hashicorp/vault-helm/blob/main/values.yaml#L511
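A possible shape for such knobs in the chart's values.yaml, mirroring common charts like vault-helm (the key names, proxy hosts, and secret handling shown here are suggestions, not the chart's current API):

```yaml
# Hypothetical values.yaml additions -- names are suggestions only.
extraEnv:
  - name: HTTPS_PROXY
    value: "http://proxy.internal:3128"
  - name: NO_PROXY
    value: "169.254.169.254,10.0.0.0/8"
# Values here would be rendered into a chart-managed Secret and
# presented to the containers via envFrom/secretKeyRef.
extraSecretEnv:
  PROXY_PASSWORD: "s3cret"
```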
What happened?
The name allocator only checks device names from a to z, and if something goes wrong with some disks we can run out of names, like this:
failed to attach: rpc error: code = Internal desc = Could not attach volume "vol-X" to node "i-X": could not get a free device name to assign to node i-X
What you expected to happen?
The API and Linux allow names like /dev/xvdYZ, with Y and Z in [a,z]. We need to update the allocator to handle more device names.
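The extended naming scheme above could be generated like this (a minimal illustration, not the driver's actual allocator; the function name and the choice to start single letters at b are assumptions):

```go
package main

import "fmt"

// deviceNames generates candidate device names: first single letters
// b..z (a is typically the root device), then two-letter combinations
// aa..zz, following the /dev/xvdYZ scheme described above.
func deviceNames() []string {
	names := []string{}
	for c := 'b'; c <= 'z'; c++ {
		names = append(names, fmt.Sprintf("/dev/xvd%c", c))
	}
	for c1 := 'a'; c1 <= 'z'; c1++ {
		for c2 := 'a'; c2 <= 'z'; c2++ {
			names = append(names, fmt.Sprintf("/dev/xvd%c%c", c1, c2))
		}
	}
	return names
}

func main() {
	names := deviceNames()
	fmt.Println(len(names))          // 25 single-letter + 676 two-letter = 701
	fmt.Println(names[25])           // /dev/xvdaa, first two-letter name
	fmt.Println(names[len(names)-1]) // /dev/xvdzz
}
```

This raises the per-node limit from 25 candidate names to 701, far beyond the 39-volume target mentioned elsewhere in these issues.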
How to reproduce it (as minimally and precisely as possible)?
Have 25 device names already in use and try to attach another disk
Environment
Is your feature request related to a problem?/Why is this needed
Deploying the CSI driver is easy but still requires a few manual operations.
Newcomers may want an easy way to install it through artifacthub.io
/feature
Describe the solution you'd like in detail
Publish Outscale CSI on operatorhub.io
What would you like to be added:
As a user and developer, I would like to be able to trace CSI calls to the Outscale API.
This would include the Kubernetes version and the CSI driver version.
Why is this needed:
Add more details to the User-Agent header.
It should be set in both branches (osc-migration and label-migration) in newOscCloud.
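The richer User-Agent could be assembled like this (a sketch; the exact format and how the string is then passed to the osc-sdk-go configuration are assumptions, not the driver's current behavior):

```go
package main

import (
	"fmt"
	"runtime"
)

// buildUserAgent assembles a User-Agent string carrying the driver, CSI
// spec, and Kubernetes versions, so Outscale API logs can attribute calls.
// Format is illustrative; the real driver may choose a different layout.
func buildUserAgent(driverVersion, csiVersion, k8sVersion string) string {
	return fmt.Sprintf("osc-bsu-csi-driver/%s csi/%s kubernetes/%s go/%s",
		driverVersion, csiVersion, k8sVersion, runtime.Version())
}

func main() {
	fmt.Println(buildUserAgent("1.2.3", "1.5.0", "v1.27.0"))
}
```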
/kind feature
/kind bug
What happened?
When reaching the maximum number of volumes attached to one node, we experience attachment issues.
This behavior happens from time to time.
Idea
Add a stress test for this.
Tracking issue for:
I'm looking for an archive of the osc-bsu-csi-driver chart to download, but the latest release did not produce any assets. For comparison, cloud-provider-osc assets are made available for each release: https://github.com/outscale/cloud-provider-osc/releases/tag/v0.2.0
It would be very helpful to have such assets.
Hi, ext4 is used as the default filesystem. In the Outscale cloud, we would recommend using xfs for better block usage with snapshotting.
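Until the default changes, users can opt into xfs per StorageClass via the standard CSI fstype parameter. A possible manifest (the provisioner name is assumed from the driver's naming and should be checked against your deployment):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: bsu-xfs
provisioner: bsu.csi.outscale.com   # assumed driver name; verify in your install
parameters:
  csi.storage.k8s.io/fstype: xfs    # standard CSI parameter for the filesystem
```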
Tracking issue for:
Since Kubernetes v1.17, CSI drivers can be topology aware.
By answering the NodeGetInfo RPC with accessible_topology, the external-provisioner sidecar will always emit a CreateVolume with an aggregated topology. Thus, the CSI driver just needs to read the requisite or preferred topology and create the volume in the right zone.
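The node side of that flow boils down to reporting a zone segment in NodeGetInfo. A self-contained sketch of building those segments (the topology key name is an assumption for illustration; the real driver would return them inside a csi.NodeGetInfoResponse from the CSI spec bindings):

```go
package main

import "fmt"

// buildAccessibleTopology returns the topology segments a node would
// report in its NodeGetInfo response, so the external-provisioner can
// route CreateVolume to the right subregion. The key name below is an
// assumption, not necessarily the driver's actual key.
func buildAccessibleTopology(subregion string) map[string]string {
	return map[string]string{
		"topology.bsu.csi.outscale.com/zone": subregion,
	}
}

func main() {
	fmt.Println(buildAccessibleTopology("eu-west-2a"))
}
```

On the controller side, CreateVolume would then read the requisite/preferred topology segments from the request and pick the matching subregion for the new volume.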