csm's People

Contributors

abhi16394, adarsh-dell, alikdell, atye, bharathsreekanth, boyamurthy, delldubey, gallacher, hoppea2, jackieung-dell, jooseppi-luna, nitesh3108, panigs7, prablr79, santhoshatdell, shanmydell, sharmilarama, shaynafinocchiaro, tdawe, vamsisiddu-7


csm's Issues

[BUG]: Isilon CSI driver reports: Error: Authorization required # DISCONNECTED SITE

Describe the bug
Following an Operator and CSI driver upgrade, we notice the following errors on the Isilon controller:
"Error in response Method DELETE URI:platform/2/protocols/nfs/exports Error: Authorization required JSON Error: Authorization required"
Old operator version 1.2
Old CSI driver version: 1.4
New Operator version 1.5
New CSI driver version 2.0.0

Following this error message, the CSI driver sends an authentication request with the username and password, authentication succeeds, and the delete operation proceeds without issue.
The next request triggers another authentication error, and the process repeats.

The error appears for every type of request, not only DELETE, so the entire log is filled with authentication failures and successes.
The time each operation takes to complete causes pods to time out due to PVC access issues.

Expected behavior
A bearer token should be saved once and sent with each new request, so the driver does not need to re-authenticate for every request.
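For illustration only, here is a minimal sketch (not the driver's actual code) of an HTTP client wrapper that caches a token and only re-authenticates when the array answers 401. The "Authorization: Bearer" header and the login callback are assumptions; OneFS session handling may use a session cookie instead.

package isiclient

import (
	"fmt"
	"net/http"
	"sync"
)

// sessionClient caches an auth token and re-authenticates only when the
// array rejects it, instead of logging in on every request.
type sessionClient struct {
	hc    *http.Client
	mu    sync.Mutex
	token string
	login func() (string, error) // performs user/password auth, returns a token (assumed)
}

func (c *sessionClient) do(req *http.Request) (*http.Response, error) {
	if c.currentToken() == "" {
		if err := c.refresh(); err != nil {
			return nil, err
		}
	}
	req.Header.Set("Authorization", "Bearer "+c.currentToken()) // header name is an assumption

	resp, err := c.hc.Do(req)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode == http.StatusUnauthorized {
		resp.Body.Close()
		if err := c.refresh(); err != nil {
			return nil, err
		}
		req.Header.Set("Authorization", "Bearer "+c.currentToken())
		// Note: a request with a body would need to be rebuilt before retrying.
		return c.hc.Do(req)
	}
	return resp, nil
}

func (c *sessionClient) currentToken() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.token
}

func (c *sessionClient) refresh() error {
	tok, err := c.login()
	if err != nil {
		return fmt.Errorf("re-authentication failed: %w", err)
	}
	c.mu.Lock()
	c.token = tok
	c.mu.Unlock()
	return nil
}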

Screenshots
Disconnected site

Logs
Disconnected site

System Information (please complete the following information):
OpenShift: 4.6.5
K8s: 1.19
Dell Operator 1.5
CSI driver 2.0.0
Isilon version 8.2.2.0
Isilon RUP 01-2021

[FEATURE]: CSM Replication: repctl UX improvements

Redesign repctl's look & feel to improve UX, based on feedback:

  • Use get instead of list.
  • The to-cluster argument naming in execute-action is not easily understandable. Investigate and come up with proper naming/architecture for executeAction.
  • Add a --wait argument to the execute action command so that repctl waits for the action to complete.

Add support for NodeGetVolumeStats RPC for kubelet metrics

Hi Dell folks,

We're using the Unity CSI driver and are missing metrics. Please add support for the NodeGetVolumeStats RPC according to the CSI spec.

This allows the kubelet to query the CSI plugin for a PVC’s status.
The kubelet then exposes that information in kubelet_volume_stats_* metrics.
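For reference, a minimal sketch of what such an implementation could look like, assuming the driver uses the standard CSI Go bindings; the statfs-based collection and the service/receiver stub below are illustrations, not the Unity driver's actual code. The driver would also need to advertise the GET_VOLUME_STATS node capability in NodeGetCapabilities.

package service

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"golang.org/x/sys/unix"
)

// service is a stub receiver for illustration only.
type service struct{}

// NodeGetVolumeStats reports capacity and inode usage for the volume mounted
// at req.VolumePath, which kubelet surfaces as kubelet_volume_stats_* metrics.
func (s *service) NodeGetVolumeStats(ctx context.Context, req *csi.NodeGetVolumeStatsRequest) (*csi.NodeGetVolumeStatsResponse, error) {
	var st unix.Statfs_t
	if err := unix.Statfs(req.GetVolumePath(), &st); err != nil {
		return nil, err
	}
	return &csi.NodeGetVolumeStatsResponse{
		Usage: []*csi.VolumeUsage{
			{
				Unit:      csi.VolumeUsage_BYTES,
				Total:     int64(st.Blocks) * st.Bsize,
				Available: int64(st.Bavail) * st.Bsize,
				Used:      int64(st.Blocks-st.Bfree) * st.Bsize,
			},
			{
				Unit:      csi.VolumeUsage_INODES,
				Total:     int64(st.Files),
				Available: int64(st.Ffree),
				Used:      int64(st.Files - st.Ffree),
			},
		},
	}, nil
}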

best regards

[FEATURE]: CSM reporting metrics for PV/PVC for PowerFlex

As a K8s admin, I want to view the PVC/PV metrics reported back to kubelet via the CSI driver.

In addition, I want to verify that the volume is still provisioned properly on the array (exists and is exported to the node) and that it is still accessible and correctly mounted at the indicated volume path.

[FEATURE]: Update Storage Class for Pods where a vTree storage migration has occurred.

Describe the solution you'd like
Should a pod's volume be migrated via the PowerFlex Manager UI or CLI, the Pod retains the original storage class name. It would be nice to see the storage class change once a successful vTree migration has occurred.

Describe alternatives you've considered

Additional context
Customers are interested in migrating data between various storage classes, or need to migrate a volume off a storage class that will be sunset due to older nodes. Upon migrating to a new system, the Pod should be updated to reflect the storage class change.

[FEATURE]: Review logging throughout the CSM-Replication code

The current logging in csm-replication often does not help in understanding a problem. We need to review messages and logging levels across the whole of csm-replication and the corresponding CSI drivers and fix (re-imagine) the logging approach.
Tips:

  • We want to see function inputs and outputs.
  • We want to use log.WithFields as much as possible (see the sketch after this list).
  • We should always use correct log levels.
  • No "UnknownErrors".
  • Error messages should be properly worded and should include some sort of simplified trace.
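As a quick illustration of the direction (a sketch only, assuming the code keeps using logrus), wrapping a function with structured fields could look like this; the function and its arguments are made up for the example:

package replication

import (
	log "github.com/sirupsen/logrus"
)

// reconcileVolume is a made-up example; the point is logging the inputs and
// the outcome with structured fields and sensible levels.
func reconcileVolume(volumeID, remoteCluster string) error {
	fields := log.Fields{"volumeID": volumeID, "remoteCluster": remoteCluster}
	log.WithFields(fields).Debug("reconcileVolume: called")

	err := doReconcile(volumeID, remoteCluster) // placeholder for the real work
	if err != nil {
		// Error level, with context and the wrapped error instead of an "UnknownError".
		log.WithFields(fields).WithError(err).Error("reconcileVolume: failed")
		return err
	}
	log.WithFields(fields).Info("reconcileVolume: succeeded")
	return nil
}

func doReconcile(volumeID, remoteCluster string) error { return nil }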

[FEATURE]: Security Feature for Mount request

  1. Requested feature
    Karavi Authorization should have an authorization feature covering mount requests.
    Kubernetes Pods should not be able to mount PVs that are claimed by other k8s clusters,
    so Karavi Authorization should deny existing-PV mount requests from Pods that did not issue the PVC for the target PV.

  2. Issue
    In a multi-tenancy CaaS environment, there is no tenant isolation feature for CSI block storage based on Dell EMC storage. Logically, Pods can mount a PV that was provisioned by a PVC issued from another k8s cluster. This is a security hole.
    This will prevent Dell EMC storage from being chosen for MEC in 5G platforms or a CSP's CaaS service platform.

[FEATURE]: Install/Upgrade Support via Helm for Authorization module sidecar

Description
Installation and upgrade are supported via Helm for the Authorization module sidecar. This will be incorporated in the following CSI Driver Helm charts:

  • Dell EMC CSI Driver for PowerFlex
  • Dell EMC CSI Driver for PowerMax
  • Dell EMC CSI Driver for PowerScale

In addition to this, the Authorization module proxy server will support upgrade via RPM.

[FEATURE]: Add version option in KARAVICTL

Describe the solution you'd like
Add a karavictl version command to check the version of the binary as well as the service, similar to what kubectl version does.
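For illustration, a hedged sketch of such a subcommand, assuming karavictl keeps using cobra and that a version string is injected at build time; querying the server-side service version is left as a TODO because its endpoint is not defined here.

package cmd

import (
	"fmt"

	"github.com/spf13/cobra"
)

// version would be injected at build time, e.g.
// go build -ldflags "-X <module>/cmd.version=1.x.y".
var version = "dev"

// NewVersionCmd returns a `karavictl version` subcommand (sketch only).
func NewVersionCmd() *cobra.Command {
	return &cobra.Command{
		Use:   "version",
		Short: "Print the karavictl client version",
		Run: func(cmd *cobra.Command, args []string) {
			fmt.Fprintf(cmd.OutOrStdout(), "karavictl client version: %s\n", version)
			// TODO: also query and print the Authorization service version,
			// similar to `kubectl version` showing client and server versions.
		},
	}
}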

Describe alternatives you've considered
Right now, I get an error from the command line but I cannot tell whether it is due to an incompatible version of the binary versus the karavi service.

[FEATURE]: CSM reporting metrics for PV/PVC for Isilon/PowerScale

As a K8s admin, I want to view the PVC/PV metrics reported back to kubelet via the CSI driver.

In addition, I want to verify that the volume is still provisioned properly on the array (exists and is exported to the node) and that it is still accessible and correctly mounted at the indicated volume path.

[FEATURE]: Pass Volume Name off to Storage System

Describe the solution you'd like
The myvalues (values).yaml controls the naming convention for volumes carved out by the CSI driver; K8S is the default, and it can be changed when the CSI driver is deployed. It would be helpful to allow the alternate name to be passed along to the storage array.

Describe alternatives you've considered
This would otherwise require the storage admin to manually change the name of the volume in the PowerFlex UI. If deploying volumes for a specific app, it would be good to pass this along automatically.

Additional context
N/A

[FEATURE]: Storage Volume Multi-Tenancy Support for Unity

The Unity storage array supports an IP multi-tenancy feature for assigning isolated, file-based storage partitions to the NAS servers on a storage processor.
By implementing this feature in the CSI Unity driver, customers will be able to associate a Tenant with storage volumes.

Additional reference: https://www.dell.com/community/Containers/CSI-dynamic-provisioning-in-a-multitenancy-model/td-p/7476098

Acceptance criteria: storage is provisioned and associated with hosts specific to the Tenant in the Kubernetes cluster.

[FEATURE]: Support SINGLE_NODE_SINGLE_WRITER and SINGLE_NODE_MULTI_WRITER modes for Unity

As a K8s user, I want CSI drivers to support the latest spec so that I can use the new features.

As part of this feature, the following CSI 1.5 spec capabilities need to be supported in the Unity CSI driver:

  • SINGLE_NODE_SINGLE_WRITER and SINGLE_NODE_MULTI_WRITER modes

Acceptance criteria: k8s pods will support ReadWriteOncePod and ReadWriteOnce as part of the CSI 1.5 spec.
For k8s 1.21 and below, the currently supported access modes will remain intact.
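For reference, a minimal sketch (not the Unity driver's actual code) of how the new CSI 1.5 access modes could be accepted alongside the existing ones during capability validation; the exact set a driver accepts is a design decision.

package service

import (
	"github.com/container-storage-interface/spec/lib/go/csi"
)

// accessModeSupported validates the CSI 1.5 access modes next to the pre-1.5
// ones; SINGLE_NODE_SINGLE_WRITER is what backs ReadWriteOncePod in k8s.
func accessModeSupported(mode csi.VolumeCapability_AccessMode_Mode) bool {
	switch mode {
	case csi.VolumeCapability_AccessMode_SINGLE_NODE_WRITER,
		csi.VolumeCapability_AccessMode_SINGLE_NODE_READER_ONLY,
		csi.VolumeCapability_AccessMode_MULTI_NODE_READER_ONLY,
		// New in CSI 1.5:
		csi.VolumeCapability_AccessMode_SINGLE_NODE_SINGLE_WRITER,
		csi.VolumeCapability_AccessMode_SINGLE_NODE_MULTI_WRITER:
		return true
	default:
		return false
	}
}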

[FEATURE]: CSM reporting metrics for PV/PVC for PowerStore

As a K8s admin, I want to view the PVC/PV metrics reported back to kubelet via the CSI driver.

In addition, I want to verify that the volume is still provisioned properly on the array (exists and is exported to the node) and that it is still accessible and correctly mounted at the indicated volume path.

[FEATURE]: Easier deployment of Authorization.

Describe the solution you'd like
Easier deployment of Authorization: introduce an alternative to deploy it as a container (pod), where prompts or a values.yaml can be generated at deployment time to connect to the PowerFlex system and enable access controls for storage consumption.

This would cut down on having another VM to deploy and would decrease the overall time to deploy.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

[FEATURE]: Display available capacity

As a K8s admin, I want to monitor the use of the available storage capacity for my cluster, so that I can plan capacity.

Available capacity should be expressed in storage units.
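As one possible shape for this (a sketch, not the existing CSM metrics code), the collector could publish per-pool available capacity as a gauge in bytes; the metric name and labels below are assumptions for illustration.

package metrics

import (
	"github.com/prometheus/client_golang/prometheus"
)

// availableCapacity exposes per-pool available capacity in bytes; the metric
// and label names are illustrative, not the module's actual names.
var availableCapacity = prometheus.NewGaugeVec(
	prometheus.GaugeOpts{
		Name: "storage_pool_available_capacity_bytes",
		Help: "Available capacity reported by the array, in bytes.",
	},
	[]string{"array", "pool"},
)

func init() {
	prometheus.MustRegister(availableCapacity)
}

// RecordPoolCapacity would be called after each poll of the array.
func RecordPoolCapacity(array, pool string, bytes float64) {
	availableCapacity.WithLabelValues(array, pool).Set(bytes)
}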

[FEATURE]: Trigger alarms when volume access latency exceeds norms

Describe the solution you'd like
I'm submitting this on behalf of Itzik, as it came up in a meeting this morning.
He described the need for some kind of running average of the latency of I/O operations on the array, perhaps broken down by node, by storage pool, for a particular volume, or overall for the array. The metrics would keep a history of past values (perhaps an exponential moving average that weights recent usage more highly than the distant past), and if the latency for the item exceeded that norm (the moving average) by some percentage, an alarm event of some kind would be triggered (perhaps a Grafana alert).
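A minimal sketch of the kind of check being described, using a plain exponential moving average and a percentage threshold; the type and parameter names are illustrative, and the alerting hook (e.g. exporting a metric for Grafana to alert on) is left out.

package latency

// ewmaAlarm keeps an exponential moving average of observed latencies and
// reports when a new sample exceeds the average by more than threshold
// (e.g. 0.5 means 50% above the norm).
type ewmaAlarm struct {
	alpha     float64 // weight of the newest sample, e.g. 0.1
	threshold float64
	avg       float64
	primed    bool
}

// Observe feeds one latency sample (in milliseconds, say) and returns true
// if it should trigger an alarm event.
func (e *ewmaAlarm) Observe(latency float64) bool {
	if !e.primed {
		e.avg = latency
		e.primed = true
		return false
	}
	alarm := latency > e.avg*(1+e.threshold)
	e.avg = e.alpha*latency + (1-e.alpha)*e.avg
	return alarm
}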

Describe alternatives you've considered
Similar facilities are generally already available in CloudIQ and the various array user interfaces; however, they do not report to the Kubernetes admins. Additionally, this might allow some kind of Kubernetes automation to be built around the alarm.

Additional context
This is an enhancement, not to be considered a bug. We can discuss priority and possible implementations.

[QUESTION]: Delay in removing LUNs attached to pods


How can the Team help you today?

There is a delay in removing LUNs attached to pods.


Details: ?
Where this issue occurs:
• Cluster OCP NOPROD of Cuyo
• With Driver CSI Unity 1.6

• PVC Name csivol-05e3f78392
• wwn: 0x60060160acd04e00ebc46d61578d24f2
• OS: Red Hat Enterprise Linux CoreOS release 4.7
• UNITY SV:5.0.1.0.5.011
• CSI Driver Version: 1.6.0

Problem that occurs:
• The problem occurs when a Pod with a Unity iSCSI PV terminates

• and a new Pod is immediately started using the same PV. The new Pod takes about 5 minutes to come up, indicating problems attaching the PV (because Unity still reports the PV as belonging to another Pod).

Node driver log while the Pod is deleted:

time="2021-10-19T14:00:47Z" level=info runid=361 msg="Executing NodeGetCapabilities with args: {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}" func="github.com/dell/csi-unity/service.(*service).NodeGetCapabilities()" file="/go/src/csi-unity/service/node.go:838"
time="2021-10-19T14:00:58Z" level=info msg="unmount command" cmd=umount path="/var/lib/kubelet/pods/4cddd421-88f0-415f-a8a4-f7aa8e86a066/volumes/kubernetes.io~csi/csivol-05e3f78392/mount"
time="2021-10-19T14:00:58Z" level=info runid=365 msg="Executing NodeGetCapabilities with args: {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}" func="github.com/dell/csi-unity/service.(*service).NodeGetCapabilities()" file="/go/src/csi-unity/service/node.go:838"
time="2021-10-19T14:00:59Z" level=info msg="unmount command" cmd=umount path=/var/lib/kubelet/plugins/kubernetes.io/csi/pv/csivol-05e3f78392/globalmount
time="2021-10-19T14:00:59Z" level=info msg="Check for disk path /dev/disk/by-id/dm-uuid-mpath-360060160acd04e00ebc46d61578d24f2 found: /dev/dm-1"
time="2021-10-19T14:00:59Z" level=info arrayid=ckm00195201796 runid=368 msg="resolve wwn for DM: dm-1" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:00:59Z" level=info arrayid=ckm00195201796 runid=368 msg="multipath - get block device included in DM" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:00:59Z" level=info arrayid=ckm00195201796 runid=368 msg="get devices by wwn 360060160acd04e00ebc46d61578d24f2" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:00:59Z" level=info arrayid=ckm00195201796 runid=368 msg="devices for WWN 360060160acd04e00ebc46d61578d24f2: [sdg sdk sdl sdm]" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:00:59Z" level=info arrayid=ckm00195201796 runid=368 msg="multipath - trying to find multipath DM name" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:00:59Z" level=info arrayid=ckm00195201796 runid=368 msg="multipath - start flush dm: /dev/dm-1" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:00:59Z" level=info arrayid=ckm00195201796 runid=368 msg="multipath command: chroot args: /noderoot multipath -f /dev/dm-1" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:01:27Z" level=info runid=369 msg="Executing NodeGetCapabilities with args: {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}" func="github.com/dell/csi-unity/service.(*service).NodeGetCapabilities()" file="/go/src/csi-unity/service/node.go:838"

Node driver log during the delayed LUN removal:

time="2021-10-19T14:01:59Z" level=info runid=370 msg="Executing NodeGetCapabilities with args: {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}" func="github.com/dell/csi-unity/service.(*service).NodeGetCapabilities()" file="/go/src/csi-unity/service/node.go:838"
time="2021-10-19T14:02:58Z" level=error arrayid=ckm00195201796 runid=368 msg="failed to flush multipath device: signal: killed" func="github.com/dell/csi-unity/service.(*customLogger).Error()" file="/go/src/csi-unity/service/service.go:714"
time="2021-10-19T14:02:59Z" level=info runid=371 msg="Executing NodeGetCapabilities with args: {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}" func="github.com/dell/csi-unity/service.(*service).NodeGetCapabilities()" file="/go/src/csi-unity/service/node.go:838"
time="2021-10-19T14:02:59Z" level=info runid=372 msg="Executing NodeGetCapabilities with args: {XXX_NoUnkeyedLiteral:{} XXX_unrecognized:[] XXX_sizecache:0}" func="github.com/dell/csi-unity/service.(*service).NodeGetCapabilities()" file="/go/src/csi-unity/service/node.go:838"
time="2021-10-19T14:02:59Z" level=info msg="Check for disk path /dev/disk/by-id/wwn-0x60060160acd04e00ebc46d61578d24f2 found: /dev/sdm"
time="2021-10-19T14:02:59Z" level=info arrayid=ckm00195201796 runid=368 msg="get devices by wwn 360060160acd04e00ebc46d61578d24f2" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:02:59Z" level=info arrayid=ckm00195201796 runid=368 msg="devices for WWN 360060160acd04e00ebc46d61578d24f2: [sdg sdk sdl sdm]" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:02:59Z" level=info arrayid=ckm00195201796 runid=368 msg="multipath - trying to find multipath DM name" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:02:59Z" level=info arrayid=ckm00195201796 runid=368 msg="multipath device not found: dm not found" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:02:59Z" level=info arrayid=ckm00195201796 runid=368 msg="device state is: running" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:02:59Z" level=info arrayid=ckm00195201796 runid=368 msg="writing '1' to /sys/class/block/sdg/device/delete" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:02:59Z" level=info arrayid=ckm00195201796 runid=368 msg="device state is: running" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:02:59Z" level=info arrayid=ckm00195201796 runid=368 msg="writing '1' to /sys/class/block/sdk/device/delete" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:02:59Z" level=info arrayid=ckm00195201796 runid=368 msg="device state is: running" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:02:59Z" level=info arrayid=ckm00195201796 runid=368 msg="writing '1' to /sys/class/block/sdl/device/delete" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:03:00Z" level=info arrayid=ckm00195201796 runid=368 msg="device state is: running" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:03:00Z" level=info arrayid=ckm00195201796 runid=368 msg="writing '1' to /sys/class/block/sdm/device/delete" func="github.com/dell/csi-unity/service.(*customLogger).Info()" file="/go/src/csi-unity/service/service.go:706"
time="2021-10-19T14:03:01Z" level=info msg="Check for disk path /dev/disk/by-id/wwn-0x60060160acd04e00ebc46d61578d24f2 not found"
time="2021-10-19T14:03:02Z" level=info msg="Check for disk path /dev/disk/by-id/wwn-0x60060160acd04e00ebc46d61578d24f2 not found"
time="2021-10-19T14:03:02Z" level=info msg="Check for disk path /dev/disk/by-id/wwn-0x60060160acd04e00ebc46d61578d24f2 not found"

Pod events when we recreate the pod with the same PVC and it takes a long time:

Normal Scheduled Successfully assigned movistar-3scale/simple-volume-pod-example-04 to ocpnp-7tcvg-worker-0-mtx7v
Warning FailedMount 3m11s kubelet, ocpnp-7tcvg-worker-0-mtx7v MountVolume.SetUp failed for volume "csivol-05e3f78392" : rpc error: code = Internal desc = runid=182 error publish volume to target path: mount failed: exit status 32
mounting arguments: -o bind /var/lib/kubelet/plugins/kubernetes.io/csi/pv/csivol-05e3f78392/globalmount /var/lib/kubelet/pods/5e8918ab-c55b-431b-ba9f-b1e7a232e6fa/volumes/kubernetes.iocsi/csivol-05e3f78392/mount
output: mount: /var/lib/kubelet/pods/5e8918ab-c55b-431b-ba9f-b1e7a232e6fa/volumes/kubernetes.io
csi/csivol-05e3f78392/mount: special device /var/lib/kubelet/plugins/kubernetes.io/csi/pv/csivol-05e3f78392/globalmount does not exist.
Warning FailedMount 3m10s kubelet, ocpnp-7tcvg-worker-0-mtx7v MountVolume.SetUp failed for volume "csivol-05e3f78392" : rpc error: code = NotFound desc = runid=184 Disk path not found. Error: readlink /dev/disk/by-id/wwn-0x60060160acd04e00ebc46d61578d24f2: no such file or directory
Warning FailedMount 3m9s kubelet, ocpnp-7tcvg-worker-0-mtx7v MountVolume.SetUp failed for volume "csivol-05e3f78392" : rpc error: code = NotFound desc = runid=188 Disk path not found. Error: readlink /dev/disk/by-id/wwn-0x60060160acd04e00ebc46d61578d24f2: no such file or directory
Warning FailedMount 3m7s kubelet, ocpnp-7tcvg-worker-0-mtx7v MountVolume.SetUp failed for volume "csivol-05e3f78392" : rpc error: code = NotFound desc = runid=192 Disk path not found. Error: readlink /dev/disk/by-id/wwn-0x60060160acd04e00ebc46d61578d24f2: no such file or directory
Warning FailedMount 3m2s kubelet, ocpnp-7tcvg-worker-0-mtx7v MountVolume.SetUp failed for volume "csivol-05e3f78392" : rpc error: code = NotFound desc = runid=197 Disk path not found. Error: readlink /dev/disk/by-id/wwn-0x60060160acd04e00ebc46d61578d24f2: no such file or directory
Warning FailedMount 2m54s kubelet, ocpnp-7tcvg-worker-0-mtx7v MountVolume.SetUp failed for volume "csivol-05e3f78392" : rpc error: code = NotFound desc = runid=201 Disk path not found. Error: readlink /dev/disk/by-id/wwn-0x60060160acd04e00ebc46d61578d24f2: no such file or directory
Warning FailedMount 2m37s kubelet, ocpnp-7tcvg-worker-0-mtx7v MountVolume.SetUp failed for volume "csivol-05e3f78392" : rpc error: code = NotFound desc = runid=206 Disk path not found. Error: readlink /dev/disk/by-id/wwn-0x60060160acd04e00ebc46d61578d24f2: no such file or directory
Warning FailedMount 2m5s kubelet, ocpnp-7tcvg-worker-0-mtx7v MountVolume.SetUp failed for volume "csivol-05e3f78392" : rpc error: code = NotFound desc = runid=214 Disk path not found. Error: readlink /dev/disk/by-id/wwn-0x60060160acd04e00ebc46d61578d24f2: no such file or directory
Warning FailedMount (x2 over 2m8s) kubelet, ocpnp-7tcvg-worker-0-mtx7v Unable to attach or mount volumes: unmounted volumes=[example-04], unattached volumes=[example-04 default-token-f2f56]: timed out waiting for the condition
Warning FailedMount (x2 over 60s) kubelet, ocpnp-7tcvg-worker-0-mtx7v (combined from similar events): MountVolume.SetUp failed for volume "csivol-05e3f78392" : rpc error: code = NotFound desc = runid=231 Disk path not found. Error: readlink /dev/disk/by-id/wwn-0x60060160acd04e00ebc46d61578d24f2: no such file or directory

[FEATURE]: Delete PowerFlex SDC ID when Worker is Removed from Cluster

Describe the solution you'd like
For PowerFlex systems where SDCs are deployed on k8s workers, if a worker is permanently removed from the cluster this creates an orphaned SDC ID (GUID):

SDC ID: d5d2119f00000025 Name: N/A IP: N/A State: Disconnected GUID:

While a new worker can be deployed and can take on the IP of the worker that was deleted, the removed worker's GUID remains as an artifact.

When removing a worker or destroying a k8s cluster, there should be the ability to pass the affected SDC IDs to the PowerFlex gateway so they can be removed, decreasing the alerts flagged in PFxM / Presentation Server.

Currently, storage administrators of the PowerFlex system must manually go in and remove these orphaned SDC IDs.
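For illustration only, a sketch of the kind of cleanup call a node-removal hook could make against the PowerFlex gateway. The removeSdc action path and the token-as-basic-auth scheme are assumptions taken from public PowerFlex/ScaleIO REST conventions and should be checked against the actual API reference.

package cleanup

import (
	"fmt"
	"net/http"
)

// removeSdc asks the PowerFlex gateway to drop an orphaned SDC by ID.
// The endpoint shape and auth below are assumptions for this sketch;
// consult the PowerFlex REST API guide for the exact contract.
func removeSdc(gatewayURL, token, sdcID string) error {
	url := fmt.Sprintf("%s/api/instances/Sdc::%s/action/removeSdc", gatewayURL, sdcID)
	req, err := http.NewRequest(http.MethodPost, url, nil)
	if err != nil {
		return err
	}
	req.SetBasicAuth("", token) // assumption: gateway accepts the session token as the password
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("removeSdc for %s returned %s", sdcID, resp.Status)
	}
	return nil
}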

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

[BUG]: PowerStore CSI Driver report iSCSI target error while using FC only

Describe the bug

The PowerStore CSI driver is configured for FC connectivity; however, it reports "AttachVolume.Attach failed for volume "csi-pstore-167d09d5dd" : rpc error: code = Internal desc = could not get iscsiTargets: can't get iscsi target address" when using CSI to provision a PVC from PowerStore.

The issue is only resolved by configuring the CSI driver with a fake iSCSI target; FC connectivity then works.

Looking at the code of addTargetsInfoToPublishContext(), it seems the code always checks iSCSI targets first and then FC; if no iSCSI target is configured, provisioning exits with an error and never proceeds to the FC target check. This is not ideal: the code should skip the iSCSI target check and proceed with the FC target check if no iSCSI targets are configured (see the sketch after the pasted function).

func (s *SCSIPublisher) addTargetsInfoToPublishContext(
	publishContext map[string]string, client gopowerstore.Client) error {
	iscsiTargetsInfo, err := common.GetISCSITargetsInfoFromStorage(client)
	if err != nil {
		return err
	}
	for i, t := range iscsiTargetsInfo {
		publishContext[fmt.Sprintf("%s%d", common.PublishContextISCSIPortalsPrefix, i)] = t.Portal
		publishContext[fmt.Sprintf("%s%d", common.PublishContextISCSITargetsPrefix, i)] = t.Target
	}
	fcTargetsInfo, err := common.GetFCTargetsInfoFromStorage(client)
	if err != nil {
		return err
	}
	for i, t := range fcTargetsInfo {
		publishContext[fmt.Sprintf("%s%d", common.PublishContextFCWWPNPrefix, i)] = t.WWPN
	}
	return nil
}
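A minimal sketch of the change being requested (not the actual driver code): treat a failure to fetch iSCSI targets as non-fatal, log it, and continue to the FC target lookup so FC-only configurations can publish volumes. Whether the "no iSCSI targets" case surfaces as an error or an empty list is an assumption the real fix would need to confirm, and the log.Warnf call is a placeholder for whatever logger the driver uses.

func (s *SCSIPublisher) addTargetsInfoToPublishContext(
	publishContext map[string]string, client gopowerstore.Client) error {
	// Sketch: do not fail the whole publish if iSCSI is simply not configured.
	iscsiTargetsInfo, err := common.GetISCSITargetsInfoFromStorage(client)
	if err != nil {
		log.Warnf("could not get iSCSI targets, continuing with FC targets only: %v", err)
	} else {
		for i, t := range iscsiTargetsInfo {
			publishContext[fmt.Sprintf("%s%d", common.PublishContextISCSIPortalsPrefix, i)] = t.Portal
			publishContext[fmt.Sprintf("%s%d", common.PublishContextISCSITargetsPrefix, i)] = t.Target
		}
	}
	fcTargetsInfo, err := common.GetFCTargetsInfoFromStorage(client)
	if err != nil {
		return err
	}
	for i, t := range fcTargetsInfo {
		publishContext[fmt.Sprintf("%s%d", common.PublishContextFCWWPNPrefix, i)] = t.WWPN
	}
	return nil
}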

To Reproduce
Steps to reproduce the behavior:

  1. Configure the CSI driver for FC connectivity without configuring iSCSI targets.
  2. Try to provision an FC-based PVC from PowerStore to a pod.
  3. See the error:
    "AttachVolume.Attach failed for volume "csi-pstore-167d09d5dd" : rpc error: code = Internal desc = could not get iscsiTargets: can't get iscsi target address"

Expected behavior
The user should be able to provision an FC-based PVC from PowerStore without configuring iSCSI targets.
The code in addTargetsInfoToPublishContext() should skip the iSCSI target check and proceed with the FC target check if no iSCSI targets are configured.

Screenshots
If applicable, add screenshots to help explain your problem.

Logs
If applicable, submit logs or stack traces from the affected services

System Information (please complete the following information):

  • OS/Version: [e.g. RHEL 7.6]
  • Kubernetes Version [e.g. 1.18]
  • Additional Information...

Additional context
Add any other context about the problem here.

[FEATURE]: Improve karavi-metrics logs to display the number of pulled metrics

Describe the solution you'd like
Current logs give the time taken to collect and push data:

2021/01/12 10:47:04 Looking up system ID 31ba0b0125517f0f Name powerflex01
2021/01/12 10:47:04 gatherMetrics took 2.25µs
2021/01/12 10:47:04 pushMetrics took 812ns
2021/01/12 10:47:04 GetSDCStatistics took 22.456348ms
2021/01/12 10:47:04 gatherPoolStatistics took 3.107µs
2021/01/12 10:47:04 pushPoolStatistics took 1.026µs
2021/01/12 10:47:04 gatherPoolStatistics took 1.742µs
2021/01/12 10:47:04 pushPoolStatistics took 729ns
2021/01/12 10:47:04 GetStoragePoolStatistics took 49.552945ms

While debugging an issue with the otel-collector, I was unsure which metrics were actually pushed.

Is it possible to add the number of pulled/pushed metrics to this log? For example:

2021/01/12 10:47:04 gatherMetrics XXX took 2.25µs

where XXX is the number of metrics.
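For illustration, a hedged sketch of a timing helper that also logs how many metrics a gather step produced; the Metric type and helper name are made up, not the existing karavi-metrics code.

package main

import (
	"log"
	"time"
)

// Metric is a stand-in for whatever record type the collector produces.
type Metric struct {
	Name  string
	Value float64
}

// timedGather times a gather step and logs how many metrics it produced,
// which is the extra piece of information asked for here.
func timedGather(name string, gather func() []Metric) []Metric {
	start := time.Now()
	metrics := gather()
	log.Printf("%s gathered %d metrics, took %s", name, len(metrics), time.Since(start))
	return metrics
}

func main() {
	timedGather("gatherMetrics", func() []Metric {
		return []Metric{{Name: "example", Value: 1}}
	})
}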

Is CSI for PowerStore supported with the versions below?

Dear team,

We have a customer who wants to install the CSI driver for PowerStore. Would you please help check whether the software versions below are supported? Thanks.

PowerStore: 2.0.1.0
OS Version: RedHat 7.6
Kernel Version : 54.145-1.el7.elrepo.x86_64
Kubernetes: 1.20.9
Rancher: 2.5.9
Docker version 20.10.6, build 370c289
calico: 3.17.2
etcd: 3.4.15
nginx-ingress: 0.43.0

[BUG]: Issue creating volume from Isilon snapshot

Describe the bug
An issue is seen when creating a volume from an Isilon snapshot; the snapshot itself got created correctly.
I will forward the logs and other information by email.
To Reproduce
Steps to reproduce the behavior:

  1. Create a snapshot (succeeds).
  2. Create a volume from the snapshot; this fails.
  3. See the error: time="2021-10-26T20:56:24Z" level=error msg="copy snapshot failed, 'Unable to open object

Expected behavior
A clear and concise description of what you expected to happen.

Screenshots
If applicable, add screenshots to help explain your problem.

Logs
Submitted via email.

System Information (please complete the following information):

  • OS/Version: OpenShift
  • Kubernetes Version [e.g. 1.18]
  • Additional Information...

Additional context
Add any other context about the problem here.

[BUG]: When the VolumeGroup Snapshot is ReadyToUse, the VolumeSnapshot doesn't have a Status

When the VolumeGroup Snapshot is Ready, the VolumeSnapshot doesn't have a Status.

Steps to reproduce the behavior:

  • Using dellemc/csi-volumegroup-snapshotter:v0.2.0, I created a VolumeGroup Snapshot of 3 PVCs with the same label "volume-group=volumeGroup1" using controller code.
  • After the DellCsiVolumeGroupSnapshot.Status ReadyToUse became true, the code tried to fetch the VolumeSnapshot and found that it is missing the Status field.

    VG Snapshot's Status:
    {
      "snapshotGroupID": "217b52c106138b0f-c56b7d7100000003",
      "snapshotGroupName": "volumegroup1-102021-184622",
      "snapshots": "volumegroup1-102021-184622-0-pvol0,volumegroup1-102021-184622-1-pvol1,volumegroup1-102021-184622-2-pvol2",
      "creationTime": "2021-10-20T18:45:25Z",
      "readyToUse": true,
      "status": "Complete"
    }
    ...
    Missing Status of volumesnapshot 'volumegroup1-102021-184622-2-pvol2' in ns 'helmtest-vxflexos'.

[BUG]: Topology based scheduling does not work when recreating Statefulsets

Describe the bug
Topology based scheduling is not working when recreating Statefulsets.
To Reproduce
Steps to reproduce the behavior:

  1. Create StorageClass with
allowedTopologies:
  - matchLabelExpressions:
      - key: "csi-vxflexos.dellemc.com/340906774c30210f"
        values:
          - "csi-vxflexos.dellemc.com"

and

volumeBindingMode: WaitForFirstConsumer

NOTE - PowerFlex is deployed/installed only on a subset of nodes in the cluster.
2. Create a StatefulSet without nodeAffinity; pods are scheduled on the right nodes.
3. Delete the StatefulSet created in step 2.
4. Recreate the StatefulSet created in step 2.
5. The pods are scheduled on nodes where PowerFlex is not installed.

Expected behavior
Pods are scheduled on nodes where Powerflex is installed, even after re-creating the statefulset

Logs
Controller logs - https://gist.github.com/skavyas/6e155d1cff3dc0c75ff039d0142b80a4

System Information (please complete the following information):

  • OS/Version: [e.g. RHEL 7.9]
  • Kubernetes Version [e.g. 1.19]

[QUESTION]: Facing an issue during deployment of PV/PVC with CSI 2.0

How can the Team help you today?

Facing an issue during deployment of a PV/PVC with CSI 2.0.
Environment details:

  1. EKS Anywhere deployment with VM OS ubuntu-v1.21.2-kubernetes-1-21-eks-4-amd64-0f34334
  2. CSI 2.0 deployed.

I am getting the error below when I deploy the app with a PV/PVC using CSI 2.0.

MountVolume.SetUp failed for volume "k8s-e331a62c87" : rpc error: code = Internal desc = error performing private mount: mount failed: exit status 32
mounting arguments: -t xfs -o nouuid,fsFormatOption:xfs,defaults /dev/scinia /var/lib/kubelet/plugins/vxflexos.emc.dell.com/disks/12dbccc1024d0a0f-72a87c1b00000006
output: mount: /var/lib/kubelet/plugins/vxflexos.emc.dell.com/disks/12dbccc1024d0a0f-72a87c1b00000006: wrong fs type, bad option, bad superblock on /dev/scinia, missing codepage or helper program, or other error.

I tried testing with the CSI test samples (starttest.sh 2vols); even that throws the same message.
On the backend PowerFlex storage, I do see volumes being created; the error occurs during the mount operation.
The default storage controller is set to vxflexos.

Any suggestion or workaround would help.

[BUG]: CSM Authorization unit test and gosec failures

Describe the bug
CSM Authorization has sporadic unit test failures when validating user-defined port ranges.
The gosec action reports false-positive security alerts in GitHub Actions.

To Reproduce
Steps to reproduce the behavior:

  1. Run go test and gosec ./...
  2. See the errors.

Expected behavior
Both actions should pass

Screenshots
If applicable, add screenshots to help explain your problem.

Logs
If applicable, submit logs or stack traces from the affected services

System Information (please complete the following information):

  • OS/Version: [e.g. RHEL 7.6]
  • Kubernetes Version [e.g. 1.18]
  • Additional Information...

Additional context
Add any other context about the problem here.

[BUG]: CSM Authorization: rpm install error with locating Makefile

Describe the bug
The CSM Authorization 1.0.0 rpm runs into an issue locating the Makefile during install.

To Reproduce
Download and install the .rpm package from the 1.0.0 release of CSM Authorization:
rpm -ivh karavi-authorization-1.0-0.x86_64.rpm
...
error: failed to read Makefile: open /viewsvn/lglbg082/jenkins/workspace/Ecosystems_Novus/Karavi_Authorization/karavi-authorization-release/karavi-authorization/Makefile: no such file or directory

Expected behavior
rpm install should not have any failures

System Information (please complete the following information):

  • OS/Version: CentOS 7
  • Kubernetes Version 1.18

[FEATURE]: Simplify access to karavictl binary for both Kubernetes and storage admins

Description

The karavictl binary is currently packaged within the CSM Authorization rpm that deploys the Authorization server. This does not currently support different host operating systems, and there is additional complexity in getting the binary from the Authorization server onto the Kubernetes access host. To avoid having to build/scp the binary and to make it available for various host operating systems, the karavictl binary will be made available from the GitHub releases page for the following host operating systems:

  • darwin-amd64
  • linux-amd64
  • linux-arm64
  • windows-amd64.exe

Note: Feature created as a result of https://github.com/dell/karavi-authorization/issues/98.

[FEATURE]: CSM reporting metrics for PV/PVC for Unity

As a K8s admin, I want to view the PVC/PV metrics reported back to kubelet via the CSI driver.

In addition, I want to verify that the volume is still provisioned properly on the array (exists and is exported to the node) and that it is still accessible and correctly mounted at the indicated volume path.

[BUG]: Extra VolumeSnapshots were created with a VolumeGroup Snapshot

Extra VolumeSnapshots were created with a VolumeGroup Snapshot.
We don't see this problem on all VolumeGroup Snapshots.

Steps to reproduce the behavior:

  1. Use dellemc/csi-volumegroup-snapshotter:v0.2.0 to create a VolumeGroup Snapshot of 3 PVCs with the same label "volume-group=volumeGroup1".

  2. Run the VolumeGroup Snapshot:

    apiVersion: volumegroup.storage.dell.com/v1alpha2
    kind: DellCsiVolumeGroupSnapshot
    metadata:
      name: "vg1-snap1"
      namespace: "helmtest-vxflexos"
    spec:
      driverName: "csi-vxflexos.dellemc.com"
      memberReclaimPolicy: "Delete"
      volumesnapshotclass: "vxflexos-snapclass"
      pvcLabel: "volumeGroup1"

  3. Get the result and find there are 6 VolumeSnapshots instead of 3:

    $ k get dellcsivolumegroupsnapshots.volumegroup.storage.dell.com -n helmtest-vxflexos vg1-snap1 -o yaml
    apiVersion: volumegroup.storage.dell.com/v1alpha2
    kind: DellCsiVolumeGroupSnapshot
    metadata:
      annotations:
        kubectl.kubernetes.io/last-applied-configuration: |
          {"apiVersion":"volumegroup.storage.dell.com/v1alpha2","kind":"DellCsiVolumeGroupSnapshot","metadata":{"annotations":{},"name":"vg1-snap1","namespace":"helmtest-vxflexos"},"spec":{"driverName":"csi-vxflexos.dellemc.com","memberReclaimPolicy":"Delete","pvcLabel":"volumeGroup1","volumesnapshotclass":"vxflexos-snapclass"}}
      creationTimestamp: "2021-10-07T21:28:43Z"
      finalizers:
      - vgFinalizer
      generation: 1
      name: vg1-snap1
      namespace: helmtest-vxflexos
      resourceVersion: "17321187"
      uid: b540018a-d032-4ed2-b808-27e43007d521
    spec:
      driverName: csi-vxflexos.dellemc.com
      memberReclaimPolicy: Delete
      pvcLabel: volumeGroup1
      volumesnapshotclass: vxflexos-snapclass
    status:
      creationTime: "2021-10-07T21:27:58Z"
      readyToUse: true
      snapshotGroupID: 217b52c106138b0f-c56b7c7b00000003
      snapshotGroupName: vg1-snap1-100721-212844
      snapshots: vg1-snap1-100721-212844-0-pvol1,vg1-snap1-100721-212844-1-pvol0,vg1-snap1-100721-212844-2-pvol2
      status: Complete

    $ k get volumesnapshot -n helmtest-vxflexos
    NAME READYTOUSE SOURCEPVC SOURCESNAPSHOTCONTENT RESTORESIZE SNAPSHOTCLASS SNAPSHOTCONTENT CREATIONTIME AGE
    vg1-snap1-100721-212843-0-pvol1 true snapcontent-0fa08d6e-62f8-44ef-860d-54d565da49fc 8Gi vxflexos-snapclass snapcontent-0fa08d6e-62f8-44ef-860d-54d565da49fc 56s 9s
    vg1-snap1-100721-212843-1-pvol0 true snapcontent-930aba6b-758b-4e66-9df1-c00a208d3b1e 8Gi vxflexos-snapclass snapcontent-930aba6b-758b-4e66-9df1-c00a208d3b1e 56s 9s
    vg1-snap1-100721-212843-2-pvol2 true snapcontent-57f03380-9715-40f5-a5de-25ec80f74a8e 8Gi vxflexos-snapclass snapcontent-57f03380-9715-40f5-a5de-25ec80f74a8e 56s 9s
    vg1-snap1-100721-212844-0-pvol1 true snapcontent-e5e9a2e0-caa7-4020-8a06-432ba4d27f60 8Gi vxflexos-snapclass snapcontent-e5e9a2e0-caa7-4020-8a06-432ba4d27f60 55s 9s
    vg1-snap1-100721-212844-1-pvol0 true snapcontent-61dc29f7-dddf-4c0a-9b0b-fb957188924d 8Gi vxflexos-snapclass snapcontent-61dc29f7-dddf-4c0a-9b0b-fb957188924d 55s 9s
    vg1-snap1-100721-212844-2-pvol2 true snapcontent-2ecfdd38-fa4a-4bb5-ab9e-0347b3b3fc78 8Gi vxflexos-snapclass snapcontent-2ecfdd38-fa4a-4bb5-ab9e-0347b3b3fc78 55s 9s

Expected behavior
Only 3 VolumeSnapshots should be created.

[FEATURE]: Migrate Role Management Logic to Authorization Server

Describe the solution you'd like
The role create command logic should move from karavictl to the authorization server

Describe alternatives you've considered
Leaving the logic in the karavictl binary, which makes it less portable and more bloated.

Additional context
N/A

[FEATURE]: CSM Resiliency supports evacuation of pods during NoExecute taint on node

Describe the feature

The original Resiliency design refused to force delete pods if they were potentially doing I/O to the array. The customer's request is a valid one, although not quite as safe as the current behavior of Resiliency. I would propose that we make this behavior an option, i.e.:

  • A new configuration variable is introduced, "forceDeleteOnNoExecuteTaint" or something similar, that changes the current behavior so that if podmon receives a notification that the pod is Not Ready and the node has a "NoExecute" taint, we force delete the pod. The default would be false, resulting in the current behavior. (See the taint-check sketch after this list.)
  • If the pod has a grace period toleration for NoExecute (as most do; the default is 5 minutes), we need to act before the grace period expires so that no replacement pods are created on the same node being evacuated because of pod affinity. We could do it immediately upon receiving the NoExecute notification, or perhaps wait half the duration of the toleration (normally 2-3 minutes) in case the node becomes Ready rather quickly.
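For illustration, a hedged sketch of the taint check podmon could make before force-deleting, assuming client-go access to the Node object; the proposed forceDeleteOnNoExecuteTaint option referenced above is not an existing setting.

package monitor

import (
	corev1 "k8s.io/api/core/v1"
)

// hasNoExecuteTaint reports whether the node carries any NoExecute taint,
// which (combined with the pod being Not Ready and the proposed
// forceDeleteOnNoExecuteTaint option) would allow a force delete.
func hasNoExecuteTaint(node *corev1.Node) bool {
	for _, taint := range node.Spec.Taints {
		if taint.Effect == corev1.TaintEffectNoExecute {
			return true
		}
	}
	return false
}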

Additional context
This feature has been converted from a bug. Logs have been attached.

session-logs.txt

[QUESTION]: Long time to bind PVC

Hi.

The latest driver is installed with Helm on OpenShift 4.8 with Isilon storage.
When creating a PVC, the time until it gets bound seems long, about 35 seconds.
Any suggestions to improve this?
Some log output from when a PVC is created is attached.

Regards,
-Ulf

[DEBUG]
-------------------------- GOISILON HTTP RESPONSE -------------------------
HTTP/1.1 404 Not Found
Transfer-Encoding: chunked
Allow: GET, PUT, POST, DELETE, HEAD
Content-Security-Policy: default-src 'none'
Content-Type: application/json
Date: Thu, 21 Oct 2021 06:48:50 GMT
Server: Apache
Strict-Transport-Security: max-age=31536000;
X-Frame-Options: sameorigin
X-Isi-Ifs-Spec-Version: 1.0

[DEBUG] Error in response. Method:GET URI:namespace/ifs/dst002dc50/nfs_dcc/ocp_test Error: Unable to open object 'dst002dc50/nfs_dcc/ocp_test/k8s-de8dfb1546' in store 'ifs' -- not found. JSON Error: Unable to open object 'dst002dc50/nfs_dcc/ocp_test/k8s-de8dfb1546' in store 'ifs' -- not found.
[DEBUG] the query of volume (id '', name 'k8s-de8dfb1546') returned an error, regard the volume as non-existent. error : 'Unable to open object 'dst002dc50/nfs_dcc/ocp_test/k8s-de8dfb1546' in store 'ifs' -- not found.'
time="2021-10-21T06:48:50Z" level=debug clusterName=dc50nfsdcc runid=19 msg="volume (id :'', name 'k8s-de8dfb1546') already exists : 'false'" file="/go/src/service/isiService.go:412"
time="2021-10-21T06:48:50Z" level=debug clusterName=dc50nfsdcc runid=19 msg="begin getting export with target path '/ifs/dst002dc50/nfs_dcc/ocp_test/k8s-de8dfb1546' and access zone 'dc50nfsdcc' for Isilon" file="/go/src/service/isiService.go:633"

[DEBUG] Execution successful on Method: GET, URI: platform/2/protocols/nfs/exports
time="2021-10-21T06:48:50Z" level=error clusterName=dc50nfsdcc runid=19 msg="error retrieving export ID for 'k8s-de8dfb1546', set it to 0. error : ''.
" file="/go/src/service/controller.go:320"
time="2021-10-21T06:48:50Z" level=error clusterName=dc50nfsdcc runid=19 msg="request parameters: the path is '/ifs/dst002dc50/nfs_dcc/ocp_test/k8s-de8dfb1546', and the access zone is 'dc50nfsdcc'." file="/go/src/service/controller.go:321"
time="2021-10-21T06:48:50Z" level=debug clusterName=dc50nfsdcc runid=19 msg="create volume with header metadata 'k8s-de8dfb1546' has been resolved to 'map[x-csi-pv-claimname:dxy x-csi-pv-name:k8s-de8dfb1546 x-csi-pv-namespace:fredrik]'" file="/go/src/service/controller.go:345"
time="2021-10-21T06:48:50Z" level=debug clusterName=dc50nfsdcc runid=19 msg="begin to create volume 'k8s-de8dfb1546'" file="/go/src/service/isiService.go:102"
time="2021-10-21T06:48:50Z" level=debug clusterName=dc50nfsdcc runid=19 msg="header metadata 'map[x-csi-pv-claimname:dxy x-csi-pv-name:k8s-de8dfb1546 x-csi-pv-namespace:fredrik]'" file="/go/src/service/isiService.go:103"
