Comments (11)

rootfs commented on August 26, 2024

This is the problem:

librados: client.admin authentication error (95) Operation not supported
[errno 95] error connecting to the cluster

At the time the user is to be deleted, the volume should have already been deleted, which confirms that the credentials and the cluster are working.

@mkimuram which Ceph cluster are you using? Could you please get the Ceph mon logs and see if there is any message about deleting the user?

@gman0 have you seen this error before?

gman0 commented on August 26, 2024

@rootfs no. Indeed, ceph logs could help.

mkimuram commented on August 26, 2024

@gman0 @rootfs

which Ceph cluster are you using?

I'm using the Ceph Docker image used in e2e testing
(created from https://github.com/kubernetes/kubernetes/tree/master/test/images/volumes-tester/rbd).

could you please get the Ceph mon logs and see if there is any message about deleting the user?

Logs were extracted as shown below. Please check the attached file:
ceph_logs.txt

# kubectl exec -it ceph-server -n e2e-tests-csi-mock-plugin-6hphk -- find /var/log/ceph/ -name "*log"
/var/log/ceph/ceph-osd.1.log
/var/log/ceph/ceph-mon.a.log
/var/log/ceph/ceph-client.admin.log
/var/log/ceph/ceph.log
/var/log/ceph/ceph-osd.0.log
/var/log/ceph/ceph.audit.log
/var/log/ceph/ceph-mds.cephfs.log
# kubectl exec -it ceph-server -n e2e-tests-csi-mock-plugin-6hphk -- find /var/log/ceph/ -name "*log" -print -exec cat {} \; > /tmp/ceph_logs.txt

rootfs commented on August 26, 2024

There is no auth rm entry in ceph.audit.log; the last messages are:

2018-07-19 18:20:48.820491 mon.0 172.17.0.3:6789/0 78 : audit [INF] from='client.? 172.17.0.3:0/386396941' entity='client.admin' cmd=[{"prefix": "fs new", "data": "cephfs_data", "fs_name": "cephfs", "metadata": "cephfs_metadata"}]: dispatch
2018-07-19 18:20:51.959515 mon.0 172.17.0.3:6789/0 81 : audit [INF] from='client.? 172.17.0.3:0/386396941' entity='client.admin' cmd='[{"prefix": "fs new", "data": "cephfs_data", "fs_name": "cephfs", "metadata": "cephfs_metadata"}]': finished

rootfs commented on August 26, 2024

@mkimuram can you post the relevant e2e test console messages too? I wonder if the ceph cluster was being torn down (or already torn down) before the PV was deleted in the background.

rootfs commented on August 26, 2024

@mkimuram this is the message I would look for to ensure the PV is deleted before the cluster is torn down:
https://github.com/mkimuram/kubernetes/blob/b461cb7acad0a0912fda198b8df46cd5ae452f0a/test/e2e/storage/volume_provisioning.go#L166
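
For context, the check behind that message boils down to polling until the PV object is gone from the API server. Below is a minimal Go sketch of that kind of wait, not the actual e2e helper; waitForPVDeleted is an illustrative name, and it assumes a pre-context client-go as was current at the time.

package main

import (
    "fmt"
    "time"

    apierrors "k8s.io/apimachinery/pkg/api/errors"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/util/wait"
    "k8s.io/client-go/kubernetes"
)

// waitForPVDeleted polls until the named PersistentVolume no longer exists,
// which is what the "Waiting up to 20m0s for PersistentVolume ... to get
// deleted" console message corresponds to.
func waitForPVDeleted(c kubernetes.Interface, pvName string, poll, timeout time.Duration) error {
    return wait.Poll(poll, timeout, func() (bool, error) {
        _, err := c.CoreV1().PersistentVolumes().Get(pvName, metav1.GetOptions{})
        if apierrors.IsNotFound(err) {
            return true, nil // the PV is gone, deletion has completed
        }
        if err != nil {
            return false, err // unexpected API error, stop waiting
        }
        fmt.Printf("PersistentVolume %s still exists, retrying\n", pvName)
        return false, nil
    })
}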

mkimuram commented on August 26, 2024

@rootfs

I ran the e2e test again, saving the console log (e2e-console.txt), and waited until the "Waiting up to 20m0s for PersistentVolume pvc-xxx to get deleted" message showed up.
After that, I collected the logs (all_logs.txt) with the script below.

However, I still can't find auth rm.
This time, I also included logs for csi-external-provisioner and cephfs-driver for the relevant run.
I hope this helps.

# cat getlog.sh 
#! /bin/bash -x 
date
kubectl logs ceph-server -n $namespace
kubectl logs csi-cephfs-controller-0 -n $namespace -c csi-external-provisioner
kubectl logs $cephfsnode -n $namespace -c cephfs-driver
kubectl exec -it ceph-server -n $namespace -- find /var/log/ceph/ -name "*log"
kubectl exec -it ceph-server -n $namespace -- find /var/log/ceph/ -name "*log" -print -exec cat {} \;

# export namespace=e2e-tests-csi-mock-plugin-9sbdt
# export cephfsnode=csi-cephfs-node-jkxxx
# ./getlog.sh > all_logs.txt 2>&1

rootfs commented on August 26, 2024

Right, the timeline matches: in the CSI logs, DeleteVolume happened before the cluster was torn down.

mkimuram commented on August 26, 2024

Sharing logs with debug messages:
all_logs5.txt
e2e-console5.txt

rootfs commented on August 26, 2024

This is what happened.

The e2e test just creates and deletes the volume; it doesn't publish it, so the volume is never mounted during the test.

However, the CephFS controller service doesn't create the user; the node service creates it here. So by the time the volume is to be deleted by the controller service, the user doesn't exist.
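
A minimal sketch of that mismatch, purely illustrative and not the actual ceph-csi code: createCephUser and deleteCephUser are hypothetical helpers, and the csi import stands in for the CSI spec Go bindings.

package main

import (
    "context"

    "github.com/container-storage-interface/spec/lib/go/csi"
    "google.golang.org/grpc/codes"
    "google.golang.org/grpc/status"
)

type controllerServer struct{}
type nodeServer struct{}

// Hypothetical helpers standing in for the driver's ceph user management.
func createCephUser(volID string) error { return nil }
func deleteCephUser(volID string) error { return nil }

// Controller service: DeleteVolume assumes the per-volume ceph user exists.
func (cs *controllerServer) DeleteVolume(ctx context.Context, req *csi.DeleteVolumeRequest) (*csi.DeleteVolumeResponse, error) {
    // If the volume was never published, the user was never created, so
    // removing it (or connecting with its credentials) fails at this point.
    if err := deleteCephUser(req.GetVolumeId()); err != nil {
        return nil, status.Error(codes.Internal, err.Error())
    }
    // ... delete the CephFS volume itself ...
    return &csi.DeleteVolumeResponse{}, nil
}

// Node service: the per-volume user only comes into existence at publish
// time, a step the create/delete-only e2e test never reaches.
func (ns *nodeServer) NodePublishVolume(ctx context.Context, req *csi.NodePublishVolumeRequest) (*csi.NodePublishVolumeResponse, error) {
    if err := createCephUser(req.GetVolumeId()); err != nil {
        return nil, status.Error(codes.Internal, err.Error())
    }
    // ... mount the volume ...
    return &csi.NodePublishVolumeResponse{}, nil
}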

@gman0

gman0 commented on August 26, 2024

@rootfs very nice find, thanks! I'll move this to NodeStageVolume/NodeUnstageVolume.
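
Continuing the sketch above (same imports and the same hypothetical createCephUser/deleteCephUser helpers), the relocation could roughly look like this, so that the user's lifetime is tied to staging and DeleteVolume no longer depends on it:

// Create the per-volume user when the volume is staged on the node.
func (ns *nodeServer) NodeStageVolume(ctx context.Context, req *csi.NodeStageVolumeRequest) (*csi.NodeStageVolumeResponse, error) {
    if err := createCephUser(req.GetVolumeId()); err != nil {
        return nil, status.Error(codes.Internal, err.Error())
    }
    // ... mount the volume at req.GetStagingTargetPath() ...
    return &csi.NodeStageVolumeResponse{}, nil
}

// Remove the user when the volume is unstaged, so the controller's
// DeleteVolume no longer needs a user that may never have been created.
func (ns *nodeServer) NodeUnstageVolume(ctx context.Context, req *csi.NodeUnstageVolumeRequest) (*csi.NodeUnstageVolumeResponse, error) {
    // ... unmount the staging path ...
    if err := deleteCephUser(req.GetVolumeId()); err != nil {
        return nil, status.Error(codes.Internal, err.Error())
    }
    return &csi.NodeUnstageVolumeResponse{}, nil
}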
