Comments (12)

ash2k commented on June 3, 2024

Another option would be to have a mutating admission hook that rewrites deletion requests to do foreground deletion. Seems like the easiest option. I don't want to spend time re-implementing Kube GC in Smith, especially if it works just fine.

nilebox commented on June 3, 2024

@ash2k the comment for BlockOwnerDeletion says the following:

	// If true, AND if the owner has the "foregroundDeletion" finalizer, then
	// the owner cannot be deleted from the key-value store until this
	// reference is removed.
	// Defaults to false.
	// To set this field, a user needs "delete" permission of the owner,
	// otherwise 422 (Unprocessable Entity) will be returned.
	// +optional
	BlockOwnerDeletion *bool `json:"blockOwnerDeletion,omitempty" protobuf:"varint,7,opt,name=blockOwnerDeletion"`

i.e. the Bundle needs to have a "foregroundDeletion" finalizer.
See https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/#foreground-cascading-deletion

The object’s metadata.finalizers contains the value “foregroundDeletion”.

In other words, we need to manually add "foregroundDeletion" to the list of Bundle finalizers. After that, the Garbage Collector should delete all the objects while respecting the dependency graph.
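
A minimal client-go sketch of adding that finalizer (current client-go signatures; the Bundle GroupVersionResource below is an assumption for illustration, not taken from Smith's code):

package example

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
)

// addForegroundDeletionFinalizer adds the "foregroundDeletion" finalizer to a
// Bundle so that the Bundle stays in storage until all dependents that set
// blockOwnerDeletion: true are deleted.
func addForegroundDeletionFinalizer(ctx context.Context, client dynamic.Interface, namespace, name string) error {
	// Assumed GVR for Smith's Bundle, for illustration only.
	bundleGVR := schema.GroupVersionResource{Group: "smith.atlassian.com", Version: "v1", Resource: "bundles"}

	bundle, err := client.Resource(bundleGVR).Namespace(namespace).Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return err
	}
	for _, f := range bundle.GetFinalizers() {
		if f == metav1.FinalizerDeleteDependents { // "foregroundDeletion"
			return nil // already present, nothing to do
		}
	}
	bundle.SetFinalizers(append(bundle.GetFinalizers(), metav1.FinalizerDeleteDependents))
	_, err = client.Resource(bundleGVR).Namespace(namespace).Update(ctx, bundle, metav1.UpdateOptions{})
	return err
}

A real implementation would retry the update on conflict.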

nilebox commented on June 3, 2024

In order to make GC work for deletion of resources, we should also set an ownerReference pointing to the owning Bundle. Currently we only add a label pointing to the Bundle, plus ownerReferences pointing to the other resources that the given resource depends on.

e.g. a Binding depends on an Instance and has a corresponding ownerReference, but doesn't have an ownerReference pointing to the Bundle:

apiVersion: servicecatalog.k8s.io/v1beta1
kind: ServiceBinding
metadata:
  creationTimestamp: 2018-02-13T04:21:35Z
  finalizers:
  - kubernetes-incubator/service-catalog
  generation: 1
  labels:
    smith.atlassian.com/BundleName: v-test
  name: showenv--sqs
  namespace: sctest
  ownerReferences:
  - apiVersion: servicecatalog.k8s.io/v1beta1
    blockOwnerDeletion: true
    kind: ServiceInstance
    name: sqs
    uid: 584cc40a-1075-11e8-8819-32d7bcc7b2c9
  resourceVersion: "1302"
  selfLink: /apis/servicecatalog.k8s.io/v1beta1/namespaces/sctest/servicebindings/showenv--sqs
  uid: 60a01f18-1075-11e8-8819-32d7bcc7b2c9
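
For comparison, a sketch of the extra ownerReference Smith would need to add; the apiVersion/kind are assumptions about Smith's types, and the UID must come from the live Bundle object:

package example

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/types"
)

// bundleOwnerReference builds the extra ownerReference pointing at the owning
// Bundle. blockOwnerDeletion: true is what makes foreground deletion wait for
// this dependent. Controller is left unset because only one reference per
// object may be the controller, and here that may already be the ServiceInstance.
func bundleOwnerReference(bundleName string, bundleUID types.UID) metav1.OwnerReference {
	block := true
	return metav1.OwnerReference{
		APIVersion:         "smith.atlassian.com/v1", // assumed apiVersion
		Kind:               "Bundle",
		Name:               bundleName,
		UID:                bundleUID,
		BlockOwnerDeletion: &block,
	}
}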

nilebox commented on June 3, 2024

I'll pick up this task and try to fix it using foregroundDeletion.

nilebox commented on June 3, 2024

I ran an experiment to check the deletion behavior with blockOwnerDeletion.

---
apiVersion: v1
kind: Secret
metadata:
  name: root
  namespace: sctest
type: Opaque
---
apiVersion: v1
kind: Secret
metadata:
  name: middle
  namespace: sctest
  ownerReferences:
  - apiVersion: v1
    blockOwnerDeletion: true
    controller: true
    kind: Secret
    name: root
    # after creating "root", replace this UID with its actual UID
    uid: 8c9ed2c3-109c-11e8-ba18-0a517c7aa8f4
type: Opaque
---
apiVersion: v1
kind: Secret
metadata:
  name: leaf
  namespace: sctest
  finalizers: # should block deletion of parent?
  - nislamov/test-blocker
  ownerReferences:
  - apiVersion: v1
    blockOwnerDeletion: true
    controller: true
    kind: Secret
    name: middle
    # after creating "middle", replace this UID with its actual UID
    uid: 8cba4a7a-109c-11e8-ba18-0a517c7aa8f4
type: Opaque

  • created 3 secrets (root, middle and leaf)
  • marked them with ownerReferences (root <- middle <- leaf) with blockOwnerDeletion: true
  • marked the leaf secret with the nislamov/test-blocker finalizer (which should prevent the object from being deleted)
  • deleted the root resource:
❯ k delete secret -n sctest root
secret "root" deleted

Result: the root and middle secrets are gone; the leaf secret is marked with deletionTimestamp but still exists (because of the custom finalizer).

nilebox commented on June 3, 2024

Apparently kubectl doesn't use propagationPolicy: Foreground yet; I guess that's the source of the problem: kubernetes/kubernetes#46659
And that's why the documentation gives an example of passing propagationPolicy: Foreground as a raw API request instead of a kubectl delete command: https://kubernetes.io/docs/concepts/workloads/controllers/garbage-collection/#setting-the-cascading-deletion-policy.
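
For reference, the same foreground delete issued programmatically rather than via a raw request; a sketch with current client-go signatures, where a Secret client stands in for whichever resource is being deleted:

package example

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// deleteWithForeground issues the DELETE with propagationPolicy: Foreground,
// which is what the docs' raw-request example does and kubectl doesn't do yet.
func deleteWithForeground(ctx context.Context, client kubernetes.Interface, namespace, name string) error {
	foreground := metav1.DeletePropagationForeground
	return client.CoreV1().Secrets(namespace).Delete(ctx, name, metav1.DeleteOptions{
		PropagationPolicy: &foreground,
	})
}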

kubectl currently does some custom client-side cascading deletion, and I'm not even sure how reliable it is (whether it always marks all child objects with deletionTimestamp or not).
That might also explain why I've seen ServiceInstances marked for deletion while ServiceBindings were not: kubectl may just give up once it sees that the instance is marked with a custom finalizer. I will test this case too.

nilebox commented on June 3, 2024

So most probably GC foreground deletion works fine when you delete the Bundle through an API call with propagationPolicy: Foreground, but when you delete it using kubectl delete, it behaves the way I described above.

I guess the only reliable way for now (until the kubectl issue is fixed) is to mark all the objects with Smith's finalizer and perform the cascading deletion manually (in the Smith code). After the kubectl issue is resolved, we can delete this code.
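
A rough sketch of that manual fallback, assuming dependents can be found via the smith.atlassian.com/BundleName label and the caller supplies the resource types to scan; walking the dependency graph in the right order is omitted:

package example

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
)

// deleteBundleDependents deletes every object labelled with the Bundle's name,
// using foreground propagation for each. A real implementation would delete in
// reverse dependency order and tolerate NotFound errors.
func deleteBundleDependents(ctx context.Context, client dynamic.Interface, gvrs []schema.GroupVersionResource, namespace, bundleName string) error {
	foreground := metav1.DeletePropagationForeground
	opts := metav1.ListOptions{LabelSelector: "smith.atlassian.com/BundleName=" + bundleName}
	for _, gvr := range gvrs {
		list, err := client.Resource(gvr).Namespace(namespace).List(ctx, opts)
		if err != nil {
			return err
		}
		for i := range list.Items {
			err = client.Resource(gvr).Namespace(namespace).Delete(ctx, list.Items[i].GetName(), metav1.DeleteOptions{PropagationPolicy: &foreground})
			if err != nil {
				return err
			}
		}
	}
	return nil
}

Since the resources already carry ownerReferences between themselves, foreground propagation on each child should let the GC handle the grandchildren.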

nilebox commented on June 3, 2024

As I predicted above, I have reproduced the issue we saw with instances and bindings.
I took the example above and marked the middle object with the custom finalizer (instead of the leaf, as before), then deleted root again. Result: root is gone, middle is marked with deletionTimestamp, and leaf is NOT marked with deletionTimestamp.

It's the same with ServiceInstance and ServiceBinding: we see instances marked with deletionTimestamp and bindings NOT marked with deletionTimestamp. Since the instance controller waits until all bindings are deleted, both instances and bindings get stuck.

For now, we should either stop using kubectl to delete Bundles and write scripts that send raw deletion requests, or implement manual deletion of children on the Smith side.

We could send a PR to fix the issue in kubectl as well :)

ash2k commented on June 3, 2024

Thanks for the investigation, btw.

amckague commented on June 3, 2024

So, about that kubectl change: make a two-line change here: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubectl/cmd/delete.go#L309

index 80a9bd4992..94859466cb 100644
--- a/pkg/kubectl/cmd/delete.go
+++ b/pkg/kubectl/cmd/delete.go
@@ -312,8 +312,8 @@ func DeleteResult(r *resource.Result, f cmdutil.Factory, out io.Writer, ignoreNo
 }
 
 func cascadingDeleteResource(info *resource.Info, f cmdutil.Factory, out io.Writer, shortOutput bool, mapper meta.RESTMapper) error {
-       falseVar := false
-       deleteOptions := &metav1.DeleteOptions{OrphanDependents: &falseVar}
+       propagationPolicy := metav1.DeletePropagationForeground
+       deleteOptions := &metav1.DeleteOptions{PropagationPolicy: &propagationPolicy}
        return deleteResource(info, f, out, shortOutput, mapper, deleteOptions)
 }

Manual build with

WHAT="--use_go_build cmd/kubectl" make

and run with

./_output/bin/kubectl -v10 delete secret root

(raising the verbosity makes kubectl print out the equivalent curl commands and response bodies)

Against the example above with middle being stuck, it just works: leaf is deleted, middle is marked for deletion and goes away when you remove the finalizer.

nilebox commented on June 3, 2024

@amckague thanks for the code snippet.
There are actually more places in kubectl that set OrphanDependents to false: the "reapers" that implement deletion for specific resource types (Deployments, ReplicaSets, etc.).

I have analyzed the implementation of OrphanDependents: false vs PropagationPolicy: Foreground and found this:
https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/garbagecollector/garbagecollector.go#L530

// attemptToOrphanWorker dequeues a node from the attemptToOrphan, then finds its
// dependents based on the graph maintained by the GC, then removes it from the
// OwnerReferences of its dependents, and finally updates the owner to remove
// the "Orphan" finalizer. The node is added back into the attemptToOrphan if any of
// these steps fail.

That explains why we see such confusing behavior with --cascade=true.
Agreed, the change mentioned above should fix the problem.

ash2k commented on June 3, 2024

@nilebox can we close this?
