
Comments (11)

k8s-ci-robot commented on June 8, 2024

This issue is currently awaiting triage.

If Ingress contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.


longwuyuan commented on June 8, 2024

/remove-kind bug
/kind support
/triage needs-information

1- DNS should hit the service and not the pod IPs, so when they're replaced DNS doesn't just time out. (I haven't figured this one out yet.)

  • What was the IP address of the coredns pods before and after they crashed / were replaced?
  • What do you mean by DNS here? Do you mean DNS resolution queries for resolving the backend-app pod IP address?
  • What do you mean by service? Is it the backend-app service configured in the ingress resource?
  • What do you mean by pod IPs here? Do you mean the pod IPs of the backend app?


alphabet5 commented on June 8, 2024

The pod doesn't have to crash; any dropped DNS query to resolve the memcached host will cause the configured timeout not to be honored. (Assuming the memcached timeout is < the DNS retry interval.)

This is an example:

Coredns pod IP:

~ % kubectl get pod -n kube-system -o wide | grep coredns
rke2-coredns-rke2-coredns-76d7556c69-lhr4h             1/1     Running   0                10m     10.64.2.210

Delete the pod, or rollout restart the deployment, and the new pod will get a new IP (see the endpoints-watch sketch after this output).

~ % kubectl delete pod -n kube-system rke2-coredns-rke2-coredns-76d7556c69-lhr4h
pod "rke2-coredns-rke2-coredns-76d7556c69-lhr4h" deleted
~ % kubectl get pod -n kube-system -o wide | grep coredns
rke2-coredns-rke2-coredns-76d7556c69-q7prt             1/1     Running   0                53s     10.64.2.51
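
If it helps, the service's endpoints can be watched following the replacement in another terminal (a minimal sketch, using the coredns service name shown further below):

# watch the coredns endpoints update as the old pod is removed and the new one becomes ready
kubectl get endpoints -n kube-system rke2-coredns-rke2-coredns -w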

By DNS - I mean the DNS query that ingress-nginx uses to resolve the global-rate-limit-memcached-host to an IP address.

By service I mean the coredns service, in this case 10.65.0.10. (10.65/16 is this example cluster's service range; 10.64/16 is the pod CIDR range, with a /24 given to each node for pod assignment.)

~ % kubectl get svc -n kube-system rke2-coredns-rke2-coredns
NAME                        TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE
rke2-coredns-rke2-coredns   ClusterIP   10.65.0.10   <none>        53/UDP,53/TCP   61d

Now connections will be kept alive to memcached, but if the memcached deployment also gets an update or pods get rescheduled, ingress-nginx will try to reconnect and query coredns for the IP.
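
That reconnect can be forced with a rolling restart, for example (a sketch; it assumes the memcached Deployment is named memcached in the ingress-nginx namespace, matching the hostname above):

# assumption: Deployment "memcached" in namespace "ingress-nginx"
kubectl rollout restart deployment -n ingress-nginx memcached
kubectl rollout status deployment -n ingress-nginx memcached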

When this happens (only with queries to ingress objects using global-rate-limit), this shows up in the hubble/cilium logs.

ingress-nginx/ingress-nginx-controller-74rb6:53239 (ID:28807) <> 10.64.2.210:53 (world) Stale or unroutable IP DROPPED (UDP)

(Notice how this is to a coredns pod IP and not to the coredns service address.)
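
For anyone reproducing this, a sketch of watching for those drops with the hubble CLI (assuming cilium is the CNI, as in this cluster):

# show dropped UDP/53 flows originating from the controller pod
hubble observe --verdict DROPPED --protocol UDP --port 53 \
  --from-pod ingress-nginx/ingress-nginx-controller-74rb6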

Then after the dns query to resolve the global-rate-limit-memcached-host times out, the request to the backend pod happens as normal - so every request is delayed by an extra 10s.

I haven't figured out why this happens, even though querying for the memcached hostname from within the nginx pod resolves correctly:

~ % kubectl exec --stdin --tty -n ingress-nginx ingress-nginx-controller-74rb6 -- bash
ingress-nginx-controller-74rb6:/etc/nginx$ nslookup memcached.ingress-nginx.svc.cluster.local.
Server:		10.65.0.10
Address:	10.65.0.10:53


Name:	memcached.ingress-nginx.svc.cluster.local
Address: 10.65.7.46

This suggests to me that it's something within ingress-nginx/openresty/lua that is doing something weird with dns, but I haven't figured that part out yet. (It could also be some weird issue with cilium/ebpf.)

None of these specifics really matter, because regardless of the underlying issue, it shouldn't cause a 10s delay to requests when the timeout is configured to be 50ms.


longwuyuan commented on June 8, 2024

Thanks for explaining.

But none of your data is actual live status from the output of commands, so I personally don't see much relevance pointing to a problem in the controller, such as the suggestion that you configured a 50ms timeout for something, somewhere in the controller, and the controller did not honor it.

  • There is DNS cache involved
  • There is ARP involved
  • I don't know where you configured a 50ms timeout for the ingress-nginx-controller pod to resolve a memcached pod's hostname to an IP address
  • And I suspect the controller you have running is not even an image built/released by this project

But these are my opinions. Wait for comments from experts who agree that a user breaking coredns showcases a bug in the ingress-controller image you are running, which may not be an image released by this project.


longwuyuan commented on June 8, 2024

Also, if the coredns pods crashed and were recreated, is there a hint here that the IP address of the memcached pod changed?

I hope you get to the bottom of this issue and prevent future occurrences of this break in rate-limiting.


alphabet5 commented on June 8, 2024

This is using the ingress-nginx helm chart. You can see the output of nginx-ingress-controller --version in the description.

NGINX Ingress controller
Release: v1.10.1
Build: https://github.com/kubernetes/ingress-nginx/commit/4fb5aac1dd3669daa3a14d9de3e3cdb371b4c518
Repository: https://github.com/kubernetes/ingress-nginx
nginx version: nginx/1.25.3

You can see the helm chart values I used to reproduce this issue in the description. The only thing configured is the memcached host.

controller:
  config:
    global-rate-limit-memcached-host: memcached.ingress-nginx.svc.cluster.local.

The 50ms timeout is the default, as shown here:
https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/configmap/#global-rate-limit

global-rate-limit-memcached-connect-timeout: configure timeout for connect, send and receive operations. Unit is millisecond. Defaults to 50ms.
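
For reference, a minimal sketch of overriding that default, assuming the chart's default ConfigMap name ingress-nginx-controller (the value is in milliseconds):

# assumption: the helm chart's default controller ConfigMap name
kubectl -n ingress-nginx patch configmap ingress-nginx-controller \
  --type merge -p '{"data":{"global-rate-limit-memcached-connect-timeout":"100"}}'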

DNS cache is implementation specific. If you have more information on the lua resolver's DNS cache implementation, that would be helpful. As indicated by my nslookup command from within the nginx controller pod, the pod can resolve it correctly and uses the coredns service address to do so, but based on the traffic captured, the lua code for the global-rate-limit does not.
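
One thing worth checking is which nameserver the pod was started with; as far as I know, the lua resolver reads its nameservers from the pod's /etc/resolv.conf at startup (a sketch, reusing the controller pod name from above):

# expect the coredns service IP (10.65.0.10 here), not a pod IP
kubectl exec -n ingress-nginx ingress-nginx-controller-74rb6 -- cat /etc/resolv.conf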

This does not require blocking traffic to coredns, or coredns crashing. Coredns was updated, and updates work by creating a new pod, waiting for it to become ready, updating the service/endpoint slices, and then removing the old pods.


longwuyuan commented on June 8, 2024

Thanks for your comments.

If I were to summarize the problem, it would look like below.

You are reporting that after a crash/recreate of the coredns pods in a cluster, it takes 10s for the ingress-nginx controller to resolve the memcached workload service to a memcached pod IP address, and this 10s is a blind spot for the rate-limit annotation because it can NOT get the counters in those 10 seconds.

You are expecting that it should take only 50ms, after the crash/recreate of coredns pods in a cluster, for ingress-nginx to resolve & query the memcached workload and get the rate-limit counters for the rate-limit annotation.

Is this correct?


longwuyuan commented on June 8, 2024

I deleted the coredns pod in one terminal and did an nslookup for a service resource in a bash while loop with a 1-second sleep. I don't see even 1 nslookup failure.
[screenshot: nslookup while-loop output showing no failures]
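
For reference, a loop along those lines (a sketch; any resolvable service name works):

# run from a pod with nslookup available; prints a timestamped line on each failure
while true; do
  nslookup kubernetes.default.svc.cluster.local >/dev/null || echo "FAIL $(date)"
  sleep 1
done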


alphabet5 commented on June 8, 2024

2 issues - issue 1 causes issue 2.

DNS resolution from within the global-rate-limit code path does not use the service IP, and does not reach coredns if the pod IPs change. This does not apply to nslookup from within the ingress-nginx pod, or to resolving the backend pods for an ingress. (I believe this uses the k8s API instead of DNS, but regardless.)

Any DNS lookup failure (i.e. a timeout, not an immediate NXDOMAIN) will cause traffic through ingress-nginx to be delayed by 10s, even when the timeout configured for the global-rate-limit is set to 50ms.

Here's an easier way you can reproduce issue 2. This is showing both the input (commands I run) and the output. I'll put just the commands down below.

user@computer ~ % kind create cluster
Creating cluster "kind" ...
 ✓ Ensuring node image (kindest/node:v1.29.2) đŸ–ŧ
 ✓ Preparing nodes đŸ“Ļ
 ✓ Writing configuration 📜
 ✓ Starting control-plane 🕹ī¸
 ✓ Installing CNI 🔌
 ✓ Installing StorageClass 💾
Set kubectl context to "kind-kind"
You can now use your cluster with:

kubectl cluster-info --context kind-kind

Not sure what to do next? 😅  Check out https://kind.sigs.k8s.io/docs/user/quick-start/
user@computer ~ % helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx --create-namespace \
  --set controller.config.global-rate-limit-memcached-host=asdf.blackhole.alphabet5.dev
Release "ingress-nginx" does not exist. Installing it now.
NAME: ingress-nginx
LAST DEPLOYED: Fri May  3 12:21:37 2024
NAMESPACE: ingress-nginx
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
The ingress-nginx controller has been installed.
It may take a few minutes for the load balancer IP to be available.
You can watch the status by running 'kubectl get service --namespace ingress-nginx ingress-nginx-controller --output wide --watch'

An example Ingress that makes use of the controller:
  apiVersion: networking.k8s.io/v1
  kind: Ingress
  metadata:
    name: example
    namespace: foo
  spec:
    ingressClassName: nginx
    rules:
      - host: www.example.com
        http:
          paths:
            - pathType: Prefix
              backend:
                service:
                  name: exampleService
                  port:
                    number: 80
              path: /
    # This section is only required if TLS is to be enabled for the Ingress
    tls:
      - hosts:
        - www.example.com
        secretName: example-tls

If TLS is enabled for the Ingress, a Secret containing the certificate and key must also be provided:

  apiVersion: v1
  kind: Secret
  metadata:
    name: example-tls
    namespace: foo
  data:
    tls.crt: <base64 encoded cert>
    tls.key: <base64 encoded key>
  type: kubernetes.io/tls
user@computer ~ % cat <<EOF > manifests.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dummy
spec:
  selector:
    matchLabels:
      app: dummy
  replicas: 1
  template:
    metadata:
      labels:
        app: dummy
    spec:
      containers:
      - name: podinfo
        image: stefanprodan/podinfo
        ports:
        - containerPort: 9898
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dummy
  annotations:
    nginx.ingress.kubernetes.io/global-rate-limit: "1000000"
    nginx.ingress.kubernetes.io/global-rate-limit-window: "1m"
spec:
  ingressClassName: nginx
  rules:
  - host: "localhost"
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: dummy
            port:
              number: 9898
---
apiVersion: v1
kind: Service
metadata:
  name: dummy
spec:
  selector:
    app: dummy
  ports:
    - protocol: TCP
      port: 9898
      targetPort: 9898
EOF
user@computer ~ % kubectl apply -f manifests.yaml
deployment.apps/dummy created
ingress.networking.k8s.io/dummy created
service/dummy created
user@computer ~ % sudo kubectl port-forward -n ingress-nginx svc/ingress-nginx-controller 80:80
user@computer ~ % Forwarding from 127.0.0.1:80 -> 80
Forwarding from [::1]:80 -> 80

user@computer ~ % time curl http://localhost
Handling connection for 80
user@computer ~ % time curl http://localhost
Handling connection for 80
{
  "hostname": "dummy-749c8b54cd-js75j",
  "version": "6.6.2",
  "revision": "8d010c498e79f499d1b37480507ca1ffb81a3bf7",
  "color": "#34577c",
  "logo": "https://raw.githubusercontent.com/stefanprodan/podinfo/gh-pages/cuddle_clap.gif",
  "message": "greetings from podinfo v6.6.2",
  "goos": "linux",
  "goarch": "arm64",
  "runtime": "go1.22.2",
  "num_goroutine": "6",
  "num_cpu": "4"
}curl http://localhost  0.00s user 0.00s system 0% cpu 6.028 total
kind create cluster
helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx --create-namespace \
  --set controller.config.global-rate-limit-memcached-host=asdf.blackhole.alphabet5.dev
cat <<EOF > manifests.yaml
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dummy
spec:
  selector:
    matchLabels:
      app: dummy
  replicas: 1
  template:
    metadata:
      labels:
        app: dummy
    spec:
      containers:
      - name: podinfo
        image: stefanprodan/podinfo
        ports:
        - containerPort: 9898
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dummy
  annotations:
    nginx.ingress.kubernetes.io/global-rate-limit: "1000000"
    nginx.ingress.kubernetes.io/global-rate-limit-window: "1m"
spec:
  ingressClassName: nginx
  rules:
  - host: "localhost"
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: dummy
            port:
              number: 9898
---
apiVersion: v1
kind: Service
metadata:
  name: dummy
spec:
  selector:
    app: dummy
  ports:
    - protocol: TCP
      port: 9898
      targetPort: 9898
EOF


sudo kubectl port-forward -n ingress-nginx svc/ingress-nginx-controller 80:80
time curl http://localhost

If you modify the memcached host to something that doesn't exist, the timeout works as expected (i.e. if the failure is not DNS related, and is instead the socket connect).
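
For example, pointing the host at an IP literal skips DNS entirely, so only the socket connect can fail (a sketch; 192.0.2.1 is a TEST-NET address assumed to be unroutable here):

# assumption: 192.0.2.1 never answers, so the failure is the TCP connect, not DNS
helm upgrade --install ingress-nginx ingress-nginx \
  --repo https://kubernetes.github.io/ingress-nginx \
  --namespace ingress-nginx \
  --set controller.config.global-rate-limit-memcached-host=192.0.2.1
time curl http://localhost    # expect roughly the 50ms connect timeout, not ~10s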


alphabet5 commented on June 8, 2024

For this last reproduction, the time is not exactly 10s because the SERVFAIL time will vary based on your upstream DNS servers. If lua can't connect to your DNS server, it will always be 10s (because it's configured for 5 retries with a 2s timeout: 5 × 2s = 10s).

time nslookup asdf.blackhole.alphabet5.dev 1.1.1.1
Server:		1.1.1.1
Address:	1.1.1.1#53

** server can't find asdf.blackhole.alphabet5.dev: SERVFAIL


real	0m4.079s
user	0m0.005s
sys	0m0.005s
time nslookup asdf.blackhole.alphabet5.dev 8.8.8.8
;; communications error to 8.8.8.8#53: timed out
;; communications error to 8.8.8.8#53: timed out
;; communications error to 8.8.8.8#53: timed out
;; no servers could be reached



real	0m15.131s
user	0m0.004s
sys	0m0.008s


longwuyuan commented on June 8, 2024

Hi @alphabet5, thank you very much for the updated info you provided. Appreciate the time taken to give details.

These 2 messages from you, highlighted below, seem to be the key to the issue you are reporting.

DNS resolution from within the global-rate-limit code path does not use the service IP, and does not reach coredns if the pod IPs change.

and this message

This does not apply to nslookup from within the ingress-nginx pod, or to resolving the backend pods for an ingress. (I believe this uses the k8s api instead of dns, but regardless.)

I think it will help if you copy/paste the links to the code-snippets you are referring to.

Meanwhile I will try to look it up and see if I can understand the details of the 2 messages highlighted above.

