openshift / master-dns-operator
Operator to manage DNS entries for master machines in an OpenShift cluster.
License: Apache License 2.0
This repo needs an OWNER file. It is unclear who to ping in urgent situations.
Priority classes docs:
https://docs.openshift.com/container-platform/3.11/admin_guide/scheduling/priority_preemption.html#admin-guide-priority-preemption-priority-class
Example: https://github.com/openshift/cluster-monitoring-operator/search?q=priority&unscoped_q=priority
Notes: The pre-configured system priority classes (system-node-critical and system-cluster-critical) can only be assigned to pods in kube-system or openshift-* namespaces. Most likely, core operators and their pods should be assigned system-cluster-critical. Please do not assign system-node-critical (the highest priority) unless you are really sure about it.
In my OKD cluster, the Corefile on node master-1 is faulty: instead of a cluster-external DNS resolver, it has 127.0.0.53 in the forward directive.
I am running OKD 4.9.0 IPI on vSphere 6.7:
[root@localhost ocp-install]# oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.9.0-0.okd-2021-11-28-035710 True False 45d Cluster version is 4.9.0-0.okd-2021-11-28-035710
One of my customers has the very same symptom (wrong Corefile on master-1) in their cluster and experiences very high CPU load (~2.3 cores) for this exact pod with frequent "i/o timeout" messages in coredns container logs.
After I manually corrected the Corefile by replacing 127.0.0.53 with an actual DNS resolver IP (in my case 10.1.0.1), these messages disappeared and the CPU load dropped back to 0.002 cores.
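A quick way to spot this symptom on other nodes might be to scan a Corefile for forward directives that point at a loopback address. The sketch below assumes the Corefile text has already been read into a string and that upstreams are written as bare IP addresses, as in this report; the function name loopback_forwarders is hypothetical and not part of runtimecfg or CoreDNS.

```python
import ipaddress


def loopback_forwarders(corefile_text: str) -> list[str]:
    """Return forward upstreams in the Corefile that are loopback addresses."""
    found = []
    for line in corefile_text.splitlines():
        tokens = line.split()
        # A forward directive has the shape: forward FROM TO... [{]
        if len(tokens) < 3 or tokens[0] != "forward":
            continue
        for token in tokens[2:]:
            if token == "{":
                break
            try:
                address = ipaddress.ip_address(token)
            except ValueError:
                # Not a bare IP (for example host:port); skipped in this sketch.
                continue
            if address.is_loopback:
                found.append(token)
    return found


if __name__ == "__main__":
    sample = ". {\n    forward . 127.0.0.53 {\n        policy sequential\n    }\n}\n"
    print(loopback_forwarders(sample))
```

An empty result means no loopback upstream was found; any entry in the list is a candidate for the misconfiguration described here.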
Related to okd-project/okd/issues/978.
The logs of the coredns-monitor container on master-1 show that its runtimecfg utility renders a faulty Corefile with 127.0.0.53 in the forward rule.
$ oc logs coredns-lab4-h9zq6-master-1 coredns-monitor
time="2022-01-12T12:59:19Z" level=info msg="Runtimecfg rendering template" path=/etc/coredns/Corefile
time="2022-01-12T13:08:20Z" level=info msg="Node change detected, rendering Corefile" Node Addresses="[{10.1.2.189 lab4-h9zq6-master-0 false} {10.1.2.190 lab4-h9zq6-master-1 false} {10.1.2.188 lab4-h9zq6-master-2 false} {10.1.2.205 lab4-h9zq6-worker-dlr5x false} {10.1.2.203 lab4-h9zq6-worker-k8dfd false} {10.1.2.207 lab4-h9zq6-worker-m5lqk false} {10.1.2.209 lab4-h9zq6-worker-v95g9 false}]"
time="2022-01-12T13:08:20Z" level=info msg=". {"
time="2022-01-12T13:08:20Z" level=info msg=" errors"
time="2022-01-12T13:08:20Z" level=info msg=" bufsize 512"
time="2022-01-12T13:08:20Z" level=info msg=" health :18080"
time="2022-01-12T13:08:20Z" level=info msg=" forward . 127.0.0.53 {"
time="2022-01-12T13:08:20Z" level=info msg=" policy sequential"
time="2022-01-12T13:08:20Z" level=info msg=" }"
time="2022-01-12T13:08:20Z" level=info msg=" cache 30"
time="2022-01-12T13:08:20Z" level=info msg=" reload"
time="2022-01-12T13:08:20Z" level=info msg=" template IN A lab4.company.corp {"
time="2022-01-12T13:08:20Z" level=info msg=" match .*.apps.lab4.company.corp"
time="2022-01-12T13:08:20Z" level=info msg=" answer \"{{ .Name }} 60 in {{ .Type }} 10.1.4.2\""
time="2022-01-12T13:08:20Z" level=info msg=" fallthrough"
time="2022-01-12T13:08:20Z" level=info msg=" }"
time="2022-01-12T13:08:20Z" level=info msg=" template IN AAAA lab4.company.corp {"
time="2022-01-12T13:08:20Z" level=info msg=" match .*.apps.lab4.company.corp"
time="2022-01-12T13:08:20Z" level=info msg=" fallthrough"
time="2022-01-12T13:08:20Z" level=info msg=" }"
time="2022-01-12T13:08:20Z" level=info msg=" template IN A lab4.company.corp {"
time="2022-01-12T13:08:20Z" level=info msg=" match api.lab4.company.corp"
time="2022-01-12T13:08:20Z" level=info msg=" answer \"{{ .Name }} 60 in {{ .Type }} 10.1.4.1\""
time="2022-01-12T13:08:20Z" level=info msg=" fallthrough"
time="2022-01-12T13:08:20Z" level=info msg=" }"
time="2022-01-12T13:08:20Z" level=info msg=" template IN AAAA lab4.company.corp {"
time="2022-01-12T13:08:20Z" level=info msg=" match api.lab4.company.corp"
time="2022-01-12T13:08:20Z" level=info msg=" fallthrough"
time="2022-01-12T13:08:20Z" level=info msg=" }"
time="2022-01-12T13:08:20Z" level=info msg=" template IN A lab4.company.corp {"
time="2022-01-12T13:08:20Z" level=info msg=" match api-int.lab4.company.corp"
time="2022-01-12T13:08:20Z" level=info msg=" answer \"{{ .Name }} 60 in {{ .Type }} 10.1.4.1\""
time="2022-01-12T13:08:20Z" level=info msg=" fallthrough"
time="2022-01-12T13:08:20Z" level=info msg=" }"
time="2022-01-12T13:08:20Z" level=info msg=" template IN AAAA lab4.company.corp {"
time="2022-01-12T13:08:20Z" level=info msg=" match api-int.lab4.company.corp"
time="2022-01-12T13:08:20Z" level=info msg=" fallthrough"
time="2022-01-12T13:08:20Z" level=info msg=" }"
time="2022-01-12T13:08:20Z" level=info msg=" hosts {"
time="2022-01-12T13:08:20Z" level=info msg=" 10.1.2.189 lab4-h9zq6-master-0 lab4-h9zq6-master-0.lab4.company.corp"
time="2022-01-12T13:08:20Z" level=info msg=" 10.1.2.190 lab4-h9zq6-master-1 lab4-h9zq6-master-1.lab4.company.corp"
time="2022-01-12T13:08:20Z" level=info msg=" 10.1.2.188 lab4-h9zq6-master-2 lab4-h9zq6-master-2.lab4.company.corp"
time="2022-01-12T13:08:20Z" level=info msg=" 10.1.2.205 lab4-h9zq6-worker-dlr5x lab4-h9zq6-worker-dlr5x.lab4.company.corp"
time="2022-01-12T13:08:20Z" level=info msg=" 10.1.2.203 lab4-h9zq6-worker-k8dfd lab4-h9zq6-worker-k8dfd.lab4.company.corp"
time="2022-01-12T13:08:20Z" level=info msg=" 10.1.2.207 lab4-h9zq6-worker-m5lqk lab4-h9zq6-worker-m5lqk.lab4.company.corp"
time="2022-01-12T13:08:20Z" level=info msg=" 10.1.2.209 lab4-h9zq6-worker-v95g9 lab4-h9zq6-worker-v95g9.lab4.company.corp"
time="2022-01-12T13:08:20Z" level=info msg=" fallthrough"
time="2022-01-12T13:08:20Z" level=info msg=" }"
time="2022-01-12T13:08:20Z" level=info msg="}"
When I run the coredns-monitor container's command manually inside the running container, it renders a correct configuration:
$ oc exec -it coredns-lab4-h9zq6-master-1 -c coredns-monitor -- bash
[root@lab4-h9zq6-master-1 /]# runtimecfg render --verbose /var/lib/kubelet/kubeconfig --api-vip 10.1.4.1 --ingress-vip 10.1.4.2 /config --out-dir /tmp/test/
INFO[0000] . {
INFO[0000] errors
INFO[0000] bufsize 512
INFO[0000] health :18080
INFO[0000] forward . 10.1.0.1 {
INFO[0000] policy sequential
INFO[0000] }
INFO[0000] cache 30
INFO[0000] reload
INFO[0000] template IN A lab4.company.corp {
INFO[0000] match .*.apps.lab4.company.corp
INFO[0000] answer "{{ .Name }} 60 in {{ .Type }} 10.1.4.2"
INFO[0000] fallthrough
INFO[0000] }
INFO[0000] template IN AAAA lab4.company.corp {
INFO[0000] match .*.apps.lab4.company.corp
INFO[0000] fallthrough
INFO[0000] }
INFO[0000] template IN A lab4.company.corp {
INFO[0000] match api.lab4.company.corp
INFO[0000] answer "{{ .Name }} 60 in {{ .Type }} 10.1.4.1"
INFO[0000] fallthrough
INFO[0000] }
INFO[0000] template IN AAAA lab4.company.corp {
INFO[0000] match api.lab4.company.corp
INFO[0000] fallthrough
INFO[0000] }
INFO[0000] template IN A lab4.company.corp {
INFO[0000] match api-int.lab4.company.corp
INFO[0000] answer "{{ .Name }} 60 in {{ .Type }} 10.1.4.1"
INFO[0000] fallthrough
INFO[0000] }
INFO[0000] template IN AAAA lab4.company.corp {
INFO[0000] match api-int.lab4.company.corp
INFO[0000] fallthrough
INFO[0000] }
INFO[0000] hosts {
INFO[0000] fallthrough
INFO[0000] }
INFO[0000] }
INFO[0000]
INFO[0000] Runtimecfg rendering template path=/tmp/test/Corefile
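To compare several pods without reading every log by eye, the forward upstreams can be pulled out of the log text. This is a rough sketch that assumes only the two line shapes seen in this report (time=... msg="..." and INFO[0000] ...); the function name forward_upstreams is made up for illustration.

```python
import re

# Payload of a key=value style line: msg="...", allowing escaped quotes.
MSG_RE = re.compile(r'msg="((?:[^"\\]|\\.)*)"')
# Payload of a text style line: INFO[0000] ...
TEXT_RE = re.compile(r'^[A-Z]+\[\d+\]\s?(.*)$')


def forward_upstreams(log_text: str) -> list[str]:
    """Collect the upstreams of every logged Corefile forward directive."""
    upstreams = []
    for line in log_text.splitlines():
        match = MSG_RE.search(line) or TEXT_RE.match(line)
        if not match:
            continue
        tokens = match.group(1).split()
        # A forward directive has the shape: forward FROM TO... [{]
        if len(tokens) >= 3 and tokens[0] == "forward":
            upstreams.extend(t for t in tokens[2:] if t != "{")
    return upstreams


if __name__ == "__main__":
    pod_log = 'time="2022-01-12T13:08:20Z" level=info msg=" forward . 127.0.0.53 {"'
    manual_log = "INFO[0000] forward . 10.1.0.1 {"
    print(forward_upstreams(pod_log), forward_upstreams(manual_log))
```

Feeding it the output of oc logs for each coredns-monitor container makes a mismatch such as the one on master-1 easy to spot.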
For comparison, these are the coredns-monitor logs on the other masters, where the configuration looks good:
$ oc logs coredns-lab4-h9zq6-master-0 coredns-monitor
time="2022-01-12T12:59:43Z" level=info msg="Runtimecfg rendering template" path=/etc/coredns/Corefile
time="2022-01-12T13:08:43Z" level=info msg="Node change detected, rendering Corefile" Node Addresses="[{10.1.2.189 lab4-h9zq6-master-0 false} {10.1.2.190 lab4-h9zq6-master-1 false} {10.1.2.188 lab4-h9zq6-master-2 false} {10.1.2.205 lab4-h9zq6-worker-dlr5x false} {10.1.2.203 lab4-h9zq6-worker-k8dfd false} {10.1.2.207 lab4-h9zq6-worker-m5lqk false} {10.1.2.209 lab4-h9zq6-worker-v95g9 false}]"
time="2022-01-12T13:08:43Z" level=info msg=". {"
time="2022-01-12T13:08:43Z" level=info msg=" errors"
time="2022-01-12T13:08:43Z" level=info msg=" bufsize 512"
time="2022-01-12T13:08:43Z" level=info msg=" health :18080"
time="2022-01-12T13:08:43Z" level=info msg=" forward . 10.1.0.1 {"
time="2022-01-12T13:08:43Z" level=info msg=" policy sequential"
time="2022-01-12T13:08:43Z" level=info msg=" }"
time="2022-01-12T13:08:43Z" level=info msg=" cache 30"
time="2022-01-12T13:08:43Z" level=info msg=" reload"
time="2022-01-12T13:08:43Z" level=info msg=" template IN A company.corp {"
time="2022-01-12T13:08:43Z" level=info msg=" match .*.apps.company.corp"
time="2022-01-12T13:08:43Z" level=info msg=" answer \"{{ .Name }} 60 in {{ .Type }} 10.1.4.2\""
time="2022-01-12T13:08:43Z" level=info msg=" fallthrough"
time="2022-01-12T13:08:43Z" level=info msg=" }"
time="2022-01-12T13:08:43Z" level=info msg=" template IN AAAA company.corp {"
time="2022-01-12T13:08:43Z" level=info msg=" match .*.apps.company.corp"
time="2022-01-12T13:08:43Z" level=info msg=" fallthrough"
time="2022-01-12T13:08:43Z" level=info msg=" }"
time="2022-01-12T13:08:43Z" level=info msg=" template IN A company.corp {"
time="2022-01-12T13:08:43Z" level=info msg=" match api.company.corp"
time="2022-01-12T13:08:43Z" level=info msg=" answer \"{{ .Name }} 60 in {{ .Type }} 10.1.4.1\""
time="2022-01-12T13:08:43Z" level=info msg=" fallthrough"
time="2022-01-12T13:08:43Z" level=info msg=" }"
time="2022-01-12T13:08:43Z" level=info msg=" template IN AAAA company.corp {"
time="2022-01-12T13:08:43Z" level=info msg=" match api.company.corp"
time="2022-01-12T13:08:43Z" level=info msg=" fallthrough"
time="2022-01-12T13:08:43Z" level=info msg=" }"
time="2022-01-12T13:08:43Z" level=info msg=" template IN A company.corp {"
time="2022-01-12T13:08:43Z" level=info msg=" match api-int.company.corp"
time="2022-01-12T13:08:43Z" level=info msg=" answer \"{{ .Name }} 60 in {{ .Type }} 10.1.4.1\""
time="2022-01-12T13:08:43Z" level=info msg=" fallthrough"
time="2022-01-12T13:08:43Z" level=info msg=" }"
time="2022-01-12T13:08:43Z" level=info msg=" template IN AAAA company.corp {"
time="2022-01-12T13:08:43Z" level=info msg=" match api-int.company.corp"
time="2022-01-12T13:08:43Z" level=info msg=" fallthrough"
time="2022-01-12T13:08:43Z" level=info msg=" }"
time="2022-01-12T13:08:43Z" level=info msg=" hosts {"
time="2022-01-12T13:08:43Z" level=info msg=" 10.1.2.189 lab4-h9zq6-master-0 lab4-h9zq6-master-0.company.corp"
time="2022-01-12T13:08:43Z" level=info msg=" 10.1.2.190 lab4-h9zq6-master-1 lab4-h9zq6-master-1.company.corp"
time="2022-01-12T13:08:43Z" level=info msg=" 10.1.2.188 lab4-h9zq6-master-2 lab4-h9zq6-master-2.company.corp"
time="2022-01-12T13:08:43Z" level=info msg=" 10.1.2.205 lab4-h9zq6-worker-dlr5x lab4-h9zq6-worker-dlr5x.company.corp"
time="2022-01-12T13:08:43Z" level=info msg=" 10.1.2.203 lab4-h9zq6-worker-k8dfd lab4-h9zq6-worker-k8dfd.company.corp"
time="2022-01-12T13:08:43Z" level=info msg=" 10.1.2.207 lab4-h9zq6-worker-m5lqk lab4-h9zq6-worker-m5lqk.company.corp"
time="2022-01-12T13:08:43Z" level=info msg=" 10.1.2.209 lab4-h9zq6-worker-v95g9 lab4-h9zq6-worker-v95g9.company.corp"
time="2022-01-12T13:08:43Z" level=info msg=" fallthrough"
time="2022-01-12T13:08:43Z" level=info msg=" }"
time="2022-01-12T13:08:43Z" level=info msg="}"
time="2022-01-12T13:08:43Z" level=info
time="2022-01-12T13:08:43Z" level=info msg="Runtimecfg rendering template" path=/etc/coredns/Corefile