Comments (22)
Hi,
I've been dealing with these false positives on GKE. After some investigation, I realized that GKE exposes neither the Kubernetes Scheduler nor the Controller Manager to end users.
Since we have no visibility into these services, there is no need to deploy the Scheduler scraper, the Controller Manager scraper, or their respective alerts.
The easiest way to deal with these false-positive alerts is to disable the scraping and alerts for the GKE-managed services in the values file of the Helm chart.
kubeControllerManager:
  enabled: false
kubeScheduler:
  enabled: false
This is probably the case for other cloud providers, although I'm not sure about it.
Cheers,
from kube-prometheus.
I had a similar issue, but I used kubeadm to install the cluster. I fixed the alerts by editing the selectors of those services.
If your Kubernetes core components run as pods in the kube-system namespace, make sure the label selector of each service matches the labels of the pods.
kubectl get svc kube-prom-exporter-kube-scheduler kube-prom-exporter-kube-controller-manager -n kube-system -o wide
NAME                                         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)     AGE   SELECTOR
kube-prom-exporter-kube-scheduler            ClusterIP   None         <none>        10251/TCP   3h    component=kube-scheduler
kube-prom-exporter-kube-controller-manager   ClusterIP   None         <none>        10252/TCP   3h    component=kube-controller-manager
kubectl get po -l component -n kube-system --show-labels
NAME                                                 READY   STATUS    RESTARTS   AGE   LABELS
(...)
kube-apiserver-ip-10-0-41-71.ec2.internal            1/1     Running   0          3h    component=kube-apiserver,tier=control-plane
kube-controller-manager-ip-10-0-41-71.ec2.internal   1/1     Running   0          3h    component=kube-controller-manager,tier=control-plane
kube-scheduler-ip-10-0-41-71.ec2.internal            1/1     Running   0          3h    component=kube-scheduler,tier=control-plane
If any of those components was started bound to 127.0.0.1, you need to change that. Please take a look at "kubeadm on prometheus" for more information.
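As a rough sketch of what changing the bind address can look like, here is a kubeadm configuration fragment. It assumes the kubeadm.k8s.io/v1beta3 config API and a Kubernetes release where both components take the bind-address flag for their secure ports; neither assumption comes from this thread. Older releases that serve metrics on the insecure ports 10251/10252, as in the output above, used the address flag instead, so check which one your version expects.

```yaml
# Hypothetical kubeadm ClusterConfiguration fragment (assumes kubeadm.k8s.io/v1beta3).
# It asks the controller manager and scheduler to listen on all interfaces
# instead of 127.0.0.1, so that Prometheus can reach their metrics endpoints.
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
controllerManager:
  extraArgs:
    bind-address: "0.0.0.0"
scheduler:
  extraArgs:
    bind-address: "0.0.0.0"
```

Keep in mind that binding to 0.0.0.0 exposes these endpoints on every interface of the control-plane node, so it may be worth restricting access at the network level.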
from kube-prometheus.
I am also facing the same issue, but in my case I used Azure acs-engine to launch the cluster.
I keep getting the Scheduler and Controller alerts.
I can see the pods are running, but there is no corresponding service available.
from kube-prometheus.
I noticed that the label selector the service was using did not return any pods. After adding the label k8s-app=kube-controller-manager to the controller manager and k8s-app=kube-scheduler to the scheduler, the alerts cleared up because the service could now find the pods.
from kube-prometheus.
If it helps, it looks like some services have no endpoints:
kubectl get endpoints
kube-system kube-controller-manager <none> 19h
kube-system kube-prometheus-exporter-kube-scheduler <none> 24m
from kube-prometheus.
@sandromello The problem is that I don't have the kube-scheduler or controller-manager Pods. I think this is the reason why it doesn't work.
from kube-prometheus.
This is a known issue with GKE: prometheus-operator/prometheus-operator#355, prometheus-operator/prometheus-operator#845. I ended up just deleting the two alerts.
from kube-prometheus.
This also seems to be the case for https://github.com/rancher/rke deployments (at least it is happening on my dev cluster)
from kube-prometheus.
@domcar One way to avoid this issue is to add a flag that controls whether certain kube-prometheus dependencies are deployed. Look at the alertmanager example to see how the installation of a dependency can be skipped.
PRs are always welcome :)
from kube-prometheus.
I don't have any endpoints for the kube controller manager and scheduler, so how can I monitor them using Prometheus and the Prometheus Operator?
Alerts are being triggered from Alertmanager.
from kube-prometheus.
@ScottBrenner what's the best way to delete an alert using helm? Is it possible to cherry-pick out the alerts, or would I need to recreate them all (minus the non-working alerts for GKE)?
from kube-prometheus.
@bonovoxly I was using kube-prometheus and never touched Helm.
from kube-prometheus.
@domcar @ScottBrenner I ran into the same issue, but in my case I installed the cluster from binary packages. Can you give me some advice on how to fix it?
from kube-prometheus.
I ran into this issue with a cluster deployed through kops in AWS. The solution that worked for me was sitting in an old version of the repo: I had to deploy the services listed here to kube-system. With that done, the alerts went green.
Edit: N.B. I think you can also generate the requisite files by adding (import 'kube-prometheus/kube-prometheus-kops.libsonnet') to your Jsonnet config:
local kp =
  (import 'kube-prometheus/kube-prometheus.libsonnet') +
  (import 'kube-prometheus/kube-prometheus-kops.libsonnet') +
  {
    _config+:: {
      namespace: 'monitoring',
      /* ...etc. */
from kube-prometheus.
Same issue with aws eks
from kube-prometheus.
I am facing the same issue. I don't have the kube-scheduler or controller-manager Pods. @domcar how did you fix the issue?
P.S. I used Helm for the installation. Cloud: AWS.
from kube-prometheus.
@chris530 I had to do something very similar to the service selectors: basically null out the component label and add the k8s-app label to the selector for those two services.
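For illustration only, a Service with that kind of selector might look roughly like the sketch below. The Service name and port 10251 are taken from the kubectl output earlier in this thread; the port name http-metrics is an assumption about what the ServiceMonitor expects, and the k8s-app label only helps if your scheduler pods actually carry it. The controller manager Service would follow the same pattern with k8s-app: kube-controller-manager and port 10252.

```yaml
# Hypothetical headless Service for the scheduler, selecting on k8s-app
# instead of component. Adjust the name, labels, and port to your cluster.
apiVersion: v1
kind: Service
metadata:
  name: kube-prom-exporter-kube-scheduler
  namespace: kube-system
  labels:
    k8s-app: kube-scheduler
spec:
  clusterIP: None            # headless, as in the output earlier in the thread
  selector:
    k8s-app: kube-scheduler  # must match the labels on the scheduler pods
  ports:
    - name: http-metrics     # assumed port name, not confirmed by this thread
      port: 10251
      targetPort: 10251
      protocol: TCP
```

After applying something like this, kubectl get endpoints -n kube-system should list pod addresses for the Service rather than <none>.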
from kube-prometheus.
@chris530 How were you able to add these labels to the controller manager and the kube scheduler? I don't even have the pods and services associated with either kube-scheduler or kube-controller-manager. My Kubernetes is installed with RKE.
from kube-prometheus.
Hello,
I am currently using prometheus-stack version 20.0.1.
The KubeSchedulerDown and KubeControllerManagerDown alerts are currently being raised for no apparent reason.
Is that also a label issue?
How did you solve it?
Thanks for your help.
Regards,
from kube-prometheus.
Hi @ferpizza,
Now I no longer receive alerts for KubeScheduler and KubeControllerManager.
Thanks.
However, a new KubeProxyDown alert now appears.
Can you please point me to what GKE exposes?
I might have to disable it as well.
Cheers
from kube-prometheus.
Hello @woody3549,
I haven't found official documentation that separates the k8s components exposed to end users from the ones kept private for Google's management. You can make an assumption based on whether a given component is key to ensuring GKE services.
kube-proxy is one of those components, being a critical piece of your cluster's networking.
When I wrote my first comment I was on version 18.1.1 of the Kube Prometheus Stack Helm chart, and that version did not include the kube-proxy alerts or scraper.
Since then I have updated to version 27.1.0, which includes the kube-proxy alert, and I was confronted with the same false positives.
We can solve this, along with the two earlier alerts, by adding the following lines to our values file.
kubeControllerManager:
  enabled: false
kubeScheduler:
  enabled: false
kubeProxy:
  enabled: false
from kube-prometheus.
Hello,
Ok thanks. This makes sense and is very helpful.
Regards
from kube-prometheus.