Comments (22)

ferpizza commented on May 10, 2024

Hi,

I've been dealing with these false positives on GKE. After investigating a little, I realized that GKE exposes neither the Kubernetes Scheduler nor the Controller Manager to end users.

Since these components are invisible to us, there is no need to deploy the Scheduler scraper, the Controller Manager scraper, or their respective alerts.

The easiest way to deal with these false positive alerts is to disable the scraping and alerts for GKE-managed components in the Helm chart's values file.

kubeControllerManager:
  enabled: false

kubeScheduler:
  enabled: false
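
For anyone applying this with the kube-prometheus-stack Helm chart, a minimal sketch; the `kubeControllerManager`/`kubeScheduler` sections are standard chart values, while the release and namespace names below are examples, not taken from this thread:

```shell
# Save the overrides to a file so they can be version-controlled
# and reused on future upgrades.
cat > gke-values.yaml <<'EOF'
kubeControllerManager:
  enabled: false

kubeScheduler:
  enabled: false
EOF

# Then apply them to an existing release, e.g. (release name and
# namespace are placeholders; adjust to your setup):
# helm upgrade kube-prometheus prometheus-community/kube-prometheus-stack \
#   -n monitoring --reuse-values -f gke-values.yaml
```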

This is probably the case for other cloud providers, although I'm not sure about it.

Cheers,

from kube-prometheus.

sandromello commented on May 10, 2024

I had a similar issue, but I used kubeadm to install the cluster. I fixed those alerts by editing the selectors of those services.

If you run the Kubernetes core components as pods in the kube-system namespace, make sure the label selectors of those services match the labels on the pods.

kubectl get svc kube-prom-exporter-kube-scheduler kube-prom-exporter-kube-controller-manager -n kube-system -o wide
NAME                                         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)     AGE       SELECTOR
kube-prom-exporter-kube-scheduler            ClusterIP   None         <none>        10251/TCP   3h        component=kube-scheduler
kube-prom-exporter-kube-controller-manager   ClusterIP   None         <none>        10252/TCP   3h        component=kube-controller-manager
kubectl get po -l component -n kube-system --show-labels
NAME                                                 READY     STATUS    RESTARTS   AGE       LABELS
(...)
kube-apiserver-ip-10-0-41-71.ec2.internal            1/1       Running   0          3h        component=kube-apiserver,tier=control-plane
kube-controller-manager-ip-10-0-41-71.ec2.internal   1/1       Running   0          3h        component=kube-controller-manager,tier=control-plane
kube-scheduler-ip-10-0-41-71.ec2.internal            1/1       Running   0          3h        component=kube-scheduler,tier=control-plane

If any of those components was started bound to 127.0.0.1, you need to change that; please take a look at the notes on kubeadm with Prometheus for more information.

hamid2013 commented on May 10, 2024

I am also facing the same issue, but in my case I used Azure acs-engine to launch the cluster.

I keep getting the Scheduler and Controller alerts.

I can see the pods are running, but there is no corresponding service for them.

chris530 commented on May 10, 2024

I noticed the labels the services were selecting on did not match any pods. After adding the label k8s-app=kube-controller-manager to the controller manager and k8s-app=kube-scheduler to the scheduler, the alerts cleared up, as the services could now find the pods.
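
A hypothetical helper capturing that fix, assuming the control-plane pods carry kubeadm-style `component=...` labels as in the output earlier in this thread; review before running against a real cluster:

```shell
# Write the labeling commands to a small script rather than running them
# blindly; for static pods the label can also be added in the manifests
# under /etc/kubernetes/manifests so it survives pod restarts.
cat > add-k8s-app-labels.sh <<'EOF'
#!/bin/sh
set -e
kubectl -n kube-system label pod -l component=kube-scheduler \
  k8s-app=kube-scheduler
kubectl -n kube-system label pod -l component=kube-controller-manager \
  k8s-app=kube-controller-manager
EOF
chmod +x add-k8s-app-labels.sh
```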

domcar commented on May 10, 2024

If it helps, it looks like some services have no endpoints:

kubectl get endpoints --all-namespaces
NAMESPACE     NAME                                        ENDPOINTS   AGE
kube-system   kube-controller-manager                     <none>      19h
kube-system   kube-prometheus-exporter-kube-scheduler     <none>      24m

domcar commented on May 10, 2024

@sandromello The problem is that I don't have the kube-scheduler or controller-manager pods; I think that's why it doesn't work.

ScottBrenner commented on May 10, 2024

This is a known issue with GKE prometheus-operator/prometheus-operator#355 prometheus-operator/prometheus-operator#845. I ended up just deleting the two alerts.

hameno commented on May 10, 2024

This also seems to be the case for https://github.com/rancher/rke deployments (at least it is happening on my dev cluster)

gianrubio commented on May 10, 2024

@domcar one way to avoid this issue is to have a flag controlling whether certain kube-prometheus dependencies are deployed. Look at the alertmanager example for how to skip the installation of a dependency.

PRs are always welcome :)

commented on May 10, 2024

I don't have any endpoints for the kube-controller-manager and the scheduler, so how can I monitor them using Prometheus and the Prometheus Operator?

Alerts are being triggered from the Alertmanager.

bonovoxly commented on May 10, 2024

@ScottBrenner what's the best way to delete an alert using helm? Is it possible to cherry-pick out the alerts, or would I need to recreate them all (minus the non-working alerts for GKE)?

ScottBrenner commented on May 10, 2024

@bonovoxly Was using kube-prometheus, never touched Helm.

ne1000 commented on May 10, 2024

@domcar @ScottBrenner I also ran into the same issue, but in my case I used binary packages to install the cluster. Can you give me some advice on fixing it?

phyllisstein commented on May 10, 2024

I ran into this issue with a cluster deployed through kops in AWS. The solution that worked for me was sitting in an old version of the repo: I had to deploy the services listed here to kube-system. With that done, the alerts went green.

Edit: N.B. that I think you can also generate the requisite files by adding (import 'kube-prometheus/kube-prometheus-kops.libsonnet') to your jsonnet config:

local kp =
  (import 'kube-prometheus/kube-prometheus.libsonnet') +
  (import 'kube-prometheus/kube-prometheus-kops.libsonnet') +
  {
    _config+:: {
      namespace: 'monitoring',
      /* ...etc. */
    },
  };

commented on May 10, 2024

Same issue with AWS EKS.

vrathore18 commented on May 10, 2024

I am facing the same issue. I don't have the kube-scheduler or controller-manager pods. @domcar, how did you fix the issue?

P.S. I used Helm for installation. Cloud: AWS.

rpf3 commented on May 10, 2024

@chris530 I had to do something very similar with the service selectors: basically null out the component label and add the k8s-app label to the selector for those two services.
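
A sketch of that selector change as a JSON merge patch, where a `null` value removes the old key; the service name follows the kubeadm example earlier in the thread and may differ in your cluster:

```shell
# Build the patch: "component": null drops the old selector key and
# k8s-app replaces it. Repeat with the controller-manager service,
# swapping in k8s-app=kube-controller-manager.
cat > scheduler-selector-patch.json <<'EOF'
{"spec": {"selector": {"component": null, "k8s-app": "kube-scheduler"}}}
EOF

# Then apply it, e.g.:
# kubectl -n kube-system patch svc kube-prom-exporter-kube-scheduler \
#   --type merge -p "$(cat scheduler-selector-patch.json)"
```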

flogfy commented on May 10, 2024

@chris530 how were you able to add these labels to the controller manager and the kube scheduler? I don't have the pods or services associated with either kube-scheduler or kube-controller-manager. My Kubernetes is installed with RKE.

woody3549 commented on May 10, 2024

Hello,

I am currently using kube-prometheus-stack version 20.0.1.
The KubeSchedulerDown and KubeControllerManagerDown alerts are currently being raised for no apparent reason.
Is that also a label issue?
How did you solve it?

Thanks for your help.
Regards,

woody3549 commented on May 10, 2024

Hi @ferpizza,

Now I no longer receive alerts for KubeScheduler and KubeControllerManager.
Thanks.

However, a new KubeProxyDown alert now appears.
Can you please point out what GKE exposes?
I might have to disable it as well.

Cheers

ferpizza commented on May 10, 2024

Hello @woody3549,

I haven't found official documentation that distinguishes the k8s components exposed to end users from the ones kept private for Google's management. You can make an educated guess based on whether a given component is critical to how GKE operates the cluster.

kube-proxy is one of those components, being a critical piece in the networking of your cluster.

When I wrote my first comment I was on version 18.1.1 of the Kube Prometheus Stack helm chart, and that version did not include the kube-proxy alerts or scraper.

Since then I have updated to version 27.1.0, which includes the kube-proxy alert, and was confronted with the same issue regarding false positives.

We can solve this, and the two prior alerts, by adding the following lines to our Values file.

kubeControllerManager:
  enabled: false

kubeScheduler:
  enabled: false

kubeProxy:
  enabled: false

woody3549 commented on May 10, 2024

Hello,

Ok thanks. This makes sense and is very helpful.

Regards
