
Comments (21)

anishasthana commented on September 24, 2024

I think we should have a default approach for general application monitoring, and then when it comes to getting cluster metrics we have a documented procedure that is specifically for it. I think federation with a special token would be the best way. Most people generally won't be able to get elevated perms like that out of the box, right?

from apps.

anishasthana commented on September 24, 2024

@4n4nd We did something similar at https://gitlab.cee.redhat.com/data-hub/dh-monitoring/-/blob/master/prometheus/prometheus.yaml#L2638

4n4nd commented on September 24, 2024

Test results:

  1. When the ServiceMonitors are in the same namespace as Prometheus, with the namespace of the target service specified, the Prometheus SA only needs to be able to get/list/watch services in the target namespace, and it can then monitor services across namespaces.

  2. When the ServiceMonitors are not in the same namespace as Prometheus, Prometheus needs to be able to get namespaces (cluster role) as well as get/list/watch servicemonitors/podmonitors/services/pods/endpoints in the target namespaces.
    If we apply this role cluster-wide, it would look like:

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: operatefirst-monitoring
rules:
- apiGroups:
  - ""
  resources:
  - namespaces
  verbs:
  - get
- apiGroups:
  - ""
  resources:
  - services
  - endpoints
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - monitoring.coreos.com
  resources:
  - servicemonitors
  - podmonitors
  verbs:
  - get
  - list
  - watch

Method 2 would also fix #60, i.e. we would have permissions to access cluster metrics enabling better monitoring for our pods/deployments.
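
The ClusterRole above only takes effect once it is bound to the Prometheus service account. A minimal sketch of that binding, assuming the service account is named prometheus and lives in a namespace called monitoring (both names are placeholders, not taken from the test setup):

```yaml
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: operatefirst-monitoring
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: operatefirst-monitoring   # the ClusterRole defined above
subjects:
  - kind: ServiceAccount
    name: prometheus              # assumed service account name
    namespace: monitoring         # assumed namespace of the Prometheus deployment
```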

4n4nd commented on September 24, 2024

> so if we go ahead with Option 1 (service monitors in the same namespace), how easy/hard would it be to access cluster metrics?

@hemajv we would need to create a ServiceMonitor in the monitoring namespace, and we would need access to the Service resource in the namespace where kube-state-metrics has been set up or is available.
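
As a rough sketch of what that could look like: a ServiceMonitor that lives next to Prometheus but points at a Service in another namespace through namespaceSelector. The namespace names, the label, and the port name below are all assumptions for illustration; the real values depend on how kube-state-metrics is deployed on the cluster.

```yaml
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kube-state-metrics
  namespace: monitoring                  # assumed: namespace where our Prometheus runs
spec:
  namespaceSelector:
    matchNames:
      - kube-state-metrics-namespace     # placeholder: namespace of the target Service
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics   # assumed label on the target Service
  endpoints:
    - port: http-metrics                 # assumed name of the Service port
      interval: 30s
```

The Prometheus SA would still need get/list/watch on services in the target namespace for this to resolve.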

HumairAK commented on September 24, 2024

I think that's out of scope for this issue. Let's discuss that here.

> I thought projects are specific to Openshift and Prometheus-Operator being a Kubernetes operator I doubt that would work but I can try it out +1

They are, they belong to the project.openshift.io api group. I too doubt this would work but if it's easy to test it out, then there's no harm in making sure.

HumairAK commented on September 24, 2024

Thanks for all the feedback guys. I think we can declare this investigation a success, as we have confirmed that RBAC plus service accounts can be used to work around OperatorGroups. Thanks for all your efforts @4n4nd, great work.

Closing this issue, if there's any further discussion/questions feel free to re-open.

tumido commented on September 24, 2024

So in other words... we found the solution! 🙂 Do I understand it correctly? 🎉 I like method 2, since it keeps the monitors and the services they monitor together.

Does method 2 also work without a cluster role? I mean, if you use a Role and RoleBinding in each target namespace giving access to the Prometheus SA from the monitoring namespace.

4n4nd commented on September 24, 2024

> So in other words... we found the solution! 🙂 Do I understand it correctly? 🎉 I like method 2, since it keeps the monitors and the services they monitor together.
>
> Does method 2 also work without a cluster role? I mean, if you use a Role and RoleBinding in each target namespace giving access to the Prometheus SA from the monitoring namespace.

We will need at least one cluster role that lets Prometheus do a get on namespaces.

tumido commented on September 24, 2024

Why? You can do a role for get on the target namespaces only. But I agree that a single cluster role may be simpler to maintain.

tumido commented on September 24, 2024

BTW... It just came to my mind: if we use that cluster role, Prometheus would scrape all the *monitors in the cluster, wouldn't it? For example, if somebody else deploys a service monitor in some non-operate-first namespace, we would still scrape that if we use the cluster role. I don't think we want that. I would rather try the Roles scenario if possible.

4n4nd commented on September 24, 2024

> You can do a role for get on the target namespaces only

I don't think you can. To do something like oc get namespace -l key=value, you need get access on namespaces cluster-wide.

> we would still scrape that

If we set our selector labels properly, that should not be an issue. We can define a label prometheus: operate-first for all our service monitors and it should be fine.
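
A minimal sketch of how that label could be enforced on the Prometheus side, assuming a Prometheus custom resource managed by the Prometheus Operator. The resource name, namespace, and service account name are placeholders; only the prometheus: operate-first label comes from the suggestion above.

```yaml
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: operate-first             # hypothetical name
  namespace: monitoring           # assumed namespace
spec:
  serviceAccountName: prometheus  # assumed service account
  # Only pick up monitors that carry our label, even if the
  # service account could read other monitors in the cluster.
  serviceMonitorSelector:
    matchLabels:
      prometheus: operate-first
  podMonitorSelector:
    matchLabels:
      prometheus: operate-first
  # An empty selector means "look in all namespaces";
  # leaving it unset restricts discovery to this namespace.
  serviceMonitorNamespaceSelector: {}
  podMonitorNamespaceSelector: {}
```

With this, a ServiceMonitor created by someone else without the label would be ignored, although the service account would still technically be allowed to read it.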

HumairAK commented on September 24, 2024

Awesome work @4n4nd, so correct me if I'm wrong but it boils down to this:

Option 1:

  • We keep the service monitors in the same namespace as Prometheus

Pro:

  • no cluster rbac

Con:

  • We need to create the service monitors ourselves, and namespace owners need to create namespace-scoped RBAC

Option 2:

  • We can put service monitors in different namespaces

Pro:

  • Service monitors can live in the namespaces they monitor, shifting the onus of enabling monitoring to namespace admins (less work for us in the future)

Con:

  • Need cluster rbac

Personally I think we should go with option 1. It makes it easier to have other people replicate our setups if we limit cluster admin requirements, and if we can get services working w/o cluster rbac that's a good thing. I'm on the fence here, so open to being convinced.
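
For reference, the namespace-scoped RBAC that a namespace owner would create under option 1 might look like the sketch below. It grants only what the test results in this thread reported as sufficient (get/list/watch on services); depending on how Prometheus discovers targets, endpoints and pods may need to be added, so treat the resource list as something to verify. All names and namespaces are placeholders.

```yaml
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: prometheus-service-discovery   # hypothetical name
  namespace: target-namespace          # placeholder: namespace to be monitored
rules:
  - apiGroups:
      - ""
    resources:
      - services
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prometheus-service-discovery
  namespace: target-namespace
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prometheus-service-discovery
subjects:
  - kind: ServiceAccount
    name: prometheus                   # assumed service account name
    namespace: monitoring              # assumed namespace of the Prometheus deployment
```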

anishasthana commented on September 24, 2024

++ what Humair said. I think removing the need for cluster-wide perms where possible makes it easier for people to use the operate-first setup. (So I'm supporting Option 1)

tumido commented on September 24, 2024

hold on, give me a sec to write down the POC, before we decide...

tumido commented on September 24, 2024

So.. it seems the namespace resource needs the cluster scope, however project resources don't:

$ oc new-project test1
Now using project "test1" on server "https://api.tcoufaltest.lab.pnq2.cee.redhat.com:6443".
...

$ oc new-project test2
Now using project "test2" on server "https://api.tcoufaltest.lab.pnq2.cee.redhat.com:6443".
...

$ oc new-project test3
Now using project "test3" on server "https://api.tcoufaltest.lab.pnq2.cee.redhat.com:6443".
...

$ oc create serviceaccount -n test1 test-sa
serviceaccount/test-sa created

$ oc apply -f rbac.yaml -n test1
role.rbac.authorization.k8s.io/test-role created
rolebinding.rbac.authorization.k8s.io/test-rb created

$ oc apply -f rbac.yaml -n test2
role.rbac.authorization.k8s.io/test-role created
rolebinding.rbac.authorization.k8s.io/test-rb created

$ oc apply -f rbac.yaml -n test3
role.rbac.authorization.k8s.io/test-role created
rolebinding.rbac.authorization.k8s.io/test-rb created

$ oc login --token=TEST_SA_TOKEN

$ oc get namespaces
Error from server (Forbidden): namespaces is forbidden: User "system:serviceaccount:test1:test-sa" cannot list resource "namespaces" in API group "" at the cluster scope

$ oc get projects
NAME    DISPLAY NAME   STATUS
test1                  Active
test2                  Active
test3                  Active

Using this rbac.yaml:

---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: test-role
rules:
  - apiGroups:
      - project.openshift.io
    resources:
      - projects
    verbs:
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - namespaces
    verbs:
      - get
      - list
      - watch
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: test-rb
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: test-role
subjects:
  - kind: ServiceAccount
    name: test-sa
    namespace: test1

@4n4nd this is basically the exercise I wanted you to prove. Please try things out first next time. The question is: do we need to be able to list namespaces? Or are project resources fine? Do you know how the Prometheus operator approaches this? What does it need? Namespaces or projects?

To conclude on this from my point of view: I don't like cluster roles in this case, and working around them with labels and whatnot seems like a bad idea to me. If we can't use this option 3, with the namespace/project access scoped as shown above, I would choose number 1. We don't want to play this on cluster scope; that would effectively equal a cluster-wide Prometheus deployment anyway (which we don't want now).

hemajv commented on September 24, 2024

so if we go ahead with Option 1 (service monitors in the same namespace), how easy/hard would it be to access cluster metrics?

4n4nd commented on September 24, 2024

@tumido I am sorry, I am a little confused on what this experiment proves.

> Do we need to be able to list namespaces?

Yes.

> Or are project resources fine?

I thought projects are specific to OpenShift, and with Prometheus-Operator being a Kubernetes operator I doubt that would work, but I can try it out 👍

> Do you know how the prometheus operator approaches this?

The operator looks for namespaces using label matchers or expression matchers.
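
As an illustration of those two matcher styles, here is a sketch of the namespace selectors on a Prometheus custom resource. The resource name, namespace, label keys, and values are invented for the example. Resolving selectors like these is what requires reading namespaces by label, which ties back to the cluster-scope permission discussed in this thread.

```yaml
---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example                    # hypothetical name
  namespace: monitoring            # assumed namespace
spec:
  # Label matcher: namespaces carrying exactly this label.
  serviceMonitorNamespaceSelector:
    matchLabels:
      monitoring: enabled          # hypothetical label
  # Expression matcher: namespaces whose label value is in a set.
  podMonitorNamespaceSelector:
    matchExpressions:
      - key: team                  # hypothetical label key
        operator: In
        values:
          - operate-first
          - data-hub
```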

> I don't like cluster roles in this case

I agree 💯

> I would choose number 1

Sounds good ✔️

4n4nd commented on September 24, 2024

@anishasthana Do you mean we could use a token to collect metrics from the cluster-monitoring prometheus instance? If we can do that, it would be a good way to get metrics and a better solution than what I said about accessing services above ^.
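
If that is the idea, a federation scrape job could look roughly like the sketch below. This is an untested assumption, not something verified in this thread: the target address is the usual in-cluster service of the OpenShift cluster-monitoring Prometheus, the token and CA paths are placeholders for wherever the special token and CA bundle end up mounted, and the match[] selector is just an example.

```yaml
# Prometheus scrape config fragment (for example, supplied through
# additional scrape configs when using the Prometheus Operator).
- job_name: cluster-monitoring-federation
  honor_labels: true               # keep the labels of the federated series
  metrics_path: /federate
  scheme: https
  params:
    'match[]':
      - '{job="kube-state-metrics"}'   # example selector; adjust to the metrics needed
  # Placeholder path: a token with permission to read cluster metrics.
  bearer_token_file: /etc/prometheus/secrets/federation-token/token
  tls_config:
    # Placeholder path: CA bundle that signed the target's serving certificate.
    ca_file: /etc/prometheus/secrets/federation-ca/service-ca.crt
  static_configs:
    - targets:
        - prometheus-k8s.openshift-monitoring.svc:9091   # assumed service address
```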

hemajv commented on September 24, 2024

I also agree with @anishasthana about having a default approach for general application monitoring, and I like the idea of providing supporting documentation for additional monitoring needs. I'm a ++ for option 1, as it seems like an approach that others could easily follow and implement.

4n4nd commented on September 24, 2024

Awesome! It looks like the consensus is for option 1. Now we need to decide where we should keep the pod/service monitors in the repo. Should we just create "service-monitors" and "pod-monitors" directories in the monitoring base?

tumido commented on September 24, 2024

> @tumido I am sorry, I am a little confused on what this experiment proves.

@4n4nd I have to apologize. I got a bit frustrated with the dismissal of the idea without even trying it out, so I had to try it out and share the results here. That experiment proved you were right: you can't list namespaces without cluster-scope permissions. My assumption was based on the behavior of project resources, which I knew works, since I used that approach to make the ODH dashboard multi-namespace. I didn't know namespaces behave that differently. I apologize for my frustration.
