Comments (15)

gao-yan commented on July 19, 2024

It makes sense to provide only the most basic and important info for the first step.

Let's also check how the output fields from corosync-cmapctl differ between corosync-2.4 and corosync-3, so that we have a better idea of the options.

from ha_cluster_exporter.

ReyRen commented on July 19, 2024

@MalloZup @gao-yan Sorry for the late response. Here is the link status provided by corosync-cfgtool in corosync 3+:

sles15sp1vm2:~ # corosync-cfgtool -s
Printing link status.
Local node ID 2
LINK ID 0
        addr    = 192.168.122.23
        status:
                nodeid  1:      link enabled:1  link connected:1
                nodeid  2:      link enabled:1  link connected:1

There is no concept of RRP and SRP anymore. The links are all managed by knet, which supports up to 8 links, and different links can be given different priorities. The corosync-cfgtool documentation gives more detail:

-s     Displays the status of the current links on this node for UDP/UDPU, with extended status for KNET. After each link, the nodes on that link
       are displayed in order with their status, for example there are 3 nodes with KNET transportation:

              LINK ID 0:
                  id     = 192.168.100.80
                  status:
                      node 0: link enabled: 1     link connected: 1
                      node 1: link enabled: 1     link connected: 1
                      node 2: link enabled: 1     link connected: 1

-b     Displays the brief status of the current links on this node (KNET only) when used with "-s". If any interfaces are faulty, 1 is returned by
       the binary. If all interfaces are active 0 is returned to the shell. After each link, the nodes on that link are displayed in order with
       their status encoded into a single digit. 1=link enabled, 2=link connected, So a 3 in a node position indicates that the link is both
       enabled and connected. The local link (which will only ever be enabled on link 0) shows as enabled but not connected for internal reasons.
       The output will be:

              LINK ID 0:
                  id     = 192.168.100.80
                  status = 333

Is that helpful? Is there anything else I can help with?
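
The single-digit encoding described for "-b" above (1 = link enabled, 2 = link connected, 3 = both) reads like a two-bit mask. A minimal sketch of decoding it follows, assuming each character of the status string is one node position in display order; the names `NodeFlags` and `decode_brief_status` are made up for illustration and are not part of corosync or the exporter.

```python
from typing import List, NamedTuple


class NodeFlags(NamedTuple):
    position: int    # position in the status string, not necessarily the node ID
    enabled: bool    # bit value 1 = link enabled
    connected: bool  # bit value 2 = link connected


def decode_brief_status(status: str) -> List[NodeFlags]:
    """Decode a brief status string such as "333" into per-position flags.

    Assumption: one decimal digit per node, read as a bitmask, as the
    "-b" description quoted above suggests.
    """
    flags = []
    for position, digit in enumerate(status.strip()):
        value = int(digit)
        flags.append(NodeFlags(position, bool(value & 1), bool(value & 2)))
    return flags
```

With this reading, a digit of 1 (enabled but not connected) is what the quoted text says to expect for the local link.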

MalloZup commented on July 19, 2024

@diegoakechi I think we should consider the following.

This is a sample output:

Printing ring status.
	Local node ID 16777226
	RING ID 0
			id      = 10.0.0.1
			status  = Marking ringid 0 interface 10.0.0.1 FAULTY
	RING ID 1
			id      = 172.16.0.1
			status  = ring 1 active with no fault

Now, if we want a label per ring ID, which ID should we take: RING ID 0 or id = 10.0.0.1?

Also, right now we have the total number of failures. To me, this could serve as a minimalist approach.

If we need more precision we can add it, but with the more detailed approach we no longer need the total metric, since it can be computed via PromQL.
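
To make the two options concrete, here is a rough sketch that parses the ring status output quoted above into one record per ring, keeping both the RING ID and the address so either could become a label. It assumes the word FAULTY in the status line marks a faulty ring, which is inferred only from this sample. The metric name `corosync_ring_faulty` and the helper names are hypothetical, not the exporter's actual names.

```python
import re
from typing import List, NamedTuple

# Sample taken from the output quoted in this thread.
SAMPLE = """Printing ring status.
Local node ID 16777226
RING ID 0
\tid\t= 10.0.0.1
\tstatus\t= Marking ringid 0 interface 10.0.0.1 FAULTY
RING ID 1
\tid\t= 172.16.0.1
\tstatus\t= ring 1 active with no fault
"""


class Ring(NamedTuple):
    ring_id: str
    address: str
    faulty: bool


RING_RE = re.compile(r"RING ID (\d+)\s+id\s*=\s*(\S+)\s+status\s*=\s*(.*)")


def parse_rings(output: str) -> List[Ring]:
    """Extract (ring id, address, faulty) from corosync 2.x style ring status."""
    return [
        # Assumption: an upper-case "FAULTY" in the status text means the ring is faulty.
        Ring(ring_id, address, "FAULTY" in status)
        for ring_id, address, status in RING_RE.findall(output)
    ]


def to_samples(rings: List[Ring]) -> List[str]:
    """Render one sample line per ring using a hypothetical metric name."""
    return [
        'corosync_ring_faulty{ring_id="%s",address="%s"} %d'
        % (r.ring_id, r.address, int(r.faulty))
        for r in rings
    ]
```

With per-ring samples like these, the total would not need its own metric: a query such as `sum(corosync_ring_faulty)` would give the same number.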

gao-yan commented on July 19, 2024

Indeed the output of corosync-cfgtool is kind of confusing. How about adding a new field "ring_address" or combining the information like "0 (address: 10.0.0.1)"?

MalloZup commented on July 19, 2024

@gao-yan yes, I think we could add it. I just hope the output stays the same, since we will be using it as an API 😅

MalloZup commented on July 19, 2024

@gao-yan can a ring have multiple addresses? 🤔

MalloZup commented on July 19, 2024

Adding this would require some anti-patterns in Prometheus. I am not sure we need it.

gao-yan commented on July 19, 2024

I don't think the output of corosync-cfgtool changes often, but it does differ between corosync-2.4 and corosync-3:

2.4:
https://github.com/corosync/corosync/blob/needle-2.4/tools/corosync-cfgtool.c#L68

3.0:
https://github.com/corosync/corosync/blob/master/tools/corosync-cfgtool.c#L90

But we will likely take corosync-3 only starting from SLE 16.

In corosync-2.4, corosync-cmapctl outputs the info in a better format. For example, here are some relevant entries from it:

runtime.totem.pg.mrp.rrp.0.faulty (u8) = 1
runtime.totem.pg.mrp.srp.members.1084783184.ip (str) = r(0) ip(192.168.122.80) r(1) ip(127.0.0.1)
totem.interface.0.bindnetaddr (str) = 192.168.122.0

I don't have corosync-3 at hand, but IIRC, rrp (Redundant Ring Protocol) has even been dropped. https://jira.suse.com/browse/PM-1203

@yuanren10 should have better understanding/suggestions on the topics in here :-)
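
The three corosync-cmapctl lines quoted above all follow a `key (type) = value` shape, which is what makes them easier to parse. A minimal sketch under that assumption follows; the helper names are invented for illustration, and only the corosync-2.4 keys shown in this comment are used.

```python
import re
from typing import Dict, Tuple

# The three lines quoted in the comment above (corosync-2.4).
SAMPLE = """runtime.totem.pg.mrp.rrp.0.faulty (u8) = 1
runtime.totem.pg.mrp.srp.members.1084783184.ip (str) = r(0) ip(192.168.122.80) r(1) ip(127.0.0.1)
totem.interface.0.bindnetaddr (str) = 192.168.122.0
"""

CMAP_RE = re.compile(r"^(\S+)\s+\((\w+)\)\s+=\s?(.*)$")
FAULTY_RE = re.compile(r"^runtime\.totem\.pg\.mrp\.rrp\.(\d+)\.faulty$")


def parse_cmap(output: str) -> Dict[str, Tuple[str, str]]:
    """Map each key to (type, raw value); lines that do not match are skipped."""
    entries = {}
    for line in output.splitlines():
        match = CMAP_RE.match(line.strip())
        if match:
            key, value_type, value = match.groups()
            entries[key] = (value_type, value)
    return entries


def faulty_rings(entries: Dict[str, Tuple[str, str]]) -> Dict[int, bool]:
    """Return {ring number: faulty?} from the rrp.N.faulty keys.

    Assumption: a value of "1" means faulty, as in the sample above.
    """
    result = {}
    for key, (_value_type, value) in entries.items():
        match = FAULTY_RE.match(key)
        if match:
            result[int(match.group(1))] = value == "1"
    return result
```

Since the rrp keys may not exist in corosync-3, `faulty_rings` would simply return an empty dict there rather than fail.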

gao-yan commented on July 19, 2024

@gao-yan can a ring have multiple addresses?

I don't think so. A ring/interface can only be configured with one "bindnetaddr".

MalloZup commented on July 19, 2024

OK, thanks @gao-yan. 🌞 So I think for a first corosync 0.5 version of the exporter, I would go with only the total metric, which is the simplest one.

For the other metric, we need to research a bit whether we need it and whether we want to rely on such output.

IMHO, corosync-cmapctl seems like a valid tool that is handier to parse. If it is stable, we might just use that.

MalloZup commented on July 19, 2024

Yes, agreed. Thanks @gao-yan 🚀

ReyRen commented on July 19, 2024

As @gao-yan mentioned, in corosync 2.4+ it seems the link status can only be determined from the return code of "corosync-cfgtool" and from "runtime.totem.pg.mrp.rrp.0.faulty". However, "runtime.totem.pg.mrp.rrp.0.faulty" was removed in corosync 3+, because "corosync-cfgtool" seems to be enough to show the status.
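
If corosync-cfgtool is the only source on corosync 3+, the link status output posted earlier in this thread would have to be parsed directly. Below is a rough sketch of that, assuming the layout stays as in the sample (a `LINK ID` line, an `addr` or `id` line, then one line per node); the type and function names are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional

# Sample taken from the corosync 3+ output posted earlier in this thread.
SAMPLE = """Printing link status.
Local node ID 2
LINK ID 0
        addr    = 192.168.122.23
        status:
                nodeid  1:      link enabled:1  link connected:1
                nodeid  2:      link enabled:1  link connected:1
"""


class NodeLink(NamedTuple):
    link_id: int
    address: Optional[str]
    node_id: int
    enabled: bool
    connected: bool


LINK_RE = re.compile(r"^LINK ID (\d+)")
ADDR_RE = re.compile(r"^(?:addr|id)\s*=\s*(\S+)")
NODE_RE = re.compile(
    r"^node(?:id)?\s+(\d+):\s*link enabled:\s*(\d)\s+link connected:\s*(\d)"
)


def parse_links(output: str) -> List[NodeLink]:
    """Return one record per (link, node) pair found in the output."""
    records = []
    link_id = None
    address = None
    for raw in output.splitlines():
        line = raw.strip()
        match = LINK_RE.match(line)
        if match:
            # A new link section starts; forget the previous address.
            link_id, address = int(match.group(1)), None
            continue
        match = ADDR_RE.match(line)
        if match:
            address = match.group(1)
            continue
        match = NODE_RE.match(line)
        if match and link_id is not None:
            records.append(
                NodeLink(
                    link_id,
                    address,
                    int(match.group(1)),
                    match.group(2) == "1",
                    match.group(3) == "1",
                )
            )
    return records
```

The node pattern accepts both the `nodeid  1:` spelling from the pasted output and the `node 0:` spelling from the quoted option description, since the two differ slightly.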

MalloZup commented on July 19, 2024

@ReyRen thanks so far. For the moment this issue is not super urgent, but thanks for all the info, it was helpful 🚀

mbothorel commented on July 19, 2024

Hi, I can confirm the issue using corosync 3+ on Debian 10.
Any plans to support it? How can I help?

MalloZup commented on July 19, 2024

@mbothorel Hi.

If you want to help on this, you need to create a new metric with some labels.
Some doc:

https://github.com/ClusterLabs/ha_cluster_exporter/blob/master/doc/design.md

Also, apart from setting up the development environment, check my first comment on the issue, which gives a hint of what we need to achieve.

If you have any questions, feel free to ping me. There are no stupid questions 😁

If you wanna work on it, I can assign this to you.

Let me know and thank you for proposing it!
