The ability to time checks using cron syntax is great. However, we use systemd OnCalen


Support Systemd OnCalendar timers about healthchecks HOT 8 CLOSED

ravench commented on June 11, 2024

Support Systemd OnCalendar timers

from healthchecks.

Comments (8)

cuu508 commented on June 11, 2024

Thanks for the suggestion. I've been thinking about this for a long time as well! The main barrier is the absence of a Python library for parsing and evaluating OnCalendar schedules. Forking out to systemd-analyze calendar ... would work, but would not be ideal in terms of security, performance, and portability.
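To illustrate the "forking out" option mentioned above, here is a minimal sketch of validating an expression by calling systemd-analyze calendar from Python. The function name is_valid_oncalendar and the timeout value are made up for this example, and treating a non-zero exit status as "invalid expression" is an assumption. It also shows the drawbacks: one process per check, and no result at all on hosts without systemd.

```python
import subprocess
from typing import Optional


def is_valid_oncalendar(expression: str, timeout: float = 5.0) -> Optional[bool]:
    """Return True/False for valid/invalid, or None if systemd-analyze is missing.

    Hypothetical helper for illustration, not part of Healthchecks.
    """
    try:
        result = subprocess.run(
            # Argument list, no shell: the expression is never parsed by a shell.
            ["systemd-analyze", "calendar", expression],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except FileNotFoundError:
        # Not a systemd host: this is the portability concern.
        return None
    # Assumption: exit status 0 means the expression was parsed successfully.
    return result.returncode == 0


if __name__ == "__main__":
    print(is_valid_oncalendar("Mon *-*-* 03:00:00"))
```

Spawning a process for every evaluation is the performance cost referred to above; a native parser avoids it.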

No promises yet, but I've started work on a Python library for this: https://github.com/cuu508/oncalendar/

from healthchecks.

cuu508 commented on June 11, 2024

The initial implementation is ready, and deployed to https://healthchecks.io. All testing welcome :-)

Here's how the "Update Schedule" dialog looks:

from healthchecks.

ravench commented on June 11, 2024

Awesome, thanks. I've deployed 3.1-dev to our server and adjusted our scripts, seems to work well so far with a few dozen checks and various timer formats.

from healthchecks.

ravench commented on June 11, 2024

A little issue I ran into involves grace time and RandomizedDelaySec:

We use fairly large random delays (up to 3600s) in our jobs, since many of them trigger at the same time and use the same resources. This means that, when using schedules instead of timeouts in Healthchecks, we constantly have jobs sitting in their grace period, waiting for the random delay of the systemd timer to pass. We trigger the /start API call with ExecStartPre= in the service, so there is not really any way of triggering that call earlier.
It's sub-optimal, since it makes our project look a bit like a Christmas tree, but it doesn't cause any false alerts: we just set hc_gracetime = randomizedelaysec + timeoutsec.
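The rule of thumb above (grace time = random delay + timeout) can be sketched as a small helper. The function name suggested_grace and the 900-second timeout are made up for this example; only the 3600-second delay comes from the comment.

```python
from datetime import timedelta


def suggested_grace(randomized_delay_sec: int, timeout_sec: int) -> timedelta:
    """Grace time covering the timer's random delay plus the job's maximum runtime."""
    if randomized_delay_sec < 0 or timeout_sec < 0:
        raise ValueError("durations must be non-negative")
    return timedelta(seconds=randomized_delay_sec + timeout_sec)


if __name__ == "__main__":
    # 3600 s random delay (from the comment) plus an assumed 900 s timeout.
    print(suggested_grace(randomized_delay_sec=3600, timeout_sec=900))
```

With this setting the check never alerts while the job is merely waiting out its random delay, at the cost of spending up to that long in the grace state.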

I wonder whether it would make some sense to add a 'green grace time', but that raises even more questions for me:
What is grace time actually intended for in a complete setup, using /start, /<exit-code>, and possibly /log? I see three relevant durations in this context:

Schedule offset: The expected time between the scheduled and the effective start time of the job. (RandomizedDelaySec and AccuracySec in systemd)
Duration: The maximum time between start and end of the job. (TimeoutSec in systemd)
Grace Time: Time to delay warnings to catch unexpected delays.

I imagine handling these delays separately would be fairly complicated, and it isn't a priority, since it's only really relevant for systemd and the grace time can just be set accordingly.

from healthchecks.

cuu508 commented on June 11, 2024

> I wonder whether it would make some sense to add a 'green grace time'

I've thought about making icons gradually shift from green to orange as they progress through the grace window. But I'm not sure this would be an improvement: you would still see non-pure-green statuses, and it may in the end look even more busy with many different shades of green/orange.

Grace time was originally (A) the time to delay alerts when a success ping does not arrive on time.

When I added support for the /start signal, I made the grace time serve double duty: it also (B) constrains the maximum time gap between the start and success signals. This means users cannot tune A and B separately, but this does not seem to be a big issue in practice. And it avoids having another slider in the UI.

Grace time can also be used to account for random startup delay, and for the client system's clock being slightly off.
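The double duty described above can be modelled roughly as follows. This is one simplified reading of the description, not the project's actual code: goes_down_at is a hypothetical name, and taking the earlier of the two deadlines is an assumption.

```python
from datetime import datetime, timedelta
from typing import Optional


def goes_down_at(
    expected: datetime,
    grace: timedelta,
    started_at: Optional[datetime] = None,
) -> datetime:
    """Return the moment the check would be considered down."""
    # (A) a success ping must arrive within `grace` of the expected time.
    deadline = expected + grace
    if started_at is not None:
        # (B) a success ping must also arrive within `grace` of the start signal.
        deadline = min(deadline, started_at + grace)
    return deadline


if __name__ == "__main__":
    expected = datetime(2024, 6, 11, 0, 0)
    print(goes_down_at(expected, timedelta(hours=1)))
    print(goes_down_at(expected, timedelta(hours=1), started_at=datetime(2024, 6, 10, 23, 50)))
```

Because both deadlines use the same `grace` value, A and B cannot be tuned separately in this model, which matches the trade-off described in the comment.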

from healthchecks.

ravench commented on June 11, 2024

How about adding a configurable percentage threshold for the icon changing to yellow?
I agree that a gradual color shift would probably be more confusing than helpful. But just being able to configure "only turn yellow if 40% of the grace time has elapsed" would probably cover most use cases and edge cases.
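The threshold idea proposed above could look something like this sketch. It is purely hypothetical: no such setting is described as existing in Healthchecks, and the names icon_color and yellow_after are invented for this example.

```python
from datetime import timedelta


def icon_color(
    elapsed: timedelta,
    grace: timedelta,
    yellow_after: float = 0.4,
) -> str:
    """Pick an icon color from how far into the grace window the check is.

    `elapsed` is the time since the expected ping; negative means "not due yet".
    """
    if not 0.0 <= yellow_after <= 1.0:
        raise ValueError("yellow_after must be between 0 and 1")
    if grace <= timedelta(0):
        raise ValueError("grace must be positive")
    if elapsed < timedelta(0):
        return "green"
    if elapsed >= grace:
        return "red"
    # Dividing two timedeltas yields a float fraction of the grace window.
    return "yellow" if elapsed / grace >= yellow_after else "green"


if __name__ == "__main__":
    print(icon_color(timedelta(minutes=10), timedelta(hours=1)))
    print(icon_color(timedelta(minutes=30), timedelta(hours=1)))
```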

Question regarding grace time: If I have a Job scheduled for 00:00, a grace time of 1h and pings at 00:30 and 01:29, what would the behavior be?

from healthchecks.

cuu508 commented on June 11, 2024

> But just being able to configure "only turn yellow if 40% of the grace time has elapsed" would probably cover most use cases and edge cases.

But it would add a configuration setting that would need to be tucked in the UI somewhere and explained in the docs. My suggestion would be to think of the orange status icons not as an error condition, but as a sign that a particular check will run soon. Same as with traffic lights, where orange means "the light will change soon".

> Question regarding grace time: If I have a Job scheduled for 00:00, a grace time of 1h and pings at 00:30 and 01:29, what would the behavior be?

Assuming the check is initially up:

At 00:00, the check's grace period will start, and the icon will change to orange.
At 00:30, after receiving a ping, the icon will change back to green.
After that, the next expected ping is at the next midnight. Any early pings (e.g. at 01:29) do not affect the status; it will stay green.
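The timeline above can be replayed with a toy model. This is a sketch under simplifying assumptions, not the Healthchecks implementation: the schedule is hard-coded to daily midnight, the status names "up", "grace" and "down" stand in for green, orange and red, and the earlier ping at 00:10 on the previous day is invented so that the check starts out up.

```python
from datetime import datetime, timedelta

GRACE = timedelta(hours=1)


def next_midnight(after: datetime) -> datetime:
    """First midnight strictly after the given moment."""
    return (after + timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0)


def status_at(now: datetime, last_ping: datetime, grace: timedelta = GRACE) -> str:
    """Status of a daily-at-midnight check, given its most recent ping."""
    expected = next_midnight(last_ping)
    if now < expected:
        return "up"
    if now < expected + grace:
        return "grace"
    return "down"


if __name__ == "__main__":
    previous = datetime(2024, 6, 10, 0, 10)  # assumed earlier ping
    for now, last_ping in [
        (datetime(2024, 6, 11, 0, 0), previous),                      # grace period starts
        (datetime(2024, 6, 11, 0, 30), datetime(2024, 6, 11, 0, 30)),  # ping received
        (datetime(2024, 6, 11, 1, 29), datetime(2024, 6, 11, 1, 29)),  # early ping
    ]:
        print(now.strftime("%H:%M"), status_at(now, last_ping))
```

In this model the 01:29 ping only moves the last-ping timestamp; the next expected ping is still the following midnight, so the status stays up.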

from healthchecks.

ravench commented on June 11, 2024

Ok thanks

from healthchecks.
