Comments (8)

cuu508 commented on June 11, 2024

Thanks for the suggestion. I've been thinking about this for a long time as well! The main barrier is the absence of a Python library for parsing and evaluating OnCalendar schedules. Forking out to systemd-analyze calendar ... would work, but would not be ideal in terms of security, performance, and portability.

No promises yet, but I've started work on a Python library for this: https://github.com/cuu508/oncalendar/

from healthchecks.

cuu508 commented on June 11, 2024

The initial implementation is ready, and deployed to https://healthchecks.io. All testing welcome :-)

Here's how the "Update Schedule" dialog looks:

[image: screenshot of the "Update Schedule" dialog]

ravench commented on June 11, 2024

Awesome, thanks. I've deployed 3.1-dev to our server and adjusted our scripts. It seems to work well so far with a few dozen checks and various timer formats.

ravench commented on June 11, 2024

A little issue I ran into involves grace time and RandomizedDelaySec:

We use fairly large random delays (up to 3600s) in our jobs, since we have many that trigger at the same time and use the same resources. This means that, using schedules instead of timeouts in Healthchecks, we constantly have jobs that are in their grace period, waiting for the random delay of the systemd timer to pass. We trigger the /start API call with ExecStartPre= in the service, so there is not really any way of triggering that call earlier.
It's sub-optimal, since this causes our project to look a bit like a Christmas tree, but it doesn't cause any issues with false alerts; we just set hc_gracetime = randomizedelaysec + timeoutsec.
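
As a rough sketch of that workaround: the grace time is simply the sum of the two systemd durations. The function name and the 900 s timeout below are made up for illustration; only the 3600 s random delay comes from the description above.

```python
def grace_seconds(randomized_delay_sec: int, timeout_sec: int) -> int:
    """Grace time covering the timer's random start delay plus the job's maximum run time."""
    if randomized_delay_sec < 0 or timeout_sec < 0:
        raise ValueError("durations must be non-negative")
    return randomized_delay_sec + timeout_sec


# Up to 3600 s of random delay, plus a hypothetical 900 s service timeout.
print(grace_seconds(3600, 900))  # 4500
```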

I wonder whether it would make some sense to add a 'green grace time', but that raises even more questions for me:
What is grace time actually intended for in a complete setup, using /start, /<exit-code> and possibly log? I see three relevant durations in this context:

  • Schedule offset: the expected time between the scheduled and the effective start time of the job (RandomizedDelaySec and AccuracySec in systemd).
  • Duration: the maximum time between the start and end of the job (TimeoutSec in systemd).
  • Grace time: the time to delay warnings in order to catch unexpected delays.

I imagine handling these delays separately would be fairly complicated, and it isn't a priority, since it's only really relevant for systemd and the grace time can just be set accordingly.

cuu508 commented on June 11, 2024

> I wonder whether it would make some sense to add a 'green grace time'

I've thought about making icons gradually shift from green to orange as they progress through the grace window. But I'm not sure this would be an improvement: you would still see non-pure-green statuses, and it may in the end look even busier with many different shades of green/orange.

Grace time was originally (A) the time to delay alerts when a success ping does not arrive on time.

When I added support for the /start signal, I made the grace time serve double duty, so that it also (B) constrains the maximum time gap between the start and success signals. This means users cannot tune A and B separately, but this does not seem to be a big issue in practice. And it avoids having another slider in the UI.

Grace time can also be used to account for random startup delay, and for the client system's clock being slightly off.
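
To make the two roles concrete, here is a minimal sketch of (A) and (B) as separate checks that share one grace value. The function names and timestamps are hypothetical, and this is not the Healthchecks implementation; how the two conditions are combined internally is not shown here.

```python
from datetime import datetime, timedelta


def success_is_late(now: datetime, expected: datetime, grace: timedelta) -> bool:
    """(A) True once the success ping is overdue by more than the grace time."""
    return now > expected + grace


def run_is_too_long(now: datetime, started: datetime, grace: timedelta) -> bool:
    """(B) True once a started job has run longer than the grace time without a success signal."""
    return now > started + grace


grace = timedelta(hours=1)  # the single value that serves both roles
expected = datetime(2024, 6, 11, 0, 0)
started = datetime(2024, 6, 11, 0, 20)
now = datetime(2024, 6, 11, 1, 10)

print(success_is_late(now, expected, grace))  # True: 01:10 is past 00:00 + 1h
print(run_is_too_long(now, started, grace))   # False: 01:10 is before 00:20 + 1h
```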

ravench commented on June 11, 2024

How about adding a configurable percentage threshold for the icon changing to yellow?
I agree that a gradual color shift would probably be more confusing than helpful. But just being able to configure "only turn yellow if 40% of the grace time has elapsed" would probably cover most use cases and edge cases.

Question regarding grace time: If I have a job scheduled for 00:00, a grace time of 1h, and pings at 00:30 and 01:29, what would the behavior be?

cuu508 commented on June 11, 2024

> But just being able to configure "only turn yellow if 40% of the grace time has elapsed" would probably cover most use cases and edge cases.

But it would add a configuration setting that would need to be tucked into the UI somewhere and explained in the docs. My suggestion would be to think of the orange status icons not as an error condition, but as a sign that a particular check will run soon. Same as with traffic lights, where orange means "the light will change soon".

> Question regarding grace time: If I have a Job scheduled for 00:00, a grace time of 1h and pings at 00:30 and 01:29, what would the behavior be?

Assuming the check is initially up,

  • at 00:00 the check's grace period will start, and the icon will change to orange
  • at 00:30, after receiving a ping, the icon will change back to green
  • after that, the next expected ping is at the next midnight. Any early pings (e.g. at 01:29) do not affect the status; it will stay green.
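
The timeline above can be reproduced with a small simulation. This is a simplified model, not the Healthchecks code: it assumes a hard-coded daily midnight schedule, takes the next expected ping to be the first midnight strictly after the last ping, and uses made-up names (`next_midnight`, `status`) and example dates.

```python
from datetime import datetime, timedelta


def next_midnight(after: datetime) -> datetime:
    """First midnight strictly after the given moment."""
    return (after + timedelta(days=1)).replace(hour=0, minute=0, second=0, microsecond=0)


def status(now: datetime, last_ping: datetime, grace: timedelta) -> str:
    """Return 'up' (green), 'grace' (orange) or 'down' for a check scheduled daily at 00:00."""
    expected = next_midnight(last_ping)
    if now < expected:
        return "up"
    if now < expected + grace:
        return "grace"
    return "down"


grace = timedelta(hours=1)
last_ping = datetime(2024, 6, 10, 0, 5)  # the check is initially up

print(status(datetime(2024, 6, 11, 0, 0), last_ping, grace))   # grace: the window opens at 00:00
last_ping = datetime(2024, 6, 11, 0, 30)                        # ping received at 00:30
print(status(datetime(2024, 6, 11, 0, 30), last_ping, grace))  # up
last_ping = datetime(2024, 6, 11, 1, 29)                        # early ping at 01:29
print(status(datetime(2024, 6, 11, 1, 29), last_ping, grace))  # up: next expected ping is the next midnight
```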

ravench commented on June 11, 2024

Ok thanks
