Comments (8)
Thanks for the suggestion. I've been thinking about this for a long time as well! The main barrier is the absence of a Python library for parsing and evaluating OnCalendar schedules. Forking out to systemd-analyze calendar ... would work, but would not be ideal in terms of security, performance, and portability.
No promises yet, but I've started work on a python library for this: https://github.com/cuu508/oncalendar/
from healthchecks.
The initial implementation is ready, and deployed to https://healthchecks.io. All testing welcome :-)
Here's how the "Update Schedule" dialog looks:
Awesome, thanks. I've deployed 3.1-dev to our server and adjusted our scripts, seems to work well so far with a few dozen checks and various timer formats.
A little issue I ran into involves grace time and RandomizeDelaySec:
We use fairly large random delays (up to 3600s) in our jobs, since we have many that trigger at the same time and use the same resources. This means that when using schedules instead of timeouts in Healthchecks, we constantly have jobs that are in their grace period, waiting for the random delay of the systemd timer to pass. We trigger the /start API call with ExecStartPre= in the service, so there is not really any way of triggering that call earlier.
It's sub-optimal, since this causes our project to become a bit of a Christmas tree, but it doesn't cause any issues with false alerts; we just set hc_gracetime = randomizedelaysec + timeoutsec.
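The workaround above is plain addition. A minimal sketch, assuming both values are known in seconds; the function name and the 900 s timeout are made up for illustration, only the 3600 s delay comes from the comment:

```python
from datetime import timedelta


def grace_for_timer(randomized_delay_sec: int, timeout_sec: int) -> timedelta:
    """Grace time covering the timer's random delay plus the job's runtime limit.

    Hypothetical helper: it mirrors
    hc_gracetime = randomizedelaysec + timeoutsec from the comment above.
    """
    return timedelta(seconds=randomized_delay_sec + timeout_sec)


# 3600 s random delay (from the comment) plus an assumed 900 s TimeoutSec.
grace = grace_for_timer(randomized_delay_sec=3600, timeout_sec=900)
print(grace)  # 1:15:00
```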
I wonder whether it would make some sense to add a 'green grace time', but that raises even more questions for me:
What is grace time actually intended for in a complete setup, using /start, /<exit-code> and possibly log? I see three relevant durations in this context:
Schedule offset: The expected time between the scheduled and the effective start time of the job. (RandomizeDelaySec and AccuracySec in Systemd)
Duration: The maximum time between start and end of the job. (TimeoutSec in Systemd)
Grace Time: Time to delay warnings to catch unexpected delays.
I imagine handling these delays separately would be fairly complicated and it isn't a priority, since it's only really relevant for Systemd and gracetime can just be set accordingly.
I wonder whether it would make some sense to add a 'green grace time'
I've thought about making icons gradually shift from green to orange as they progress through the grace window. But I'm not sure if this would be an improvement, you would still see non-pure-green statuses, and it may in the end look even more busy with many different shades of green/orange.
Grace time was originally (A) the time to delay alerts when a success ping does not arrive on time.
When I added support for the /start signal, I made grace time serve double duty: it also (B) constrains the maximum time gap between the start and success signals. This means users cannot tune A and B separately, but this does not seem to be a big issue in practice. And it avoids having another slider in the UI.
Grace time can also be used to account for random startup delay, and for the client system's clock being slightly off.
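The double duty described above can be sketched as a single check. This is a simplified model for illustration, not the actual Healthchecks code; the function and parameter names are assumptions:

```python
from datetime import datetime, timedelta
from typing import Optional


def is_late(
    now: datetime,
    expected_ping: datetime,
    pending_start: Optional[datetime],
    grace: timedelta,
) -> bool:
    """Return True if an alert would be due under either role of grace time.

    pending_start is the time of a /start signal that has not yet been
    followed by a success signal, or None if there is no such signal.
    """
    # (A) The success ping did not arrive within grace of the expected time.
    if now > expected_ping + grace:
        return True
    # (B) The job started but has not reported success within grace.
    if pending_start is not None and now > pending_start + grace:
        return True
    return False
```

Because both branches use the same `grace` value, A and B cannot be tuned separately in this model, which matches the trade-off described in the comment.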
How about adding a configurable percentage threshold for the icon changing to yellow?
I agree that a gradual color shift would probably be more confusing than helpful. But just being able to configure "only turn yellow if 40% of the grace time has elapsed" would probably cover most use cases and edge cases.
Question regarding grace time: If I have a Job scheduled for 00:00, a grace time of 1h and pings at 00:30 and 01:29, what would the behavior be?
But just being able to configure "only turn yellow if 40% of the grace time has elapsed" would probably cover most use cases and edge cases.
But it would add a configuration setting, that would need to be tucked in the UI somewhere, and explained in the docs. My suggestion would be to think of the orange status icons not as an error condition, but as a sign a particular check will run soon. Same as with traffic lights where orange means "the light will change soon".
Question regarding grace time: If I have a Job scheduled for 00:00, a grace time of 1h and pings at 00:30 and 01:29, what would the behavior be?
Assuming the check is initially up,
- at 00:00 the check's grace period will start, and the icon will change to orange
- at 00:30, after receiving a ping, the icon will change back to green
- after that, the next expected ping is at the next midnight. Any early pings (e.g. at 01:29) do not affect the status; it will stay green.
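The timeline above can be replayed with a small simulation. This is a simplified model assuming a daily 00:00 schedule and a 1h grace time; the helper names are hypothetical and this is not how Healthchecks itself is implemented:

```python
from datetime import datetime, timedelta

GRACE = timedelta(hours=1)


def next_midnight(after: datetime) -> datetime:
    """First 00:00 strictly after the given time (the assumed daily schedule)."""
    day_later = after + timedelta(days=1)
    return day_later.replace(hour=0, minute=0, second=0, microsecond=0)


def status(now: datetime, last_ping: datetime) -> str:
    """'up' (green), 'grace' (orange) or 'down', given the most recent ping."""
    expected = next_midnight(last_ping)  # next ping is due at the next run
    if now < expected:
        return "up"
    if now < expected + GRACE:
        return "grace"
    return "down"


# The check is initially up: the previous day's ping arrived at 00:05.
last_ping = datetime(2023, 12, 31, 0, 5)
print(status(datetime(2024, 1, 1, 0, 0), last_ping))   # grace

last_ping = datetime(2024, 1, 1, 0, 30)                 # ping at 00:30
print(status(datetime(2024, 1, 1, 0, 30), last_ping))  # up

last_ping = datetime(2024, 1, 1, 1, 29)                 # early ping at 01:29
print(status(datetime(2024, 1, 1, 1, 29), last_ping))  # up
```

In this model an early ping only moves `last_ping` forward within the same day, so the next expected ping is still the following midnight.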
Ok thanks