Acknowledgements
An acknowledgement (ACK) means: "I've seen the problem. Stop sending notifications, but keep the status truthfully at CRIT until it is fixed."
Unlike a downtime, an ACK is not an expectation ("this is maintenance") but a reaction ("I'm dealing with it now").
Setting an ACK
Three paths:
| Where | How |
|---|---|
| Web UI | Host detail / error overview → service row → Ack |
| Mobile app | Host detail → service → Ack button |
| Notification channel | Slack and Teams messages include an "Ack" button |
Required field: Comment (what is happening and who is working on it).
Effect
- Further notifications for this alert rule are suppressed
- Escalation pauses
- The status is shown in the UI with an ACK badge, the author, and the comment
- SLA: an ACK does not count as a downtime; the service is still in outage, and that outage still counts
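The effects above can be modeled in a few lines. This is a minimal sketch, not the product's implementation: the `Service` and `Ack` classes and the functions `acknowledge`, `should_notify`, and `counts_as_outage` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ack:
    author: str
    comment: str


@dataclass
class Service:
    name: str
    status: str = "OK"  # "OK" or "CRIT" in this sketch
    ack: Optional[Ack] = None


def acknowledge(service: Service, author: str, comment: str) -> None:
    """Attach an ACK; the comment is required, the status is left untouched."""
    if not comment.strip():
        raise ValueError("an ACK needs a comment")
    service.ack = Ack(author=author, comment=comment)


def should_notify(service: Service) -> bool:
    """Notifications go out only for problems nobody has acknowledged."""
    return service.status != "OK" and service.ack is None


def counts_as_outage(service: Service) -> bool:
    """SLA accounting ignores the ACK: a CRIT service is in outage."""
    return service.status == "CRIT"


db = Service("db01/postgres", status="CRIT")
acknowledge(db, "alice", "Disk full, cleaning up old WAL files")
print(should_notify(db), counts_as_outage(db))  # False True
```

The point of the sketch is that the ACK only changes who gets notified; the status and the SLA accounting are computed from the service state alone.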
Auto-clear on recovery
When the service is OK again:
- The ACK is cleared automatically
- The status changes to OK
- A recovery notification is sent if enabled (configurable per channel)
What happens on a new hard state
If an acknowledged service jumps back from OK to CRIT (flapping or a new problem):
- Default: the ACK stays and no notification is sent
- Configurable per alert rule: "auto-clear ACK on new hard after recovery", in which case the problem flows through as a new notification wave
The second option makes sense when incidents after a recovery are independent of the acknowledged one.
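The two behaviors can be written as a small decision function. This is a sketch under assumptions: `AlertRule`, the flag name `auto_clear_ack_on_new_hard`, and `on_new_hard_crit` are invented here to mirror the option described above and are not taken from the product's configuration schema.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class AlertRule:
    # Hypothetical flag name for "auto-clear ACK on new hard after recovery".
    auto_clear_ack_on_new_hard: bool = False


def on_new_hard_crit(acked: bool, rule: AlertRule) -> Tuple[bool, bool]:
    """Return (still_acked, send_notification) for a fresh hard CRIT."""
    if acked and not rule.auto_clear_ack_on_new_hard:
        return True, False   # default: ACK stays, no notification
    return False, True       # no ACK, or ACK auto-cleared: new wave


print(on_new_hard_crit(True, AlertRule()))                                 # (True, False)
print(on_new_hard_crit(True, AlertRule(auto_clear_ack_on_new_hard=True)))  # (False, True)
```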
Sticky ACK
Optional per ACK: a sticky ACK stays in place even after recovery. This is useful when the issue keeps recurring and you are waiting for a bigger fix (for example, a hardware swap in two weeks).
Default: not sticky.
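As an illustration, the difference between a normal and a sticky ACK at recovery time might look like this. The `Ack` class, its `sticky` field, and `ack_after_recovery` are hypothetical names for this sketch only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ack:
    author: str
    comment: str
    sticky: bool = False  # matches the documented default: not sticky


def ack_after_recovery(ack: Optional[Ack]) -> Optional[Ack]:
    """Return whatever ACK is still in place once the service is OK again."""
    if ack is not None and ack.sticky:
        return ack  # sticky: survives the recovery
    return None     # normal ACK: cleared automatically


normal = Ack("alice", "Restarting the service")
sticky = Ack("bob", "Flaky PSU, hardware swap in 2 weeks", sticky=True)
print(ack_after_recovery(normal))            # None
print(ack_after_recovery(sticky) is sticky)  # True
```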
ACK + downtime
Both can apply at once. An ACK is an operator action; a downtime is planned maintenance. A service can be acknowledged while a downtime also covers it.
ACK in bulk
In the error overview, select multiple services → Ack all with a shared comment. This creates one ACK per service, each with the same comment.
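A bulk ACK is therefore a fan-out rather than a single shared object. A minimal sketch, assuming ACKs are plain records with `service`, `author`, and `comment` fields (the function `bulk_ack` and the record layout are illustrative, not the product's API):

```python
from typing import Dict, Iterable, List


def bulk_ack(services: Iterable[str], author: str, comment: str) -> List[Dict[str, str]]:
    """Create one independent ACK record per selected service."""
    if not comment.strip():
        raise ValueError("an ACK needs a comment")
    return [
        {"service": name, "author": author, "comment": comment}
        for name in services
    ]


acks = bulk_ack(["web01/http", "web02/http"], "alice", "LB misconfigured, rolling back")
print(len(acks))  # 2
```

Because each service gets its own record, one of these ACKs can later be revoked or cleared without touching the others.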
Revoke ACK
Service detail → Remove ack. The status stays CRIT, notifications resume, and escalation restarts at stage 1.
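The state change on revoke can be sketched as follows. `AlertState` and `revoke_ack` are hypothetical, and representing a paused escalation as `None` is an assumption made for this example.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class AlertState:
    status: str
    acked: bool
    escalation_stage: Optional[int]  # None while escalation is paused


def revoke_ack(state: AlertState) -> AlertState:
    """Drop the ACK; the status is unchanged and escalation restarts at stage 1."""
    return replace(state, acked=False, escalation_stage=1)


before = AlertState(status="CRIT", acked=True, escalation_stage=None)
after = revoke_ack(before)
print(after)  # AlertState(status='CRIT', acked=False, escalation_stage=1)
```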
Audit
ACKs are logged with track_change: author, service, comment, and time. To find them in the audit log, filter by action = ack.create / ack.clear.
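If the audit log is exported for offline analysis, the same filter is easy to reproduce. The sketch below assumes entries arrive as dictionaries with an `action` key; the export format, the other field names, and the sample data (including the `downtime.create` action) are made up for illustration. Only the action values `ack.create` and `ack.clear` come from the text above.

```python
from typing import Dict, List

ACK_ACTIONS = {"ack.create", "ack.clear"}


def ack_events(entries: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only audit entries that record an ACK being set or cleared."""
    return [e for e in entries if e.get("action") in ACK_ACTIONS]


# Made-up sample entries; the field names are assumptions.
log = [
    {"action": "ack.create", "author": "alice", "service": "db01/postgres",
     "comment": "Disk full", "time": "2024-05-01T10:00:00Z"},
    {"action": "downtime.create", "author": "bob", "service": "web01/http",
     "comment": "Patch window", "time": "2024-05-01T11:00:00Z"},
    {"action": "ack.clear", "author": "alice", "service": "db01/postgres",
     "comment": "", "time": "2024-05-01T12:00:00Z"},
]
print([e["action"] for e in ack_events(log)])  # ['ack.create', 'ack.clear']
```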
Permission
Acknowledging requires the permission service.ack (granted by default to operator and above). Without it, the Ack button is disabled.
Next
- Alert rules: escalation logic and ACK behavior
- Downtimes: how they differ from ACKs