
Dependencies & inhibition

When the server-room switch fails, the 50 servers behind it become unreachable. Without inhibition you would get 50 emails, even though there is exactly one actual problem.

Dependencies model "A depends on B"; inhibition suppresses alerts on A while B itself is critical.

Model

graph TD
    SW[sw-core: CRITICAL] -->|hangs on| WEB1[web01: CRIT inhibited]
    SW -->|hangs on| WEB2[web02: CRIT inhibited]
    SW -->|hangs on| DB[db01: CRIT inhibited]
    SW -.continues to.-> ROUTER[router: OK]
    DB -->|hangs on| APP[app-server: CRIT inhibited]

When sw-core is CRIT, all children are marked "inhibited". Their statuses appear with a gray badge in the UI, and their alert rules don't fire.

Dependency model

In dependencies (migration 036):

Field                | Meaning
parent_host_id       | parent host
child_host_id        | dependent host
parent_service_id    | optional: relationship at service level
child_service_id     | optional
inhibit_when_parent  | status from which inhibition kicks in (default CRITICAL)

A relationship can inhibit only when the parent is CRIT, or already when the parent is WARN.
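To make the threshold concrete, here is a minimal sketch of one dependency record. The field names come from the field list above; the `Status` ordering, the integer ID type, and the `parent_inhibits` helper are assumptions for illustration, not the product's actual code.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Status(IntEnum):
    # Assumed severity ordering; the text only names OK, WARN and CRITICAL.
    OK = 0
    WARN = 1
    CRITICAL = 2


@dataclass(frozen=True)
class Dependency:
    parent_host_id: int                       # ID type is an assumption
    child_host_id: int
    parent_service_id: Optional[int] = None   # None: host-level relationship
    child_service_id: Optional[int] = None
    inhibit_when_parent: Status = Status.CRITICAL  # documented default

    def parent_inhibits(self, parent_status: Status) -> bool:
        """True once the parent status reaches the configured threshold."""
        return parent_status >= self.inhibit_when_parent


# Inhibit only on parent CRIT (the default) versus already on WARN.
strict = Dependency(parent_host_id=1, child_host_id=2)
eager = Dependency(parent_host_id=1, child_host_id=3,
                   inhibit_when_parent=Status.WARN)
```

With these two records, a parent in WARN inhibits only the `eager` relationship, while a parent in CRITICAL inhibits both.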

Create

The /dependencies page shows a tree view. For each parent, use + Add dependency:

  1. Pick child host
  2. Optional: specific service relationship instead of host level
  3. Set inhibit_when_parent

Bulk: attach multiple children to a parent at once.

Inhibition logic in worker

flowchart LR
    R[Check result CRIT] --> Q{Parents exist?}
    Q -->|no| FORWARD[normal evaluation]
    Q -->|yes| P{Parent status >= inhibit threshold?}
    P -->|no| FORWARD
    P -->|yes| INHIB[Suppress alert<br/>Status visible as inhibited]

Inhibition only applies to notifications; the status itself isn't masked. In the UI you still see that the service is critical, but marked as inhibited because of sw-core.
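The flowchart can be sketched as a small decision function. This is an illustration under assumptions, not the worker's real code: `Status`, `Decision` and `evaluate` are hypothetical names, and treating one inhibiting parent as sufficient when a child has several parents is an assumption the text does not confirm.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Sequence, Tuple


class Status(IntEnum):
    # Assumed severity ordering so that ">=" matches the flowchart.
    OK = 0
    WARN = 1
    CRITICAL = 2


@dataclass(frozen=True)
class Decision:
    status: Status    # reported unchanged; inhibition never masks it
    inhibited: bool   # drives the "inhibited" badge
    forward: bool     # True: hand over to normal evaluation


def evaluate(child_status: Status,
             parents: Sequence[Tuple[Status, Status]]) -> Decision:
    """parents holds (current parent status, inhibit_when_parent) pairs."""
    if not parents:
        # No parents exist: normal evaluation.
        return Decision(child_status, inhibited=False, forward=True)
    # Assumption: a single parent at or above its threshold is enough.
    inhibited = any(current >= threshold for current, threshold in parents)
    return Decision(child_status, inhibited=inhibited, forward=not inhibited)


# web01 is CRITICAL while its parent sw-core is CRITICAL (threshold CRITICAL).
result = evaluate(Status.CRITICAL, [(Status.CRITICAL, Status.CRITICAL)])
```

Here `result.status` stays CRITICAL while `result.forward` is False, which matches the rule that only the notification is suppressed.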

Inhibition + recovery

When the parent recovers:

  1. Inhibition lifts
  2. If the child is still CRIT, a notification is generated now, so genuine follow-up problems become visible
  3. If the child recovered along with the parent, everything stays quiet (no alert spam)
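The recovery steps above can be sketched as a filter over the children of the recovered parent. The function name and the dict-based input are hypothetical, and the sketch assumes each child has no other parent that is still inhibiting it.

```python
from enum import Enum
from typing import Dict, List


class Status(Enum):
    OK = "OK"
    CRITICAL = "CRITICAL"


def notifications_after_parent_recovery(children: Dict[str, Status]) -> List[str]:
    """Return the children that should generate a notification now.

    Children that recovered together with the parent stay quiet.
    """
    return [host for host, status in children.items()
            if status is Status.CRITICAL]


# sw-core has recovered; only db01 has a genuine follow-up problem.
children = {"web01": Status.OK, "web02": Status.OK, "db01": Status.CRITICAL}
to_notify = notifications_after_parent_recovery(children)
```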

Tree view

sw-core (CRITICAL)
├── web01 (CRIT, inhibited)
├── web02 (CRIT, inhibited)
└── db01 (CRIT, inhibited)
    └── app-server (CRIT, inhibited transitively)
router-edge (OK)
└── (no dependencies)

Transitive inhibition: if db01 is inhibited (because sw-core is CRIT), and app-server depends on db01, then app-server is also inhibited.
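One way to compute transitive inhibition is to walk up the ancestors of a host. The sketch below uses the hosts from the tree view; the `PARENTS` and `STATUS` maps and the `is_inhibited` function are hypothetical, and it assumes that any ancestor in CRITICAL inhibits (per-relationship thresholds are left out for brevity).

```python
from typing import Dict, List

# child -> direct parents, taken from the tree view above
PARENTS: Dict[str, List[str]] = {
    "web01": ["sw-core"],
    "web02": ["sw-core"],
    "db01": ["sw-core"],
    "app-server": ["db01"],
}

STATUS: Dict[str, str] = {
    "sw-core": "CRITICAL",
    "web01": "CRITICAL",
    "web02": "CRITICAL",
    "db01": "CRITICAL",
    "app-server": "CRITICAL",
    "router-edge": "OK",
}


def is_inhibited(host: str,
                 parents: Dict[str, List[str]],
                 status: Dict[str, str]) -> bool:
    """True if any direct or indirect parent of host is CRITICAL."""
    seen = set()
    stack = list(parents.get(host, []))
    while stack:
        parent = stack.pop()
        if parent in seen:
            continue  # guards against revisiting shared ancestors
        seen.add(parent)
        if status.get(parent) == "CRITICAL":
            return True
        stack.extend(parents.get(parent, []))
    return False
```

With this data, `is_inhibited("app-server", PARENTS, STATUS)` is true through db01 and sw-core, while sw-core itself has no parents and is never inhibited.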

Detection of parent hosts

Parents are assigned manually; there is no automatic detection. Recommendations:

  • Discovery walks note the default gateway per host (when available); that gateway is the standard parent suggestion
  • For storage volumes: storage host is the parent
  • For VMs: VM host is the parent

Auto-population is on the roadmap. In practice, manually maintaining the most important parents is enough.

Edge cases

Soft-deleted parent

Since v0.17.x

Inhibition ignores orphan / soft-deleted parent hosts — otherwise deleted switches would suppress follow-up alerts forever.

When the parent is deleted, inhibition lifts automatically — follow-up alerts fire again.

Parent in downtime

If the parent is in a downtime, it is not counted as CRIT, so all children see it as "not inhibiting". This is sensible: during planned maintenance on the switch, we don't ask "how many hosts depend on it" but keep watching them as normal.
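Both edge cases so far reduce to one question: does this parent count as inhibiting at all? A minimal sketch, assuming a hypothetical `ParentState` record with `soft_deleted` and `in_downtime` flags and a fixed CRITICAL threshold:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParentState:
    status: str                 # e.g. "OK", "WARN", "CRITICAL"
    soft_deleted: bool = False  # hypothetical flag for a deleted parent
    in_downtime: bool = False   # hypothetical flag for planned maintenance


def parent_counts_as_inhibiting(parent: ParentState) -> bool:
    """Apply the soft-delete and downtime rules before checking the status."""
    if parent.soft_deleted:
        return False  # a deleted switch must not suppress alerts forever
    if parent.in_downtime:
        return False  # in downtime the parent is not counted as CRIT
    return parent.status == "CRITICAL"


live = ParentState(status="CRITICAL")
deleted = ParentState(status="CRITICAL", soft_deleted=True)
maintenance = ParentState(status="CRITICAL", in_downtime=True)
```

Only `live` counts as inhibiting here; children of `deleted` and `maintenance` are evaluated as normal.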

Circular dependency

The backend rejects cycles (A → B → A) with HTTP 400 on creation.
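A cycle check of this kind is typically a reachability test: adding parent → child closes a cycle if the parent is already reachable from the child. The sketch below is one possible approach, not the backend's confirmed implementation; `would_create_cycle` and the adjacency map are hypothetical.

```python
from typing import Dict, Set


def would_create_cycle(children: Dict[str, Set[str]],
                       parent: str, child: str) -> bool:
    """True if adding the edge parent -> child would close a cycle.

    children maps each host to its existing direct children.
    """
    if parent == child:
        return True  # a host cannot depend on itself
    seen: Set[str] = set()
    stack = [child]
    while stack:
        node = stack.pop()
        if node == parent:
            return True  # parent is already a descendant of child
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, ()))
    return False


# Existing relationship: A is the parent of B.
existing = {"A": {"B"}}
```

A caller would translate a true result into the 400 response, for example when a request tries to add B as the parent of A.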

Next

  • Alert rules: how inhibition plays with rules
  • Downtimes: alternative approach for "silent during maintenance"