
Heartbeat Check

What this alert condition does

Alerts when a signal has not reported for a specified period of time. This can happen because a host is down or has stopped reporting a particular metric.

Note

Only active metric time series are monitored, so this condition will not trigger an alert for a host that has never sent a metric. It triggers only if a host has sent metrics and then stops sending metrics.
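The behavior described in the note can be illustrated with a minimal Python sketch. This is not how SignalFx implements the condition; the function name `stale_series` and the sample hosts and timestamps are hypothetical. It only shows why a series that has never reported cannot trigger the alert: there is no "last seen" entry to compare against.

```python
from typing import Dict, List


def stale_series(last_seen: Dict[str, float], now: float, threshold_s: float) -> List[str]:
    """Return the series whose most recent datapoint is older than threshold_s."""
    # Only series that have reported at least once have an entry in last_seen,
    # so a host that never sent a metric can never be flagged.
    return sorted(name for name, ts in last_seen.items() if now - ts > threshold_s)


# Hypothetical data: timestamps in seconds. "host-c" has never reported,
# so it has no entry at all.
last_seen = {"host-a": 1000.0, "host-b": 400.0}

print(stale_series(last_seen, now=1030.0, threshold_s=600.0))  # ['host-b']
```

Here host-b is flagged because its last datapoint is 630 seconds old, host-a is not (30 seconds), and host-c is invisible to the check.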

When to use this alert condition

This condition is often used in tandem with another detector, to ensure that the signal being analyzed is still reporting.

Example

You have a detector that alerts you when the minimum number of logins being handled by each host goes below a specified value. If a host stops reporting, that detector has no data to evaluate for that host, so it would not be triggered even if there was a problem. The Heartbeat Check condition would notify you that the host had stopped reporting.

Settings

PARAMETER: Hasn’t reported for
VALUES: Integer >= 1, followed by a time indicator (s, m, h, d, w), e.g. 30s, 10m, 2h, 5d, 1w
USAGE NOTES: How long it’s been since the signal last reported. Longer time periods result in lower sensitivity and potentially fewer alerts. If you specify a value for Group by (below), how long it’s been since any member of the group last reported.

PARAMETER: (optional) Group by
VALUES: Dimension or property chosen from the dropdown menu
USAGE NOTES: Use a dimension or property when you want alerts to be based on a specified unit. For example, if you group by cluster, the alert will be triggered only if all hosts in a cluster stop reporting. Alternatively, if each time series is associated with only one host, and you want to be alerted when any host has stopped reporting, leave this parameter blank (or group by host).
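The two settings can be sketched together in Python. This is an illustration under stated assumptions, not SignalFx code: `parse_duration`, `silent_groups`, and the sample hosts and clusters are hypothetical names. The sketch parses a value such as 10m into seconds and then treats a group as silent only when its most recently reporting member is older than the threshold, which is why grouping by cluster alerts only if all hosts in the cluster have stopped.

```python
import re
from typing import Dict, List, Tuple

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_duration(text: str) -> int:
    """Convert a value such as '30s' or '2h' to seconds (integer >= 1 plus unit)."""
    match = re.fullmatch(r"([1-9][0-9]*)([smhdw])", text)
    if match is None:
        raise ValueError(f"invalid duration: {text!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


Series = Tuple[Dict[str, str], float]  # (dimensions, time of last datapoint)


def silent_groups(series: List[Series], now: float, threshold: str, group_by: str) -> List[str]:
    """Return the groups in which every member has been silent longer than threshold."""
    limit = parse_duration(threshold)
    newest: Dict[str, float] = {}
    for dims, last_seen in series:
        key = dims[group_by]
        # Keep the most recent report time seen for this group.
        newest[key] = max(newest.get(key, last_seen), last_seen)
    return sorted(key for key, ts in newest.items() if now - ts > limit)


# Hypothetical time series: one per host, timestamps in seconds.
series = [
    ({"host": "h1", "cluster": "east"}, 100.0),
    ({"host": "h2", "cluster": "east"}, 950.0),
    ({"host": "h3", "cluster": "west"}, 200.0),
]

print(silent_groups(series, now=1000.0, threshold="10m", group_by="cluster"))  # ['west']
print(silent_groups(series, now=1000.0, threshold="10m", group_by="host"))     # ['h1', 'h3']
```

Grouping by cluster hides the silent host h1 because h2 in the same cluster is still reporting; grouping by host surfaces it.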

Further reading

PARAMETER: Signal (heartbeat metric)
REMARKS:

If you want to avoid triggering alerts in specific cases (e.g. to exclude a test realm, or to exclude hosts known to have been terminated), apply filters to the signal before configuring the alert condition.

Make sure that the Extrapolation policy is Null (the default) for all signals that influence the heartbeat metric. If it is not Null, SignalFx will extrapolate values for missing datapoints, and the alert will not be triggered as expected. Extrapolation policy is specified in the plot configuration panel for each signal.

PARAMETER: Hasn’t reported for
REMARKS: To avoid flappy alerts that are triggered by minor, short-lived delays in sending metrics, this parameter should be significantly larger than the native resolution of the signal (how often the signal reports). For example, if the signal reports once a minute, setting this parameter to 10 minutes means that the alert will not be triggered until 10 consecutive expected datapoints have failed to arrive.
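The relationship between the threshold and the native resolution can be checked with a short Python sketch. The function names and the arrival times below are hypothetical and only illustrate the arithmetic; `would_alert` looks at gaps between received datapoints and ignores any silence after the last one.

```python
from typing import List


def missed_datapoints(threshold_s: int, resolution_s: int) -> int:
    """Number of consecutive expected datapoints that fit inside the threshold."""
    if threshold_s <= 0 or resolution_s <= 0:
        raise ValueError("threshold and resolution must be positive")
    return threshold_s // resolution_s


def would_alert(arrivals: List[float], threshold_s: float) -> bool:
    """True if any gap between consecutive arrivals exceeds the threshold."""
    return any(later - earlier > threshold_s
               for earlier, later in zip(arrivals, arrivals[1:]))


# A signal that reports once a minute, with one datapoint delayed by 35 seconds.
arrivals = [0.0, 60.0, 120.0, 215.0, 240.0, 300.0]

print(missed_datapoints(600, 60))   # 10
print(would_alert(arrivals, 90.0))  # True: the 95-second gap trips a tight threshold
print(would_alert(arrivals, 600.0)) # False: a 10-minute threshold absorbs the delay
```

A threshold close to the resolution fires on a single late datapoint, while one that spans many reporting intervals does not.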