PATTERN Cited by 1 source
Alarm aggregation per entity
Once detections pass validation and become alarms, do not forward every new detection as a new alert. Roll up per entity and use case (where the entity is a camera + zone), auto-close on resolution, and escalate only on SLA exhaustion.
Shape
- Aggregation key: (entity, use-case). In the canonical CV-safety case, the entity is a camera + zone; the use case is the specific risk class (PPE, Housekeeping).
- Append, don't recreate: when a new detection arrives for an already-open risk, append it to the existing record rather than opening a new one. A Lambda triggered through SNS + SQS on each risk write looks up the open record for the key and extends it.
- Auto-close on absence: a scheduled (e.g. 1-minute) EventBridge timer runs a Lambda that checks whether the risk still appears in the latest camera images for each open record. If absent, the record closes automatically — no operator action needed for transient violations.
- Escalation on SLA exhaustion: a second scheduled function checks elapsed-time-since-open against per-severity SLAs; on breach, it notifies through zone-configured preferred channels (Slack / email / ticket) with escalation levels so the right people are alerted based on severity + duration.
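The append and auto-close behaviour above can be sketched in a few lines. This is a minimal in-memory illustration, not the source's implementation: `RiskAggregator`, `RiskRecord`, `record_detection`, and `close_absent` are hypothetical names, a plain dict stands in for the state store, and the `still_present` set stands in for whatever the resolution check derives from the latest camera images.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple

# Aggregation key: (camera_id, zone_id, use_case). Names are illustrative.
Key = Tuple[str, str, str]


@dataclass
class RiskRecord:
    key: Key
    opened_at: float
    occurrences: List[float] = field(default_factory=list)
    closed_at: Optional[float] = None


class RiskAggregator:
    """In-memory stand-in for the per-entity state store (assumption)."""

    def __init__(self) -> None:
        self.open_records: Dict[Key, RiskRecord] = {}
        self.closed_records: List[RiskRecord] = []

    def record_detection(self, key: Key, ts: float) -> bool:
        """Append to the open record for `key`, opening one only if none exists.

        Returns True when a new record was opened (the only case that
        should produce a fresh notification), False when appended.
        """
        record = self.open_records.get(key)
        is_new = record is None
        if record is None:
            record = RiskRecord(key=key, opened_at=ts)
            self.open_records[key] = record
        record.occurrences.append(ts)
        return is_new

    def close_absent(self, still_present: Set[Key], now: float) -> List[RiskRecord]:
        """Close every open record whose key is not in `still_present`."""
        closed: List[RiskRecord] = []
        for key in list(self.open_records):
            if key not in still_present:
                record = self.open_records.pop(key)
                record.closed_at = now
                closed.append(record)
        self.closed_records.extend(closed)
        return closed
```

In a deployed version the lookup-and-append step would presumably need to be atomic (for example a conditional write in the state store) so that two concurrent detections for the same key cannot both open a record; the sketch ignores concurrency.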
Why it works
- Alert fatigue is the enemy: "Instead of bombarding safety teams with duplicate notifications, the system appends new occurrences to existing open risks." One record per ongoing violation, not a new alert for every detection.
- Auto-close on absence is critical: without it, transient risks that have already resolved linger as open records and train operators to ignore every alert.
- Escalation levels acknowledge that severity + duration are orthogonal; both matter to routing.
- Channel flexibility per zone allows local ownership (Slack for on-floor ops, email for management review, ticket for audit).
Required machinery
- Per-entity state store supporting the (entity, use-case) → open risk record lookup (DynamoDB in the canonical instance).
- Two scheduled-Lambda paths: resolution-check + SLA-check.
- Per-zone notification-channel configuration.
- Per-severity SLA + escalation policy.
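As an illustration of how the per-severity SLA policy and per-zone channel configuration might combine in the SLA-check path, here is a small sketch. Everything in it is an assumption made for the example: the threshold values, the `SLA_POLICY` and `ZONE_CHANNELS` tables, and the `OpenRisk`, `escalation_level`, and `due_notifications` names do not come from the source.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical policy: per severity, seconds-since-open thresholds.
# Crossing the n-th threshold raises the record to escalation level n.
SLA_POLICY: Dict[str, List[int]] = {
    "high": [300, 900],
    "low": [3600, 14400],
}

# Hypothetical per-zone preferred channels.
ZONE_CHANNELS: Dict[str, List[str]] = {
    "zone-a": ["slack"],
    "zone-b": ["email", "ticket"],
}


@dataclass
class OpenRisk:
    zone_id: str
    severity: str
    opened_at: float
    notified_level: int = 0  # highest level already notified


def escalation_level(severity: str, opened_at: float, now: float,
                     policy: Dict[str, List[int]] = SLA_POLICY) -> int:
    """Count how many SLA thresholds the open duration has crossed."""
    elapsed = now - opened_at
    return sum(1 for threshold in policy[severity] if elapsed >= threshold)


def due_notifications(risks: List[OpenRisk],
                      now: float) -> List[Tuple[OpenRisk, int, List[str]]]:
    """Return (risk, level, channels) for risks that reached a new level.

    Each level is reported once: `notified_level` is advanced so a later
    run of the scheduled check does not re-send the same escalation.
    """
    due = []
    for risk in risks:
        level = escalation_level(risk.severity, risk.opened_at, now)
        if level > risk.notified_level:
            risk.notified_level = level
            due.append((risk, level, ZONE_CHANNELS.get(risk.zone_id, [])))
    return due
```

Tracking `notified_level` on the record keeps the scheduled check idempotent, so escalation itself does not become a new source of duplicate notifications.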
Seen in
- sources/2026-04-01-aws-automate-safety-monitoring-with-computer-vision-and-generative-ai — canonical wiki instance. "This function intelligently aggregates risks per camera per use case to avoid alert fatigue." Every minute a Lambda checks whether risks still appear in latest camera images; absent → auto-close. A separate scheduled function monitors SLA exhaustion + notifies through zone-preferred channels with escalation levels.
Related
- concepts/alert-fatigue — named failure mode.
- patterns/multilayered-alarm-validation — what runs before aggregation; only validated alarms get aggregated.