How Hermes turns StatusCake alerts into reliable incident workflows

Most monitoring systems excel at detecting issues but fall short when it comes to managing the response. StatusCake reliably flags potential problems, yet alerts often flood chat rooms, wake the wrong team members, or vanish entirely when sites briefly recover. This creates a critical gap between detection and meaningful action—one that Hermes is designed to fill.

From detection to decisive action

Hermes functions as a lightweight verification and routing layer behind StatusCake’s webhook system. Instead of treating every alert as an immediate incident, it verifies each one independently and applies routing rules, so noise is filtered out and real problems are prioritized. The workflow follows a clear sequence:

StatusCake identifies a potential issue
Hermes receives the alert via webhook
The raw request is logged for future reference
Hermes independently verifies the status from its network
Only verified incidents trigger escalation
Notifications are sent with a complete event history
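
The sequence above can be sketched as a single handler function. This is only an illustration under stated assumptions, not Hermes's actual code: the function name handle_alert, the "status" payload field, the DOWN_UNCONFIRMED and IGNORED_NOT_DOWN labels, and the injected callables are all hypothetical. Only the DOWN_CONFIRMED event name comes from the configuration shown later in this article.

```python
from typing import Callable, List


def handle_alert(
    payload: dict,
    log_raw: Callable[[dict], None],
    verify_down: Callable[[dict], bool],
    notify: Callable[[str, List[str]], None],
) -> str:
    """Run one alert through log -> verify -> notify and return the outcome."""
    history = ["RECEIVED"]

    # Log the request before making any decision about it.
    log_raw(payload)
    history.append("LOGGED")

    # "status" is an assumed field name for the reported monitor state.
    # This sketch only handles down alerts; recoveries are ignored here.
    if str(payload.get("status", "")).lower() != "down":
        history.append("IGNORED_NOT_DOWN")
        return history[-1]

    # Independent verification: escalate only when the outage is confirmed.
    if verify_down(payload):
        history.append("DOWN_CONFIRMED")
        notify("DOWN_CONFIRMED", history)  # the history travels with the alert
    else:
        history.append("DOWN_UNCONFIRMED")
    return history[-1]
```

Passing the logger, verifier, and notifier in as callables keeps each step replaceable and makes the flow easy to exercise without a network.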

This approach prevents the common pitfall of "alert spam" without requiring a full-fledged incident management platform. It’s a balance between simplicity and operational sanity.

Why verification matters in incident response

Not all alerts require the same urgency. A down status during business hours demands immediate attention, while the same alert at 3 AM might wait until morning. Similarly, a monitor that briefly flips to down for ten seconds isn’t always indicative of a real outage. Hermes addresses these nuances by applying context-aware logic to each alert.

In practice, Hermes handles four key functions:

Receives StatusCake webhooks in a structured format
Independently verifies the target’s status from its location
Applies time-of-day routing rules
Maintains a JSONL incident log and sends daily summaries
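
As a rough sketch of the last item, a daily summary can be derived by scanning the JSONL incident log and counting events for one day. The "timestamp" and "event" field names and the function name daily_summary are assumptions made for illustration; the article does not specify the log schema.

```python
import json
from collections import Counter
from pathlib import Path


def daily_summary(log_path: Path, day: str) -> Counter:
    """Count events whose ISO timestamp starts with `day` (YYYY-MM-DD)."""
    counts: Counter = Counter()
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines in the log
            entry = json.loads(line)
            # The day is matched against the timestamp exactly as stored.
            if str(entry.get("timestamp", "")).startswith(day):
                counts[entry.get("event", "UNKNOWN")] += 1
    return counts
```

Because JSONL holds one object per line, the log can be read line by line without loading the whole file into memory.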

For small teams, this setup transforms a basic monitoring stack into a more reliable ops workflow without added complexity.

A minimal architecture for maximum control

The entire system is intentionally lightweight. StatusCake sends a webhook to a local receiver running Hermes, which then processes the alert through verification, decision-making, and notification steps. The architecture avoids unnecessary components, making it accessible even for quick deployments.

For teams prioritizing speed, an ad hoc setup using scripts and configuration files is sufficient. Formal packaging as a Hermes skill can come later if needed. The goal is to establish a functional workflow before optimizing further.

Webhooks simplify debugging and reliability

Using webhooks instead of parsing email alerts offers distinct advantages. StatusCake natively supports webhook delivery, preserving the structured payload instead of forcing you to extract data from text. This approach also makes debugging straightforward—you can log the exact headers and body of each request.

When an alert behaves unexpectedly, the first question is often whether StatusCake sent the expected payload. With raw logs of inbound requests, you can answer that immediately, reducing troubleshooting time significantly.

Local receiver, global reach

The receiver doesn’t need to be complex. A small HTTP service running locally is enough to handle StatusCake’s webhooks. The only requirement is that StatusCake can reach it, so the receiver is exposed through a public endpoint trusted for machine-to-machine communication, for example a Cloudflare tunnel for demos.

Before testing, verify both the local receiver and public endpoint are healthy:

curl 
curl

A common mistake is testing the webhook path in a browser, which sends a GET request. /webhooks/statuscake-alerts only accepts POST requests, so use /health for GET checks and send real POST requests to test the webhook.
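
A minimal receiver with this behavior might look like the following sketch, built on Python's standard http.server module. The two paths come from the text above; the port 8080, the 202 response to accepted webhooks, the JSON reply body, and the names route, Receiver, and make_server are assumptions rather than Hermes's real implementation.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

WEBHOOK_PATH = "/webhooks/statuscake-alerts"
HEALTH_PATH = "/health"


def route(method: str, path: str) -> int:
    """Return the HTTP status this sketch answers with for a request."""
    path = path.split("?", 1)[0]  # ignore any query string
    if path == HEALTH_PATH:
        return 200 if method == "GET" else 405
    if path == WEBHOOK_PATH:
        # A browser sends GET here, which is why it sees an error.
        return 202 if method == "POST" else 405
    return 404


class Receiver(BaseHTTPRequestHandler):
    def _handle(self, method: str) -> None:
        length = int(self.headers.get("Content-Length") or 0)
        raw = self.rfile.read(length) if length else b""
        status = route(method, self.path)
        # Logging and verification of `raw` would start here when status is 202.
        body = json.dumps({"status": status, "bytes": len(raw)}).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self) -> None:
        self._handle("GET")

    def do_POST(self) -> None:
        self._handle("POST")


def make_server(host: str = "127.0.0.1", port: int = 8080) -> HTTPServer:
    """Build the server; call serve_forever() on the result to start it."""
    return HTTPServer((host, port), Receiver)
```

Keeping the routing decision in a plain function makes the GET versus POST rule easy to check without starting a server. Running make_server().serve_forever() starts the receiver on the assumed local port.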

Log everything—before making decisions

One of the most critical design choices is logging the raw webhook request to disk before any processing occurs. This creates two invaluable artifacts:

var/last-webhook-payload.json for quick inspection
var/incoming-webhooks.jsonl as an append-only ledger

With these logs, you can confidently answer questions that often derail incident response:

Did the webhook arrive at all?
Was the payload malformed or missing tokens?
Did Hermes reject the alert, or was it never delivered?

Without this data, debugging webhook issues becomes guesswork rather than a systematic process.
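
One way to produce both artifacts is sketched below. The two file names are taken from the list above; the record layout ("received_at", "headers", "body") and the function name log_raw_request are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_raw_request(var_dir: Path, headers: dict, body: bytes) -> dict:
    """Persist one inbound request before any parsing or decision-making."""
    var_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "received_at": datetime.now(timezone.utc).isoformat(),
        "headers": dict(headers),
        # Keep the body as text; undecodable bytes are replaced, not dropped.
        "body": body.decode("utf-8", errors="replace"),
    }
    # Overwritten on every request, for quick inspection.
    (var_dir / "last-webhook-payload.json").write_text(
        json.dumps(record, indent=2), encoding="utf-8"
    )
    # Append-only ledger: one JSON object per line.
    with (var_dir / "incoming-webhooks.jsonl").open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```

Storing the body unparsed means a malformed payload still leaves a trace, which is what makes the "did it arrive at all?" question answerable.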

Verification logic that prevents false alarms

Hermes doesn’t just relay alerts—it validates them. When a down alert arrives, Hermes can independently probe the target to confirm the outage. A basic verification might involve:

Checking a health endpoint if available
Requesting the public URL
Counting failures before escalating
Only raising an alert if the threshold is met

The configuration reflects this logic clearly:

{
  "timezone": "Europe/London",
  "probe_timeout_seconds": 8,
  "min_failed_probes": 1,
  "probe_urls": [],
  "notifications": {
    "immediate": [
      {
        "type": "email",
        "transport": "sendmail",
        "from": "statuscake-hermes@localhost",
        "to": ["myagent@agentmail.com"],
        "events": ["DOWN_CONFIRMED", "UP_CONFIRMED"],
        "subject_prefix": "[StatusCake]"
      }
    ]
  }
}

Leaving probe_urls empty tells Hermes to verify against the website_url from the StatusCake payload. This is the default behavior for monitors carrying their own target URLs.
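
The probe logic this configuration implies can be sketched as follows. The config keys and the website_url field come from the text above; the function names probe_ok and confirm_down, the choice to treat any 2xx or 3xx response as healthy, and the one-probe-per-URL counting are assumptions for illustration, not confirmed Hermes behavior.

```python
import urllib.request
from typing import Callable, Optional


def probe_ok(url: str, timeout: float) -> bool:
    """Return True if the URL answers with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except Exception:
        # Timeouts, DNS errors, refused connections, and HTTP error
        # statuses all count as a failed probe in this sketch.
        return False


def confirm_down(
    config: dict,
    payload: dict,
    probe: Optional[Callable[[str, float], bool]] = None,
) -> bool:
    """Probe the target and report whether the failure threshold is met."""
    probe = probe or probe_ok
    # An empty probe_urls list falls back to the URL carried in the payload.
    urls = config.get("probe_urls") or [
        u for u in [payload.get("website_url")] if u
    ]
    if not urls:
        raise ValueError("nothing to probe: no probe_urls and no website_url")
    timeout = config.get("probe_timeout_seconds", 8)
    failures = sum(1 for url in urls if not probe(url, timeout))
    return failures >= config.get("min_failed_probes", 1)
```

With the configuration shown above, a single failed request to the monitor's own URL is enough to confirm the outage. Raising min_failed_probes and listing several probe_urls makes confirmation stricter.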

Time-aware routing without extra tools

Many teams need basic escalation rules without adopting a full paging system. Hermes handles this through configuration. For example, you can define escalation windows based on time of day:

{
  "escalation_schedule": {
    "windows": [
      {
        "name": "business-hours",
        "start": "09:00",
        "end": "18:00",
        "timezone": "Europe/London"
      },
      {
        "name": "off-hours",
        "start": "18:00",
        "end": "09:00",
        "timezone": "Europe/London"
      }
    ]
  }
}

This allows alerts during business hours to trigger immediate Slack notifications, while after-hours incidents route to on-call engineers or create follow-up tasks for the morning team. The result is a smarter, quieter monitoring stack that respects operational realities.

A foundation for better incident management

Hermes doesn’t replace StatusCake—it enhances it. By adding verification, routing, and logging, it turns raw alerts into actionable incidents. For teams tired of alert fatigue or unsure how to build a custom verification layer, this setup provides a pragmatic path forward.

The architecture remains intentionally simple, making it accessible for quick deployment while offering room to grow. Whether you’re running a single service or scaling a growing infrastructure, Hermes helps ensure that every alert carries meaning—and every incident gets the attention it deserves.

AI summary

Discover Hermes, a tool that automatically verifies alerts from monitoring tools such as StatusCake and filters out unnecessary notifications. Simple steps toward smarter alert management.