Create manual incidents

Most incidents are auto-created by monitors. Sometimes they aren’t, and you want one anyway. This page covers the patterns where manual creation is the right move, and the cleanest way to do it.

When to create one manually

A third-party dependency broke something user-visible. Stripe is degraded, AWS is throttling, or your CDN is intermittently returning 502s. Your monitors may show only a partial signal, but you have first-hand evidence of the impact.
A bug surfaced in production that no monitor catches. For example, the UI is broken in a specific browser, or the auth flow fails only for SSO users.
You need to coordinate a multi-team outage. One canonical incident beats five Slack threads.
Pre-announced maintenance went sideways. A deploy you’d protected with a maintenance window actually broke something. Because the window suppressed the auto-incident, you need to create one manually.

When not to

Don’t manually create an incident for an outage a monitor will catch in 60 seconds anyway. The auto-incident will be cleaner, with fewer race conditions around responder_status and affected_http_job_ids.
Don’t create an incident as a stand-in for a Slack message. If you don’t intend to publish it on a status page or trigger paging, just message the team directly.

Two flavors of manual incident

Standalone incident (no status page)

Use this for internal-only tracking, when you want the timeline and paging behavior without publishing anything.

curl -X POST https://api.siteqwality.com/incident \
  -H "Authorization: Bearer $SITEQWALITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "Stripe webhooks intermittently failing",
    "severity": "major",
    "message": "Seeing ~5% failure rate on inbound Stripe webhooks. Investigating."
  }'

Status-page incident (published)

Use this when the outage affects customers and they should know.

curl -X POST https://api.siteqwality.com/status_page/$STATUS_PAGE_ID/incident \
  -H "Authorization: Bearer $SITEQWALITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "status_page_id": "<status-page-id>",
    "title": "Some webhook deliveries delayed",
    "severity": "minor",
    "message": "We are seeing increased latency on outbound webhooks. Investigating with our delivery provider.",
    "affected_http_job_ids": []
  }'

Once the request succeeds, the incident appears on the public status page immediately.

Linking to affected monitors

Even on manual incidents, set affected_http_job_ids if monitors exist for whatever is broken. The incident then appears as a banner on the matching status-page components and is grouped correctly in dashboards.

{
  "affected_http_job_ids": [
    "a3c1f5e0-1234-4abc-9def-aaaaaaaaaaaa",
    "b7d2e601-5678-4def-9abc-bbbbbbbbbbbb"
  ]
}
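
Combined with the status-page request shown earlier, linking monitors might look like the sketch below. The helper names build_incident_payload and create_status_page_incident are hypothetical, not part of any SiteQwality tooling, and the body simply mirrors the fields used in the examples on this page. Values are inserted verbatim, so this only suits trusted input without double quotes or backslashes.

```shell
# Build the JSON body for a status-page incident that links existing monitors.
# Arguments: status page ID, title, severity, message, then zero or more
# monitor (HTTP job) IDs. Hypothetical helper; values are not JSON-escaped.
build_incident_payload() (
  status_page_id=$1; title=$2; severity=$3; message=$4
  shift 4
  ids=""
  for id in "$@"; do
    # Append each ID as a quoted string, comma-separated.
    ids="${ids:+$ids,}\"$id\""
  done
  printf '{"status_page_id":"%s","title":"%s","severity":"%s","message":"%s","affected_http_job_ids":[%s]}\n' \
    "$status_page_id" "$title" "$severity" "$message" "$ids"
)

# Send a prepared body to the status-page incident endpoint used above.
# Expects STATUS_PAGE_ID and SITEQWALITY_API_KEY in the environment.
create_status_page_incident() {
  curl -sS --fail -X POST "https://api.siteqwality.com/status_page/$STATUS_PAGE_ID/incident" \
    -H "Authorization: Bearer $SITEQWALITY_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$1"
}
```

To use it, build the body first, for example `body=$(build_incident_payload "$STATUS_PAGE_ID" "Some webhook deliveries delayed" minor "Investigating." "$JOB_ID_1" "$JOB_ID_2")`, then pass it to `create_status_page_incident "$body"`. Calling the builder with no monitor IDs produces an empty affected_http_job_ids array.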

Severity guidance for manual incidents

Use case	Severity
Documentation site down	`minor`
Login broken for a subset of users	`major`
Full outage, billing affected, security event	`critical`

Severity is mostly informational: it drives display order, color, and any custom filters. It doesn’t change escalation behavior on its own; to change who gets paged, use a different notification group.
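
A typo in the severity string is easy to make in a hand-written request, so a small guard before sending can help. This sketch assumes the three values in the table above are the only ones the API accepts, which this page does not confirm; is_known_severity is a hypothetical helper name.

```shell
# Succeed only for the severities listed in the table above.
# Assumption: minor, major, and critical are the full accepted set.
is_known_severity() {
  case "$1" in
    minor|major|critical) return 0 ;;
    *) return 1 ;;
  esac
}
```

A script could call `is_known_severity "$SEVERITY"` and stop with an error message when it returns non-zero, before any request is made.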

Following up

After resolving, especially for major and critical incidents:

Run a postmortem outside SiteQwality.
Link the postmortem doc in a final incident update so it’s preserved on the timeline.
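
The final update can be scripted in the same style as the create requests. The endpoint for posting an incident update is not shown on this page, so this sketch takes the full URL as an argument rather than guessing it. The message field is also an assumption borrowed from the create requests, and both helper names are hypothetical. Check the API reference before relying on either.

```shell
# Build the body of a final update that links the postmortem document.
# Assumption: updates accept a "message" field like the create requests do.
# The URL is inserted verbatim, so it must not contain double quotes.
build_postmortem_update() {
  printf '{"message":"Postmortem: %s"}\n' "$1"
}

# Post a prepared body to an update endpoint.
# $1: full URL of the update endpoint (not documented on this page)
# $2: JSON body
post_incident_update() {
  curl -sS --fail -X POST "$1" \
    -H "Authorization: Bearer $SITEQWALITY_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$2"
}
```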