When should I create an incident vs. just answering inbound questions?
The short answer: as soon as you’re confident an issue is real and could affect more than one customer, create the incident. The cost of a public incident is much lower than the cost of customers feeling left in the dark.
The simple test
Ask yourself: Will I have to give the same answer to more than one customer?
If yes — create the incident. Posting a public update is faster than copy-pasting a status into ten separate replies, and customers get the same information at the same time without having to ask.
If no — handle it inside the affected conversation.
When NOT to create an incident
Plenty of issues don’t warrant one:
- A single customer’s account-specific problem — wrong permissions, missing data, billing dispute. These are conversations, not incidents.
- A bug you’ve confirmed only affects edge cases that haven’t surfaced publicly. Fix it, don’t broadcast it.
- An issue you’re not yet sure is real — wait until you have enough evidence to write a credible title. Posting “we are investigating reports of an issue” when you don’t yet know if the issue exists creates panic.
- An internal problem the customer can’t see — an internal tool is down, but customer-facing functionality is fine. Track it internally.
When you should definitely create one
- Anything visible from outside — slowness, errors, missing features.
- Anything that’s caused at least one inbound support conversation — if one customer asked, others will.
- Planned maintenance windows that will take a customer-facing feature offline. Create with status Monitoring or Identified, mention the duration, and set the affected components to maintenance.
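The criteria in the two lists above can be condensed into a small decision function. This is only a sketch under stated assumptions: the `Issue` fields and the `should_create_incident` name are hypothetical, chosen to mirror the wording of this article, and do not correspond to any particular status page API.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    """Hypothetical description of an issue; field names are assumptions."""
    confirmed: bool                    # enough evidence to write a credible title
    customer_visible: bool             # slowness, errors, or missing features seen from outside
    account_specific: bool = False     # permissions, missing data, billing dispute for one customer
    inbound_conversations: int = 0     # support conversations the issue has caused so far
    planned_maintenance: bool = False  # will take a customer-facing feature offline


def should_create_incident(issue: Issue) -> bool:
    """Return True when the criteria in this article point to a public incident."""
    # Planned maintenance on a customer-facing feature always gets an incident.
    if issue.planned_maintenance:
        return True
    # Not yet sure the issue is real: wait for more evidence.
    if not issue.confirmed:
        return False
    # A single customer's account-specific problem is a conversation, not an incident.
    if issue.account_specific:
        return False
    # Visible from outside, or already caused at least one inbound conversation.
    return issue.customer_visible or issue.inbound_conversations >= 1
```

For example, `should_create_incident(Issue(confirmed=True, customer_visible=False))` returns `False`, which matches the "track it internally" case for an internal tool outage.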
The half-resolved case
A common pattern: an issue occurs, your team fixes it within 5 minutes, and it was never visible enough for customers to notice. Should you backfill an incident?
Default to yes when you’re unsure. A short retrospective incident — “Brief outage of the X service from 14:02 to 14:07 UTC. Root cause: Y. Resolved.” — signals operational maturity to customers reading your status page history. It costs almost nothing to create.
The exception: completely internal hiccups, such as a deploy that auto-rolled back without affecting anything customer-facing. Those need no incident.
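If you backfill incidents regularly, a small helper can keep the retrospective wording consistent. The sketch below reproduces the example sentence from this section; the function name `retrospective_summary` and its parameters are hypothetical, and it assumes you pass timezone-aware timestamps.

```python
from datetime import datetime, timezone


def retrospective_summary(service: str, start: datetime, end: datetime, root_cause: str) -> str:
    """Build a one-line retrospective incident description (hypothetical helper)."""
    # Require timezone-aware datetimes so the conversion to UTC is unambiguous.
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware")
    if end < start:
        raise ValueError("end must not be earlier than start")
    # Normalize both timestamps to UTC before formatting.
    start_utc = start.astimezone(timezone.utc)
    end_utc = end.astimezone(timezone.utc)
    return (
        f"Brief outage of the {service} service "
        f"from {start_utc:%H:%M} to {end_utc:%H:%M} UTC. "
        f"Root cause: {root_cause}. Resolved."
    )
```

Formatting in UTC matches the example above and avoids ambiguity for customers in different time zones.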
What about minor degradations?
If a component is degraded for an extended period but everything still works, your call:
- Set the component status to degraded without creating an incident. The status page shows the degradation; customers who look will see it.
- Create an incident if you want subscribers to be notified and if you’ll be posting updates as you work through it.
Most teams err on the side of creating the incident for anything lasting more than ~10 minutes.
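The choice for minor degradations can be written down as a simple rule. This is a sketch only: the function name, the returned labels, and the exact 10-minute cutoff are assumptions drawn from the rule of thumb above, so adjust them to your team's policy.

```python
from datetime import timedelta

# Assumption: the ~10 minute rule of thumb from this section; tune for your team.
INCIDENT_THRESHOLD = timedelta(minutes=10)


def degradation_action(duration: timedelta, notify_subscribers: bool, will_post_updates: bool) -> str:
    """Return "create_incident" or "set_component_degraded" (hypothetical labels)."""
    # You want subscribers notified and plan to post updates: create the incident.
    if notify_subscribers and will_post_updates:
        return "create_incident"
    # Degradation has lasted longer than the threshold: err on the side of an incident.
    if duration > INCIDENT_THRESHOLD:
        return "create_incident"
    # Otherwise the component status alone is enough.
    return "set_component_degraded"
```

With these assumptions, a 3-minute degradation with no planned updates only changes the component status, while a 15-minute one becomes an incident.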