Admin Alerts
UTM writes structured error rows to the audit log whenever a critical trading function fails. The audit log is the forensic record. Admin alerts are the page: when an error row of a known shape lands, UTM emails the admin recipient list and posts a webhook so an operator hears about the problem before the next daily audit.
This page describes what fires an alert, how to configure the recipients, and how the dedup window stops a burst of failures from paging repeatedly.
When an alert fires
Four alert types map to four call sites. Each alert carries a type,
severity (HIGH or CRITICAL), a one-line title, a longer
message, and a details object with the entity ids relevant to the
failure.
| Type | Severity | Trigger |
|---|---|---|
| AUTO_CLOSE_FAILURE | CRITICAL | The auto-close worker could not place a close order for a tracked trade at market close, or the post-close discrepancy check found a position that did not flatten after the settle window. |
| SYNC_FAILURE | HIGH | The periodic sync worker failed an entire user iteration (not a transient single-account flake). |
| ORDER_FAILURE | HIGH | A non-retryable broker rejection landed an order in FAILED. Normal order lifecycle transitions to CANCELLED or EXPIRED do not trigger this. |
| CREDENTIAL_ERROR | HIGH | A broker account transitioned to credentialStatus=invalid after the circuit-breaker threshold (three consecutive 401s). |
The details object is always populated with the relevant ids
(tradeId, strategyId, accountId, brokerId, userId, orderId)
so a webhook recipient can deep-link straight to the affected entity.
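The alert shape described above can be pictured as a small data structure. This is a minimal sketch rather than UTM's actual model: the class name `AdminAlert` and the choice to derive severity from the type are assumptions made for illustration, based only on the table showing one fixed severity per type.

```python
from dataclasses import dataclass, field

# Severity per alert type, as listed in the table above.
SEVERITY_BY_TYPE = {
    "AUTO_CLOSE_FAILURE": "CRITICAL",
    "SYNC_FAILURE": "HIGH",
    "ORDER_FAILURE": "HIGH",
    "CREDENTIAL_ERROR": "HIGH",
}


@dataclass
class AdminAlert:  # hypothetical name, not UTM's real class
    type: str
    title: str      # one-line title
    message: str    # longer human-readable message
    # Entity ids relevant to the failure (tradeId, accountId, userId, ...).
    details: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject anything outside the four documented alert types.
        if self.type not in SEVERITY_BY_TYPE:
            raise ValueError(f"unknown alert type: {self.type}")

    @property
    def severity(self) -> str:
        # Assumption: severity is fully determined by the alert type.
        return SEVERITY_BY_TYPE[self.type]
```

For example, `AdminAlert("ORDER_FAILURE", "Order rejected", "...", {"orderId": "..."})` would report a severity of HIGH.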
Configuration
Recipients and the webhook URL live in SystemSetting so they can be
changed from the admin Monitoring UI without a redeploy. They are not
environment variables.
Three keys drive the system:
| Key | Type | Default | Purpose |
|---|---|---|---|
| admin_alert.enabled | boolean | false | Master switch. When false, dispatch is a silent no-op regardless of the values below. |
| admin_alert.email_recipients | string | "" | Comma-separated email addresses. Leave empty to disable email dispatch. |
| admin_alert.webhook_url | string | "" | Single webhook URL. Slack, Discord, and PagerDuty all accept the JSON body that UTM posts. Leave empty to disable webhook dispatch. |
Set the recipients first, then flip admin_alert.enabled=true. The
default false prevents the system from blasting an unconfigured
mailbox the moment the migration runs.
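To make the interaction between the three keys concrete, here is a rough sketch of how they might be resolved into a dispatch decision. The `AlertConfig` class, the `load_alert_config` function, and the plain dict standing in for SystemSetting are all hypothetical; only the key names, types, and defaults come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertConfig:  # hypothetical container for the three keys
    enabled: bool
    email_recipients: list
    webhook_url: str

    @property
    def can_dispatch(self) -> bool:
        # Dispatch needs the master switch on AND at least one channel set.
        return self.enabled and bool(self.email_recipients or self.webhook_url)


def load_alert_config(settings: dict) -> AlertConfig:
    """Build the config from a key/value view of SystemSetting (assumed shape)."""
    raw = str(settings.get("admin_alert.email_recipients", ""))
    # Comma-separated list; blank entries and stray whitespace are dropped.
    recipients = [addr.strip() for addr in raw.split(",") if addr.strip()]
    return AlertConfig(
        enabled=settings.get("admin_alert.enabled", False) is True,
        email_recipients=recipients,
        webhook_url=str(settings.get("admin_alert.webhook_url", "")).strip(),
    )
```

With the defaults (`false`, `""`, `""`), `can_dispatch` is false, which matches the no-op behaviour described for a fresh install.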
Email dispatch
Email goes through the same notification email path used elsewhere in
UTM (SMTP or Resend, whichever is configured under Email
Configuration). Each recipient gets one email per alert. The subject
starts with [UTM CRITICAL] or [UTM HIGH], followed by the alert type
and title.
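As an illustration of the subject format and the one-email-per-recipient rule, consider the sketch below. The function names and the injected `send` callable are hypothetical, and the exact separator between the alert type and the title is an assumption; only the bracketed prefix is taken from the text.

```python
from typing import Callable, Iterable


def build_subject(severity: str, alert_type: str, title: str) -> str:
    # "[UTM <SEVERITY>]" prefix is documented; the ": " separator is assumed.
    return f"[UTM {severity}] {alert_type}: {title}"


def send_alert_emails(
    recipients: Iterable[str],
    subject: str,
    body: str,
    send: Callable[[str, str, str], None],  # hypothetical (to, subject, body) hook
) -> int:
    """Send one email per recipient and return how many sends were attempted."""
    count = 0
    for address in recipients:
        send(address, subject, body)
        count += 1
    return count
```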
Webhook dispatch
Body is a JSON object:
{
  "type": "AUTO_CLOSE_FAILURE",
  "severity": "CRITICAL",
  "title": "Auto-close failed: AAPL",
  "message": "Worker could not place close order for trade ...",
  "details": {
    "tradeId": "...",
    "strategyId": "...",
    "errorMessage": "..."
  },
  "timestamp": "2026-05-19T20:00:00.000Z"
}
A 5xx response triggers one retry. A 4xx response is treated as
non-retryable. When both attempts fail (or a 4xx lands), UTM writes an
alert.webhook_failed row to the audit log at severity=error with
the full payload so the alert is never fully lost.
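The retry rule can be sketched as a short loop. This is an illustration under stated assumptions, not UTM's dispatcher: `post` and `audit` are hypothetical injected hooks, and the handling of responses that are neither 2xx, 4xx, nor 5xx is not described above, so this sketch simply does not retry them.

```python
import json
from typing import Callable


def dispatch_webhook(
    url: str,
    payload: dict,
    post: Callable[[str, str], int],   # hypothetical: returns the HTTP status
    audit: Callable[[dict], None],     # hypothetical audit-log writer
) -> bool:
    """Post the alert, retrying once on 5xx. Returns True on a 2xx response."""
    body = json.dumps(payload)
    for _attempt in range(2):  # first attempt plus one retry
        status = post(url, body)
        if 200 <= status < 300:
            return True
        if status < 500:
            # 4xx is non-retryable (other non-5xx codes: assumed the same).
            break
    # Both attempts failed, or a non-retryable response landed: keep the
    # full payload in the audit log so the alert is not lost.
    audit({
        "action": "alert.webhook_failed",
        "severity": "error",
        "metadata": {"payload": payload},
    })
    return False
```

Passing the transport in as a callable keeps the retry policy testable without a network connection.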
Send test email
The Admin Alerts card has a "Send test email" button next to the email recipients field. It exercises the same email path a live page uses, so you can confirm deliverability for a specific address without waiting for a real failure to fire.
- Type a single address into the recipients box and click "Send test email" to target just that address. Leave the box on the saved list (or type a comma-separated list and save it first) to test every configured recipient.
- The result shows per recipient: delivered with the provider message id, or failed with the provider error. The message id is the handle you use to look the send up in the Resend dashboard when chasing a provider-side drop.
- Every attempt persists one alert.test_email row to the audit log: severity=info on success, severity=error on failure, with the recipient, message id, and error in metadata. This is the in-product delivery evidence, so a one-off miss is reproducible after the fact.
The endpoint is POST /api/v1/admin/alerts/test and requires the
admin:write scope. The body is an optional { "email": "..." }; when
omitted the configured admin_alert.email_recipients list is used. The
response is a recipients array of per-recipient results. Email
delivery uses the active provider (SMTP or Resend) configured under
Email Configuration, so configure a provider first.
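A call to this endpoint might be assembled as follows. The path, method, and optional `email` field come from the description above; the base URL, the bearer-token header, and the function name are assumptions for illustration, since the authentication scheme is not documented here.

```python
import json
import urllib.request
from typing import Optional


def build_test_email_request(
    base_url: str, token: str, email: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a POST to the test-email endpoint."""
    headers = {"Authorization": f"Bearer {token}"}  # assumed auth scheme
    data = None
    if email is not None:
        # Target a single address; omit the body to use the configured list.
        data = json.dumps({"email": email}).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/admin/alerts/test",
        data=data,
        headers=headers,
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen(req)`; the response body is expected to contain the `recipients` array described above, whose per-entry fields are not assumed here.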
Dispatched audit row
Every successful dispatch writes one alert.dispatched row to the
audit log at severity=error under category=admin. This is the
error-tier trail the daily audit grades, so an out-of-band page is no
longer invisible to the audit's error sections. The row carries the
alert type, severity, title, dedupKey, and full details in
its metadata, and links the affected userId when one is present.
The write has its own guard, so a logging failure can never break the
page itself.
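The guarded write could look roughly like this. The function name, the dict-shaped alert, and the `write_audit` hook are hypothetical; the row fields mirror the ones listed above.

```python
from typing import Callable


def record_dispatched(
    alert: dict, dedup_key: str, write_audit: Callable[[dict], None]
) -> bool:
    """Write the alert.dispatched row; never raise into the dispatch path."""
    details = alert.get("details", {})
    row = {
        "action": "alert.dispatched",
        "category": "admin",
        "severity": "error",
        "userId": details.get("userId"),  # linked only when present
        "metadata": {
            "type": alert["type"],
            "severity": alert["severity"],
            "title": alert["title"],
            "dedupKey": dedup_key,
            "details": details,
        },
    }
    try:
        write_audit(row)
    except Exception:
        # Guard: a logging failure must not break the page itself.
        return False
    return True
```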
Naming the affected user
When the details carry a userId, UTM resolves it to an email and
folds an ownerEmail field into the details before dispatch, so the
email body, the webhook payload, and the dispatched audit row all name
the person rather than showing a bare UUID. The lookup is best-effort:
a missing user or a database blip leaves the alert untouched and the
page still fires.
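A best-effort enrichment of this kind can be sketched in a few lines. The function name and the `lookup_email` callable are hypothetical stand-ins for whatever user lookup UTM actually performs.

```python
from typing import Callable, Optional


def with_owner_email(
    details: dict, lookup_email: Callable[[str], Optional[str]]
) -> dict:
    """Return details with ownerEmail added when the user can be resolved."""
    user_id = details.get("userId")
    if not user_id:
        return details  # nothing to resolve
    try:
        email = lookup_email(user_id)
    except Exception:
        return details  # database blip: leave the alert untouched
    if not email:
        return details  # missing user: leave the alert untouched
    # Copy rather than mutate, so the caller's dict is unchanged.
    return {**details, "ownerEmail": email}
```

Every failure path returns the original details, so the lookup can never prevent the alert from being sent.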
Dedup
A repeat alert for the same (type, entityId) pair is suppressed for 15
minutes after a successful dispatch. In practice, a single stuck trade
pages once at 4:01 pm rather than 100 times as the auto-close worker keeps retrying.
Dedup state is per-process and in-memory by design. A process restart clears the window, which is desirable: a restart often means an operator is already paying attention, and one extra page after a restart with the failure still present is safer than swallowing the signal.
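An in-memory dedup window of this kind can be sketched as a dictionary keyed by (type, entityId). The class name, method names, and injectable clock are assumptions for illustration, and so is the boundary behaviour (an alert exactly 15 minutes after the last dispatch is allowed through).

```python
import time
from typing import Callable, Dict, Tuple


class DedupWindow:  # hypothetical name
    def __init__(
        self,
        window_seconds: float = 15 * 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._window = window_seconds
        self._clock = clock
        # Plain in-process state: lost on restart, which clears the window.
        self._last_sent: Dict[Tuple[str, str], float] = {}

    def is_suppressed(self, alert_type: str, entity_id: str) -> bool:
        sent_at = self._last_sent.get((alert_type, entity_id))
        return sent_at is not None and self._clock() - sent_at < self._window

    def mark_dispatched(self, alert_type: str, entity_id: str) -> None:
        # Call only after a successful dispatch, so failed sends can re-fire.
        self._last_sent[(alert_type, entity_id)] = self._clock()
```

A caller would check `is_suppressed` before dispatching and call `mark_dispatched` only once the dispatch succeeded.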
Disabled or unconfigured
When admin_alert.enabled=false, or when both email_recipients and
webhook_url are empty, dispatch is a no-op. UTM writes one
alert.skipped row to the audit log per process so the silence is
visible during triage. Repeated skips in the same process do not
re-emit the row.
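The once-per-process behaviour amounts to a flag checked before the audit write. In this sketch the class name, the `reason` field, and the `audit` hook are hypothetical; the action name and category follow the rest of this page.

```python
from typing import Callable


class SkipReporter:  # hypothetical name
    def __init__(self, audit: Callable[[dict], None]) -> None:
        self._audit = audit
        self._reported = False  # per-process flag, reset on restart

    def report_skip(self, reason: str) -> bool:
        """Write alert.skipped once; return True only when a row was written."""
        if self._reported:
            return False  # later skips in this process stay silent
        self._reported = True
        self._audit({
            "action": "alert.skipped",
            "category": "admin",
            "metadata": {"reason": reason},  # assumed metadata field
        })
        return True
```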
Reading the audit log
Both the dispatch path and the fallbacks land in the audit log under
category=admin. Useful filters:
| Action | Meaning |
|---|---|
| alert.dispatched | An alert was sent. Written at severity=error with the full alert metadata. |
| alert.skipped | The system was disabled or unconfigured when an alert tried to dispatch. |
| alert.webhook_failed | The webhook failed both attempts. Body is in metadata. |
| alert.test_email | A test email was sent from the admin tool. info on success, error on failure; recipient, message id, and error are in metadata. |
What this is not
This system is the operator page for trading-critical failures. It is not a replacement for:
- Per-user notifications (those still fire to the affected user via the in-app and push channels).
- Production observability (Sentry, OpenTelemetry traces, log aggregation).
- An on-call rotation. The webhook URL is a single endpoint; PagerDuty or similar handles rotation downstream.