Running & Interpreting Health Checks
Health checks monitor the state of AppProfileSafe's infrastructure subsystems. They can be run on demand from the Dashboard and produce a JSON snapshot that is persisted to the report folder for historical comparison. Each check returns a status level and a detail message, and pipeline metrics are included in every snapshot.
How to Run Health Checks
- On the Dashboard, locate the Health status tile.
- Click Diagnose to open the "System Health" window.
- The window displays the overall status and a table of all check results (Name, Status, Message).
- Optionally, click Export Diagnostics to create a diagnostics bundle that includes the health snapshot.
Health checks are run each time the System Health window is opened. The results are also included in the diagnostics bundle under health/health-current.json.
Available Checks
| Check Name | What It Verifies | Ok | Warn | Fail |
|---|---|---|---|---|
| DiskSpace | Free disk space on the drive used by AppProfileSafe | ≥ 5 GB free | 1 GB to < 5 GB free | < 1 GB free |
| EventQueue | Event pipeline queue and dead-letter state | No dead-letters, ≤ 100 pending | Dead-letters present or > 100 pending | — |
| SIEM | Network connectivity to the configured SIEM server | TCP/TLS connection successful | — | Connection failed (DNS, timeout, TLS error) |
| Webhook | Reachability of all configured webhook endpoints (HTTP HEAD) | All endpoints reachable | Some endpoints unreachable | All endpoints unreachable |
When a subsystem is not configured (e.g. SIEM inactive, no webhook endpoints), the check returns Skip with a descriptive message instead of triggering a failure.
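The DiskSpace and EventQueue thresholds from the table can be sketched as two small classification functions. This is only an illustration of the documented thresholds, not AppProfileSafe's actual code: the function names are hypothetical, and treating 1 GB as 1024³ bytes is an assumption.

```python
import shutil

# Assumption: "GB" means binary gigabytes (1024**3 bytes).
GB = 1024 ** 3


def classify_disk_space(free_bytes: int) -> str:
    """Map free disk space to a status using the documented thresholds."""
    if free_bytes >= 5 * GB:
        return "Ok"
    if free_bytes >= 1 * GB:
        return "Warn"
    return "Fail"


def classify_event_queue(pending: int, dead_lettered: int) -> str:
    """Map queue state to a status; this check has no Fail level."""
    if dead_lettered > 0 or pending > 100:
        return "Warn"
    return "Ok"


if __name__ == "__main__":
    # Example: classify the drive that holds the current directory.
    free = shutil.disk_usage(".").free
    print("DiskSpace:", classify_disk_space(free))
    print("EventQueue:", classify_event_queue(pending=12, dead_lettered=0))
```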
Status Levels
Each individual check returns one of these statuses:
| Status | Meaning |
|---|---|
| Ok | Check passed, subsystem is healthy |
| Warn | Check passed but with concerns; action may be needed |
| Fail | Check failed; subsystem is in an unhealthy state |
| Skip | Subsystem not configured; check was not executed |
The overall status is computed from all checks: Healthy (all checks Ok or Skip), Degraded (at least one Warn and no Fail), or Unhealthy (at least one Fail). If a check throws an unexpected exception, the exception is caught and the check is reported as Fail with the exception message.
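The aggregation rule and the exception handling described above could look roughly like the following sketch. The names `run_check` and `overall_status` are hypothetical, and the sketch assumes that a single Fail outranks any number of Warn results.

```python
from typing import Callable, Iterable, Tuple

CheckResult = Tuple[str, str, str]  # (name, status, message)


def run_check(name: str, check: Callable[[], Tuple[str, str]]) -> CheckResult:
    """Run one check; an unexpected exception becomes a Fail result."""
    try:
        status, message = check()
    except Exception as exc:  # deliberately broad: no check may crash the run
        status, message = "Fail", str(exc)
    return name, status, message


def overall_status(statuses: Iterable[str]) -> str:
    """Combine individual statuses (assumption: Fail outranks Warn)."""
    seen = set(statuses)
    if "Fail" in seen:
        return "Unhealthy"
    if "Warn" in seen:
        return "Degraded"
    return "Healthy"  # only Ok and Skip remain
```

For example, a run that yields Ok, Skip, and Warn would be reported as Degraded, while adding a single Fail turns the overall result into Unhealthy.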
SIEM Health Check Details
When SIEM is active, the check opens a real TCP connection to the configured host and port. If TLS is enabled, it also validates the TLS handshake and reports the remote certificate's expiration date and the negotiated TLS protocol version. The check reports the connection latency in milliseconds.
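A comparable probe can be run by hand when troubleshooting a SIEM connection. The sketch below uses Python's standard `socket` and `ssl` modules. It is not the product's implementation: the function name `probe_siem`, the result keys, and the 5-second default timeout are assumptions made for illustration.

```python
import socket
import ssl
import time


def probe_siem(host: str, port: int, use_tls: bool = False,
               timeout: float = 5.0) -> dict:
    """Open a TCP (optionally TLS) connection and report the outcome.

    Assumption: the timeout value is illustrative; the documentation does
    not state the timeout used by the SIEM check.
    """
    details = {}
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            if use_tls:
                context = ssl.create_default_context()  # verifies the certificate
                with context.wrap_socket(sock, server_hostname=host) as tls:
                    cert = tls.getpeercert()
                    details["tls_version"] = tls.version()
                    details["cert_not_after"] = cert.get("notAfter")
            # Latency covers TCP connect plus the TLS handshake, if any.
            latency_ms = (time.perf_counter() - start) * 1000
    except OSError as exc:
        # DNS errors, timeouts, refused connections, and TLS errors
        # are all subclasses of OSError.
        return {"status": "Fail", "message": str(exc)}
    return {"status": "Ok", "latency_ms": round(latency_ms, 1), **details}
```

Calling `probe_siem("siem.example.com", 6514, use_tls=True)` (a placeholder host and port) returns a dictionary with either the latency and TLS details or the failure reason.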
Webhook Health Check Details
The webhook check sends an HTTP HEAD request (with a 5-second timeout) to each configured endpoint. It reports the number of reachable vs. unreachable endpoints. Individual endpoint failures include the URL and either the HTTP status code or the exception message.
Pipeline Metrics
Every health snapshot includes aggregate pipeline metrics alongside the check results:
| Metric | Description |
|---|---|
| EventsDispatched | Total events that have entered the pipeline (pending + delivered + dead-lettered) |
| EventsDelivered | Events successfully delivered to at least one sink |
| EventsFailed | Events with permanent sink failures |
| EventsDeadLettered | Events currently in the dead-letter queue |
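Because EventsDispatched is defined as pending + delivered + dead-lettered, the number of pending events can be derived from the other counters. The short sketch below shows that arithmetic. It assumes the metrics are available as a dictionary keyed by the metric names above and that all counters were captured at the same moment; the function name is hypothetical.

```python
def pending_events(metrics: dict) -> int:
    """Derive pending events from the definition of EventsDispatched."""
    return (metrics["EventsDispatched"]
            - metrics["EventsDelivered"]
            - metrics["EventsDeadLettered"])


# Illustrative values only, not taken from a real snapshot.
example = {
    "EventsDispatched": 120,
    "EventsDelivered": 100,
    "EventsFailed": 3,
    "EventsDeadLettered": 5,
}
print(pending_events(example))  # 120 - 100 - 5 = 15
```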
Health Snapshot File
Each time health checks are run, the results are persisted as a JSON file in the report folder: health_YYYYMMDD_HHmmss.json. The snapshot includes the schema version, generation timestamp, machine context (MachineName, UserName, Domain), all check results with latencies, the overall status, and pipeline metrics. These files are also included in the diagnostics bundle.
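For historical comparison, the snapshot files can be ordered by the timestamp embedded in their names. The following sketch relies only on the documented file name pattern `health_YYYYMMDD_HHmmss.json`. The helper names are hypothetical, and whether the timestamp is local time or UTC is not stated, so the parsed value carries no time zone.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import List, Optional, Tuple

# Matches health_YYYYMMDD_HHmmss.json and captures the timestamp part.
SNAPSHOT_PATTERN = re.compile(r"^health_(\d{8}_\d{6})\.json$")


def snapshot_timestamp(filename: str) -> Optional[datetime]:
    """Parse the timestamp from a snapshot file name, or return None."""
    match = SNAPSHOT_PATTERN.match(filename)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")
    except ValueError:  # digits present but not a valid date or time
        return None


def list_snapshots(report_folder: str) -> List[Tuple[datetime, Path]]:
    """Return (timestamp, path) pairs for all snapshots, oldest first."""
    found = []
    for path in Path(report_folder).iterdir():
        stamp = snapshot_timestamp(path.name)
        if stamp is not None:
            found.append((stamp, path))
    return sorted(found, key=lambda item: item[0])
```

The last element of `list_snapshots(folder)` is then the most recent snapshot, which can be loaded with `json.load` and compared with an earlier one.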