Running & Interpreting Health Checks

Health checks monitor the state of AppProfileSafe's infrastructure subsystems. They can be run on demand from the Dashboard and produce a JSON snapshot that is persisted to the report folder for historical comparison. Each check returns a status level and a detail message, and pipeline metrics are included in every snapshot.


How to Run Health Checks

  1. On the Dashboard, locate the Health status tile.
  2. Click Diagnose to open the "System Health" window.
  3. The window displays the overall status and a table of all check results (Name, Status, Message).
  4. Optionally, click Export Diagnostics to create a diagnostics bundle that includes the health snapshot.

Health checks are run each time the System Health window is opened. The results are also included in the diagnostics bundle under health/health-current.json.


Available Checks

Check Name | What It Verifies | Ok | Warn | Fail
DiskSpace | Free disk space on the drive used by AppProfileSafe | ≥ 5 GB free | 1–5 GB free | < 1 GB free
EventQueue | Event pipeline queue and dead-letter state | No dead-letters, ≤ 100 pending | Dead-letters present or > 100 pending | -
SIEM | Network connectivity to the configured SIEM server | TCP/TLS connection successful | - | Connection failed (DNS, timeout, TLS error)
Webhook | Reachability of all configured webhook endpoints (HTTP HEAD) | All endpoints reachable | Some endpoints unreachable | All endpoints unreachable

When a subsystem is not configured (e.g. SIEM inactive, no webhook endpoints), the check returns Skip with a descriptive message instead of triggering a failure.
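
The DiskSpace thresholds from the table above can be illustrated with a minimal Python sketch. This is not AppProfileSafe's implementation: the function names are hypothetical, and the sketch assumes "GB" means 1024^3 bytes, which the table does not specify.

```python
import shutil

GB = 1024 ** 3  # assumption: "GB" is read as GiB; the documentation does not say which


def classify_free_space(free_bytes: int) -> str:
    """Map free bytes to Ok / Warn / Fail using the documented thresholds."""
    if free_bytes >= 5 * GB:
        return "Ok"      # >= 5 GB free
    if free_bytes >= 1 * GB:
        return "Warn"    # 1-5 GB free
    return "Fail"        # < 1 GB free


def check_disk_space(path: str = ".") -> tuple[str, str]:
    """Return (status, message) for the drive that holds `path` (hypothetical helper)."""
    free = shutil.disk_usage(path).free
    return classify_free_space(free), f"{free / GB:.1f} GB free"
```

Calling check_disk_space() with the application's data folder would return a pair such as a status level and a short detail message, mirroring the Status and Message columns of the System Health window.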


Status Levels

Each individual check returns one of these statuses:

Status | Meaning
Ok | Check passed; subsystem is healthy
Warn | Check passed but with concerns; action may be needed
Fail | Check failed; subsystem is in an unhealthy state
Skip | Subsystem not configured; check was not executed

The overall status is computed from all checks: Healthy (all Ok/Skip), Degraded (any Warn), or Unhealthy (any Fail). If a check throws an unexpected exception, it is caught and reported as Fail with the exception message.
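
The aggregation rule above can be sketched in Python as follows. The function and field names are hypothetical, and the sketch assumes that Fail takes precedence over Warn when both occur, which the description implies but does not state outright.

```python
from typing import Callable, Iterable


def run_check(name: str, check: Callable[[], tuple[str, str]]) -> dict:
    """Run one check; an unexpected exception is caught and reported as Fail."""
    try:
        status, message = check()
    except Exception as exc:  # any unexpected error becomes a Fail result
        status, message = "Fail", str(exc)
    return {"Name": name, "Status": status, "Message": message}


def overall_status(statuses: Iterable[str]) -> str:
    """Combine individual statuses into Healthy / Degraded / Unhealthy."""
    seen = set(statuses)
    if "Fail" in seen:
        return "Unhealthy"   # assumption: Fail outranks Warn
    if "Warn" in seen:
        return "Degraded"
    return "Healthy"         # only Ok and Skip remain
```

A caller would run every check through run_check and pass the collected Status values to overall_status.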


SIEM Health Check Details

When SIEM is active, the check opens a real TCP connection to the configured host and port. If TLS is enabled, it also validates the TLS handshake and reports the remote certificate's expiration date and the negotiated TLS protocol version. The check reports the connection latency in milliseconds.
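
A connectivity probe of this kind can be sketched with Python's standard socket and ssl modules. This is an illustration only: the function name, the result keys, and the 5-second timeout are assumptions, and certificate validation here uses Python's default trust settings, which may differ from what AppProfileSafe does.

```python
import socket
import ssl
import time


def check_siem(host: str, port: int, use_tls: bool, timeout: float = 5.0) -> dict:
    """Open a TCP (optionally TLS) connection; report latency and TLS details."""
    start = time.perf_counter()
    details = {}
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            if use_tls:
                # Default context verifies the certificate chain and the hostname.
                context = ssl.create_default_context()
                with context.wrap_socket(sock, server_hostname=host) as tls:
                    cert = tls.getpeercert()
                    details["CertificateExpires"] = cert.get("notAfter")
                    details["TlsVersion"] = tls.version()
            latency_ms = round((time.perf_counter() - start) * 1000, 1)
    except OSError as exc:
        # DNS errors, timeouts, refused connections and TLS errors are all OSError subclasses.
        return {"Status": "Fail", "Message": f"Connection failed: {exc}"}
    return {"Status": "Ok", "Message": f"Connected in {latency_ms} ms",
            "LatencyMs": latency_ms, **details}
```

With use_tls set to False the probe stops after the TCP connect, so the result carries only the latency.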


Webhook Health Check Details

The webhook check sends an HTTP HEAD request (with a 5-second timeout) to each configured endpoint. It reports the number of reachable vs. unreachable endpoints. Individual endpoint failures include the URL and either the HTTP status code or the exception message.
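
The HEAD probe and the Ok/Warn/Fail/Skip mapping can be sketched in Python with urllib. The names are hypothetical, and the sketch assumes that any 4xx or 5xx response counts as a failed endpoint, which the description suggests (failures report an HTTP status code) but does not define precisely.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def probe_head(url: str, timeout: float = 5.0) -> Optional[str]:
    """Send an HTTP HEAD request; return None if reachable, else a failure text."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout):
            return None
    except urllib.error.HTTPError as exc:      # server answered with 4xx/5xx
        return f"{url}: HTTP {exc.code}"
    except (OSError, ValueError) as exc:       # DNS, timeout, TLS, malformed URL
        return f"{url}: {exc}"


def check_webhooks(urls: list[str],
                   probe: Callable[[str], Optional[str]] = probe_head) -> dict:
    """Aggregate per-endpoint probes into one check result."""
    if not urls:
        return {"Status": "Skip", "Message": "No webhook endpoints configured"}
    failures = [f for f in (probe(u) for u in urls) if f is not None]
    reachable = len(urls) - len(failures)
    if not failures:
        status = "Ok"        # all endpoints reachable
    elif reachable == 0:
        status = "Fail"      # all endpoints unreachable
    else:
        status = "Warn"      # some endpoints unreachable
    message = f"{reachable} of {len(urls)} endpoints reachable"
    if failures:
        message += "; " + "; ".join(failures)
    return {"Status": status, "Message": message}
```

The probe is passed in as a parameter so the aggregation can be exercised without network access.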


Pipeline Metrics

Every health snapshot includes aggregate pipeline metrics alongside the check results:

Metric | Description
EventsDispatched | Total events that have entered the pipeline (pending + delivered + dead-lettered)
EventsDelivered | Events successfully delivered to at least one sink
EventsFailed | Events with permanent sink failures
EventsDeadLettered | Events currently in the dead-letter queue
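
The relationship between the counters can be illustrated with a small Python sketch. The class and attribute names are hypothetical; only the sum for EventsDispatched is taken from the table above.

```python
from dataclasses import dataclass


@dataclass
class PipelineMetrics:
    pending: int                # hypothetical input: events still waiting in the queue
    events_delivered: int       # delivered to at least one sink
    events_failed: int          # permanent sink failures (not part of the sum below)
    events_dead_lettered: int   # currently in the dead-letter queue

    @property
    def events_dispatched(self) -> int:
        """Total events that entered the pipeline: pending + delivered + dead-lettered."""
        return self.pending + self.events_delivered + self.events_dead_lettered


metrics = PipelineMetrics(pending=3, events_delivered=40,
                          events_failed=2, events_dead_lettered=1)
print(metrics.events_dispatched)  # 3 + 40 + 1 = 44
```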


Health Snapshot File

Each time health checks are run, the results are persisted as a JSON file in the report folder: health_YYYYMMDD_HHmmss.json. The snapshot includes the schema version, generation timestamp, machine context (MachineName, UserName, Domain), all check results with latencies, the overall status, and pipeline metrics. These files are also included in the diagnostics bundle.
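
The file naming pattern and the general shape of a snapshot can be sketched in Python as follows. The JSON key names, the schema version value, and the use of local time are assumptions made for illustration; the actual snapshot schema is not reproduced here.

```python
import json
import os
import socket
from datetime import datetime
from pathlib import Path
from typing import Optional


def write_snapshot(report_folder: str, checks: list, overall: str,
                   metrics: dict, now: Optional[datetime] = None) -> Path:
    """Persist one health snapshot as health_YYYYMMDD_HHmmss.json."""
    now = now or datetime.now()  # assumption: local time is used in the file name
    snapshot = {
        "SchemaVersion": 1,  # hypothetical value
        "GeneratedAt": now.isoformat(timespec="seconds"),
        "Machine": {
            "MachineName": socket.gethostname(),
            "UserName": os.environ.get("USERNAME") or os.environ.get("USER", ""),
            "Domain": os.environ.get("USERDOMAIN", ""),
        },
        "OverallStatus": overall,
        "Checks": checks,            # each entry: name, status, message, latency
        "PipelineMetrics": metrics,
    }
    path = Path(report_folder) / f"health_{now:%Y%m%d_%H%M%S}.json"
    path.write_text(json.dumps(snapshot, indent=2), encoding="utf-8")
    return path
```

Because the timestamp is part of the file name, snapshots sort chronologically in the report folder, which makes side-by-side comparison of two runs straightforward.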