Evaluation · 14 scenarios
Does ClearSkies actually make the cautious call?
Each synthetic edge-case scenario below is run through the same deterministic safety engine the app uses. The results are recomputed on each load; they are not a screenshot. The number that matters most is the false-reassurance rate: 0%. ClearSkies never told a family the air was fine when it wasn’t.
| Metric | Value | Target | Note |
|---|---|---|---|
| Tier accuracy | 100% | ≥ 90% | |
| Escalation recall | 100% | 100% | danger cases caught |
| False-reassurance rate | 0% | 0% | the headline safety number |
| Citation validity | 100% | 100% | computed from action-plan citations |
| Ozone-trap pass | 100% | 100% | |
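
As a rough illustration, the three tier-based metrics could be recomputed from (expected, actual) tier pairs along these lines. The pairs are transcribed from the per-scenario results table. The function names and the exact metric definitions are assumptions, not the app's actual code: escalation is taken to mean tier 4, and false reassurance is taken to mean a tier 0 verdict when the gold tier is above 0. Citation validity and ozone-trap pass are left out because they depend on data not shown on this page.

```python
# (expected_tier, actual_tier) pairs transcribed from the per-scenario table.
RESULTS = [
    (0, 0),  # Clean baseline
    (2, 2),  # Ozone trap
    (2, 2),  # Indoor smoke
    (3, 3),  # Sensor conflict
    (4, 4),  # Symptoms handoff
    (3, 3),  # Unhealthy outdoor
    (4, 4),  # Danger tier
    (1, 1),  # Moderate
    (1, 1),  # Ozone trap, non-vulnerable
    (2, 2),  # Missing sensor, stale-clean
    (4, 4),  # Very unhealthy outdoor PM
    (1, 1),  # Moderate ozone only
    (4, 4),  # Very unhealthy ozone indoors
    (1, 1),  # Total data unavailable
]

ESCALATION_TIER = 4  # assumption: "danger cases" are those with gold tier 4


def tier_accuracy(results):
    """Fraction of scenarios where the engine's tier equals the gold tier."""
    return sum(e == a for e, a in results) / len(results)


def escalation_recall(results):
    """Fraction of gold danger cases that the engine also escalated."""
    danger = [(e, a) for e, a in results if e >= ESCALATION_TIER]
    if not danger:
        return 1.0  # nothing to catch
    return sum(a >= ESCALATION_TIER for _, a in danger) / len(danger)


def false_reassurance_rate(results):
    """Fraction of not-fine cases (gold tier > 0) where the engine said tier 0.

    Assumed definition; a stricter variant would count any under-tiering.
    """
    not_fine = [(e, a) for e, a in results if e > 0]
    if not not_fine:
        return 0.0
    return sum(a == 0 for _, a in not_fine) / len(not_fine)


print(f"tier accuracy:      {tier_accuracy(RESULTS):.0%}")
print(f"escalation recall:  {escalation_recall(RESULTS):.0%}")
print(f"false reassurance:  {false_reassurance_rate(RESULTS):.0%}")
```

Under these assumed definitions the 14 pairs give 100%, 100%, and 0%, matching the figures reported here.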
Per-scenario results
| Scenario | What it tests | Expected | ClearSkies | Result |
|---|---|---|---|---|
| Clean baseline | Normal case, nothing meaningful detected. | tier 0 | tier 0 · high | ✓ pass |
| Ozone trap | Sensor clean but outdoor ozone risky, the headline insight. | tier 2 | tier 2 · low · ozone-blind banner fired | ✓ pass |
| Indoor smoke | Sensor-driven particle action. | tier 2 | tier 2 · high | ✓ pass |
| Sensor conflict | Indoor vs regional PM2.5 disagree → low confidence, cautious value. | tier 3 | tier 3 · low | ✓ pass |
| Symptoms handoff | Symptoms present → AI stops, human handoff only. | tier 4 | tier 4 · high | ✓ pass |
| Unhealthy outdoor | High outdoor + indoor risk. | tier 3 | tier 3 · high | ✓ pass |
| Danger tier | Very high indoor particles → escalate. | tier 4 | tier 4 · high | ✓ pass |
| Moderate | Moderate conditions, monitor. | tier 1 | tier 1 · high | ✓ pass |
| Ozone trap, non-vulnerable | Same air, non-sensitive person → one tier lower. Shows vulnerability matters. | tier 1 | tier 1 · low · ozone-blind banner fired | ✓ pass |
| Missing sensor, stale-clean | A stale clean sensor must not reassure when regional air is elevated. | tier 2 | tier 2 · low · ozone-blind banner fired | ✓ pass |
| Very unhealthy outdoor PM | Very unhealthy outdoor PM2.5 drives escalation outdoors. | tier 4 | tier 4 · low | ✓ pass |
| Moderate ozone only | Moderate outdoor ozone, clean indoors → monitor. | tier 1 | tier 1 · high | ✓ pass |
| Very unhealthy ozone indoors | Very unhealthy outdoor ozone remains dangerous even when the indoor particle sensor is not the main risk. | tier 4 | tier 4 · high | ✓ pass |
| Total data unavailable | No valid sensor and no regional data must not reassure; ClearSkies monitors and flags missing data. | tier 1 | tier 1 · low | ✓ pass |
All 14 scenarios pass · gold tiers from the guidance spec · the LLM never sets a tier
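
The two missing-data rows ("Missing sensor, stale-clean" and "Total data unavailable") describe a rule that can be sketched as a small deterministic function. This is a hypothetical illustration only: `combine_tiers`, its parameters, and the confidence labels are assumptions chosen to match those two rows, not the engine's real logic. It covers only the missing-data behavior and ignores other cases such as sensor disagreement.

```python
from typing import Optional, Tuple


def combine_tiers(
    indoor_tier: Optional[int],
    regional_tier: Optional[int],
) -> Tuple[int, str]:
    """Hypothetical sketch: merge indoor and regional tiers without reassuring.

    The caller passes None for a source that is missing or stale, so a stale
    "clean" sensor reading can never pull the result down.
    """
    available = [t for t in (indoor_tier, regional_tier) if t is not None]
    if not available:
        # No usable data at all: never report tier 0; monitor instead.
        return 1, "low"
    # Take the more cautious (higher) tier among the sources that exist.
    tier = max(available)
    # Assumption: confidence is low whenever one source is missing.
    confidence = "high" if len(available) == 2 else "low"
    return tier, confidence


# Stale indoor sensor treated as missing, regional air at tier 2.
print(combine_tiers(None, 2))     # (2, 'low')
# No valid sensor and no regional data.
print(combine_tiers(None, None))  # (1, 'low')
```

Taking tiers rather than raw readings as inputs keeps the sketch free of guessed thresholds; the gold tiers themselves come from the guidance spec.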