Failures by Category
Colored by severity. Citation and stale-source errors dominate.
Failure Heatmap
Failure counts by domain and category.
| Domain | Incorrect citation | Stale source | Unsupported claim | Conflicting documents | Partial context |
|---|---|---|---|---|---|
| Finance | 4 | 3 | 2 | 3 | 2 |
| Compliance | 3 | 4 | 4 | 2 | 1 |
| Security | 2 | 1 | 2 | 1 | 2 |
| Legal | 3 | 1 | 1 | 1 | 1 |
| HR | 1 | 2 | 0 | 0 | 0 |
| Support | 1 | 0 | 0 | 0 | 0 |
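The per-category and per-domain totals behind the heatmap can be recomputed directly from the table. The sketch below is only an illustration of that arithmetic; the variable names are hypothetical and the counts are copied from the rows above.

```python
# Failure counts copied from the heatmap table; names are illustrative only.
categories = [
    "Incorrect citation",
    "Stale source",
    "Unsupported claim",
    "Conflicting documents",
    "Partial context",
]
heatmap = {
    "Finance":    [4, 3, 2, 3, 2],
    "Compliance": [3, 4, 4, 2, 1],
    "Security":   [2, 1, 2, 1, 2],
    "Legal":      [3, 1, 1, 1, 1],
    "HR":         [1, 2, 0, 0, 0],
    "Support":    [1, 0, 0, 0, 0],
}

# Sum each column to get failures per category.
category_totals = {
    cat: sum(row[i] for row in heatmap.values())
    for i, cat in enumerate(categories)
}

# Sum each row to get failures per domain.
domain_totals = {domain: sum(row) for domain, row in heatmap.items()}

for cat, total in category_totals.items():
    print(f"{cat}: {total}")
```

The column sums for the first three categories come to 14, 11, and 9, matching the counts reported under Root Cause Insights.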
Root Cause Insights
The three highest-volume failure patterns and how to resolve them.
Incorrect citation
High · 14 failures · 23% of total
Why: Citations map to topically related but non-supporting chunks; no claim-to-evidence overlap check.
Fix: Require token/semantic overlap between claim and cited span before surfacing a citation.
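One possible shape for the token-overlap half of this fix is sketched below. It is a minimal illustration, not the system's actual check: the function names, the tokenization rule, and the 0.6 threshold are all assumptions, and a production check would likely add a semantic (embedding-based) comparison alongside it.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens, dropping words of two characters or fewer."""
    return {t for t in re.findall(r"[a-z0-9]+", text.lower()) if len(t) > 2}

def overlap_score(claim: str, cited_span: str) -> float:
    """Fraction of the claim's tokens that also appear in the cited span."""
    claim_tokens = tokens(claim)
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & tokens(cited_span)) / len(claim_tokens)

def should_surface_citation(claim: str, cited_span: str, threshold: float = 0.6) -> bool:
    """Surface the citation only if enough of the claim is covered by the span.

    The 0.6 threshold is an assumed starting point, not a tuned value.
    """
    return overlap_score(claim, cited_span) >= threshold

# Made-up example strings, for illustration only.
claim = "Flights over 6 hours may be booked in business class"
supporting = "Business class may be booked for flights over 6 hours."
unrelated = "Hotel stays are reimbursed up to the nightly cap."

print(should_surface_citation(claim, supporting))  # True
print(should_surface_citation(claim, unrelated))   # False
```

Pure token overlap will miss paraphrases and can pass a span that shares vocabulary while stating the opposite, so it is best treated as a cheap first gate rather than the whole check.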
Stale source
High · 11 failures · 18% of total
Why: Retired document versions (Travel v2.7, AI Governance v1.3) remain in the index and outrank current versions.
Fix: Add freshness metadata and down-weight or exclude superseded versions at retrieval time.
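A retrieval-time rerank along these lines might look like the following sketch. It assumes each retrieved chunk already carries a `superseded` flag as freshness metadata; the `Chunk` type, the `rerank_by_freshness` function, the 0.5 penalty, and the example scores are all hypothetical.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Chunk:
    doc_id: str
    version: str
    score: float       # retrieval relevance score (higher is better)
    superseded: bool   # assumed freshness metadata set at indexing time

def rerank_by_freshness(chunks, penalty=0.5, exclude=False):
    """Down-weight or drop chunks from superseded document versions.

    penalty: multiplier applied to the score of superseded chunks (assumed value).
    exclude: if True, superseded chunks are removed instead of down-weighted.
    """
    adjusted = []
    for chunk in chunks:
        if chunk.superseded:
            if exclude:
                continue
            chunk = replace(chunk, score=chunk.score * penalty)
        adjusted.append(chunk)
    return sorted(adjusted, key=lambda c: c.score, reverse=True)

# Illustrative scores only: the retired version initially outranks the current one.
retrieved = [
    Chunk("Employee Travel Policy", "v2.7", 0.9, superseded=True),
    Chunk("Employee Travel Policy", "v3.2", 0.8, superseded=False),
]
for chunk in rerank_by_freshness(retrieved):
    print(chunk.version, chunk.score)
```

Down-weighting keeps retired versions reachable for historical questions, while `exclude=True` is the stricter option when a stale answer is unacceptable.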
Unsupported claim
Critical · 9 failures · 15% of total
Why: Model compresses multi-condition policies into a single condition, producing confident but unsupported claims.
Fix: Enforce claim-level grounding and refuse partial answers on critical-risk policies.
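One way to read this fix is as a gate applied after each claim has been checked against its evidence. The sketch below assumes an upstream grounding step has already set a `supported` flag per claim; the `Claim` type, the `finalize_answer` function, the risk labels, and the refusal wording are hypothetical.

```python
from dataclasses import dataclass

REFUSAL = "I can't give a fully supported answer to this question from the current sources."

@dataclass
class Claim:
    text: str
    supported: bool  # assumed output of an upstream claim-to-evidence check

def finalize_answer(claims: list[Claim], risk: str) -> str:
    """Return only grounded claims; refuse outright on critical-risk policies
    if any claim lacks support, rather than returning a partial answer."""
    unsupported = [c for c in claims if not c.supported]
    if not unsupported:
        return " ".join(c.text for c in claims)
    if risk == "critical":
        return REFUSAL
    # Lower-risk policies: drop unsupported claims and keep the rest.
    return " ".join(c.text for c in claims if c.supported)

# Made-up claims, for illustration only.
claims = [
    Claim("Condition A must be met.", supported=True),
    Claim("No further approval is needed.", supported=False),
]
print(finalize_answer(claims, risk="critical"))
print(finalize_answer(claims, risk="medium"))
```

Whether lower-risk policies should return the supported subset or also refuse is a design choice; the sketch takes the more permissive option.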
Top Failing Documents
Source documents responsible for the most failures.
| Document | Version | Failed Queries | Dominant Failure Mode | Risk | Recommended Fix |
|---|---|---|---|---|---|
| Employee Travel Policy | v2.7 (retired) | 8 | Stale source conflicting with v3.2 | High | Remove retired version from the index or hard-filter by effective date. |
| AI Usage Governance Standard | v1.3 | 7 | Ambiguous multi-condition sections; unsupported claims | Critical | Revise ambiguous sections and add structured condition lists for grounding. |
| Finance Approval Matrix | v1.6 | 6 | Unclear approval thresholds across tiers | Critical | Clarify tier thresholds and chunk the matrix so each tier is independently retrievable. |
| Travel Policy Addendum | v1.4 | 4 | Overlapping reimbursement guidance with main policy | Medium | Merge the addendum into v3.2 or mark precedence explicitly. |