Number7AI — Docs
Extraction failure modes taxonomy
The critical question is not whether an intelligent document processing (IDP) system ever fails; it always does on some document. The question is whether failures are silent (wrong output posted as correct) or visible (flagged, routed, corrected). This taxonomy maps the major classes.
Last updated: April 2026
TL;DR
- Accounts payable (AP) extraction failures split into four classes: structural, contextual/numeric, identity, and boundary.
- The highest-risk failure mode is silent wrong output: numbers in the right position but assigned to the wrong semantic field.
- Vanity accuracy metrics (e.g. 99% headline accuracy) mask silent error risk. The ERP-safe pass rate is the number that matters.
- Good systems route every failure to auto-correction, a human review queue, or explicit rejection, never to silent posting.
Why we publish this taxonomy
No production system is 100% accurate on messy real-world PDFs. Publishing failure classes with observed residual ranges lets you size exception staffing, SLAs, and audit sampling before go-live, instead of discovering risk at month-end. Undetected silent failures are the category we architect against; everything below is designed to land as an auto-correction, a typed exception, or a reject-with-context.
Why silent failures matter more than obvious ones
When a system clearly fails — blank output, parse error, obvious garbage — a reviewer catches it immediately. The dangerous failures are confident wrong answers: unit price and quantity transposed, tax line dropped, vendor identity mismatched to a ghost record. These pass through automated queues and surface only at period close, audit, or when a vendor chases payment.
Designing for failure visibility is more important than headline accuracy. A system that flags its own uncertainty is safer than one that is confidently wrong.
Category 1 — Structural failures
The document’s visual structure does not match the extractor’s assumptions. These errors often look like “low OCR confidence” but are actually wrong table geometry.
- 1a
Multi-row line items
A single product entry spans two or more rows (long descriptions, or code and description split across rows).
- Typical generic IDP failure
- Each row is treated as a separate line item. The continuation row becomes a phantom line—often blank quantity/price—doubling line count and corrupting subtotals.
- Handling principle
- Row-continuation detection: rows without numeric closure merge upward before line extraction. Uncertain merges are confidence-flagged.
- Observed residual
- ~1.5% residual when product descriptions exceed ~60 characters.
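A minimal sketch of the row-continuation idea in 1a, assuming rows arrive as already-parsed objects. The `Row` type, the `needs_review` flag, and the rule that "numeric closure" means both quantity and amount are present are illustrative assumptions, not the product's actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    description: str
    quantity: Optional[float] = None
    amount: Optional[float] = None
    needs_review: bool = False


def merge_continuation_rows(rows: list[Row]) -> list[Row]:
    """Merge rows that lack numeric closure into the preceding line item."""
    merged: list[Row] = []
    for row in rows:
        # Assumption: a complete line item carries both quantity and amount.
        has_numeric_closure = row.quantity is not None and row.amount is not None
        if has_numeric_closure:
            merged.append(Row(row.description, row.quantity, row.amount))
        elif not merged:
            # Nothing above to merge into: keep the row but flag it.
            merged.append(Row(row.description, row.quantity, row.amount, needs_review=True))
        else:
            parent = merged[-1]
            parent.description = f"{parent.description} {row.description}".strip()
            # A continuation row with one stray number is an uncertain merge.
            if row.quantity is not None or row.amount is not None:
                parent.needs_review = True
    return merged
```

With this rule, a description that wraps onto a second row is folded into its parent instead of becoming a phantom line, and only half-numeric rows are flagged.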
- 1b
Multi-page line item tables
Line items continue on page 2+ without repeating the table header—common on long Indian GST invoices.
- Typical generic IDP failure
- Page-at-a-time processing loses column identity on continuation pages; rows are dropped or mapped to wrong fields.
- Handling principle
- Page-continuity: if page N ends mid-table and N+1 rows match N’s column structure, the table is reconstructed before extraction.
- Observed residual
- ~2% when scan quality differs across pages.
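The page-continuity check in 1b can be sketched roughly as follows. The page dictionaries, the `column_signature` heuristic (numeric, empty, or text per cell), and the `needs_review` flag are all hypothetical simplifications; a real system would compare column geometry rather than cell types alone.

```python
def column_signature(row: list[str]) -> tuple[str, ...]:
    """Classify each cell as numeric ('n'), empty ('e'), or text ('t')."""
    signature = []
    for cell in row:
        text = cell.strip().replace(",", "").replace(".", "", 1)
        if not text:
            signature.append("e")
        elif text.isdigit():
            signature.append("n")
        else:
            signature.append("t")
    return tuple(signature)


def stitch_pages(pages: list[dict]) -> list[dict]:
    """Rebuild tables from pages shaped like {"header": list or None, "rows": [...]}."""
    tables: list[dict] = []
    for page in pages:
        header, rows = page["header"], page["rows"]
        previous = tables[-1] if tables else None
        # A headerless page continues the previous table only when its first
        # row has the same column structure as the previous table's last row.
        continues = (
            header is None
            and previous is not None
            and bool(rows)
            and bool(previous["rows"])
            and column_signature(rows[0]) == column_signature(previous["rows"][-1])
        )
        if continues:
            previous["rows"].extend(rows)
        else:
            # A headerless table that could not be stitched is flagged.
            tables.append({"header": header, "rows": list(rows), "needs_review": header is None})
    return tables
```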
- 1c
Nested tables
A line contains an inner grid (e.g. freight legs, GST splits inside a line cell).
- Typical generic IDP failure
- Sub-cells flatten into one string in the parent cell—structure needed for ERP line posting is lost.
- Handling principle
- Recursive sub-table detection; each level extracted and re-attached to the parent line object.
- Observed residual
- ~3–4% on deeply nested (3+) structures.
Category 2 — Contextual numeric failures
Digits are present and roughly in the right region, but encoding is ambiguous. This is where “high OCR accuracy” still produces wrong books.
- 2a
Indian notation and rupee conventions
Lakh-grouping (`1,20,000` → 120,000), `4200/-` round-rupee suffix, `₹4,20,000` with symbol.
- Typical generic IDP failure
- Western comma rules or stray `/` produce wrong magnitudes or rejected imports.
- Handling principle
- Locale hinting from vendor GSTIN state code; dual-parse Western vs Indian; normalize `4200/-` → `4200.00`; flag if both parses fail.
- Observed residual
- <0.5% with legible GSTIN; higher if GSTIN missing or unreadable.
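As an illustration of the dual-parse rule in 2a, the sketch below accepts an amount only if its comma grouping is valid under Western or Indian conventions, strips the `₹` symbol and the `/-` suffix, and returns `None` when both parses fail so the caller can flag it. The function name and regular expressions are assumptions, and the GSTIN-based locale hint is left out for brevity.

```python
import re
from decimal import Decimal
from typing import Optional

PLAIN = re.compile(r"^\d+(\.\d+)?$")
WESTERN = re.compile(r"^\d{1,3}(,\d{3})*(\.\d+)?$")         # 120,000.00
INDIAN = re.compile(r"^\d{1,2}(,\d{2})*,\d{3}(\.\d+)?$")    # 1,20,000.00


def parse_amount(raw: str) -> Optional[Decimal]:
    """Return a two-decimal amount, or None if no grouping rule fits."""
    text = raw.strip().replace("₹", "").strip()
    if text.endswith("/-"):
        # Round-rupee suffix: 4200/- means 4200.00.
        text = text[:-2].strip()
    if PLAIN.match(text) or WESTERN.match(text) or INDIAN.match(text):
        return Decimal(text.replace(",", "")).quantize(Decimal("0.01"))
    return None  # both parses failed: flag rather than guess


print(parse_amount("1,20,000"))   # 120000.00
print(parse_amount("4200/-"))     # 4200.00
print(parse_amount("12,34"))      # None
```

Validating the grouping before stripping commas is what separates a legitimate lakh-grouped value from a misread one such as `12,34`.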
- 2b
Time-format artifacts in price fields
Legacy accounting exports sometimes emit colon-separated tokens (e.g. `00:80` meaning 0.80) in price columns.
- Typical generic IDP failure
- The string is ingested into the ERP as a non-numeric value or at the wrong magnitude, and often goes uncaught until reconciliation.
- Handling principle
- Range and line-math consistency checks; colon patterns interpreted as decimal only when totals don’t close otherwise—then confirm or flag.
- Observed residual
- ~5% on invoices from specific legacy generators until vendor pattern is learned.
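The line-math check in 2b might look like the sketch below: the raw price is tried first, and the colon-as-decimal reading is accepted only when it makes quantity × price close against the line total. The function name, status strings, and tolerance are illustrative assumptions.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def resolve_unit_price(
    raw_price: str,
    quantity: Decimal,
    line_total: Decimal,
    tolerance: Decimal = Decimal("0.01"),
) -> tuple[Optional[Decimal], str]:
    """Return (price, status); status is 'ok', 'colon_as_decimal', or 'flag'."""
    candidates = [(raw_price, "ok")]
    if ":" in raw_price:
        # Second choice only: read the first colon as a decimal point.
        candidates.append((raw_price.replace(":", ".", 1), "colon_as_decimal"))
    for text, status in candidates:
        try:
            price = Decimal(text)
        except InvalidOperation:
            continue
        # Accept a reading only if the line math closes.
        if abs(price * quantity - line_total) <= tolerance:
            return price, status
    return None, "flag"
```

For example, `resolve_unit_price("00:80", Decimal("100"), Decimal("80.00"))` returns `(Decimal("0.80"), "colon_as_decimal")`, while the same token against a line total of `95.00` returns `(None, "flag")`.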
- 2c
Tax expressed as rate only
Invoice states “GST 18%” without rupee tax lines; taxable base must be inferred.
- Typical generic IDP failure
- Rate extracted, amount blank—ERP import fails—or wrong base picked (pre- vs post-discount).
- Handling principle
- Compute tax from taxable lines when unambiguous; hold for human when discount or partial exemption structure is unclear.
- Observed residual
- ~8% when discount structure is complex.
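A rough sketch of the rule in 2c, assuming a single GST rate for the whole invoice. With no discount the base is unambiguous and tax is computed directly. With a discount, the sketch auto-computes only if exactly one candidate base (pre- or post-discount) reproduces the stated invoice total; otherwise it holds for a human. The function names, the `stated_total` parameter, and the payable formula (gross minus discount plus tax) are assumptions.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional


def tax_on(base: Decimal, rate_percent: Decimal) -> Decimal:
    amount = base * rate_percent / Decimal("100")
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def infer_tax(
    line_amounts: list[Decimal],
    rate_percent: Decimal,
    discount: Decimal = Decimal("0"),
    stated_total: Optional[Decimal] = None,
) -> dict:
    gross = sum(line_amounts, Decimal("0"))
    if discount == 0:
        # Unambiguous base: compute directly.
        return {"status": "computed", "base": gross, "tax": tax_on(gross, rate_percent)}
    candidates = {"pre_discount": gross, "post_discount": gross - discount}
    if stated_total is not None:
        # Keep only the bases for which the invoice total closes.
        closing = [
            (name, base)
            for name, base in candidates.items()
            if gross - discount + tax_on(base, rate_percent) == stated_total
        ]
        if len(closing) == 1:
            name, base = closing[0]
            return {"status": "computed", "basis": name, "base": base,
                    "tax": tax_on(base, rate_percent)}
    return {"status": "hold_for_review", "candidates": candidates}
```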
- 2d
Mixed date formats
`01/02/26` may be DD/MM/YY or MM/DD/YY depending on vendor locale.
- Typical generic IDP failure
- Wrong voucher period, wrong due-date aging, compliance mismatches.
- Handling principle
- Vendor history + explicit format signals; conflicting parses never auto-post.
- Observed residual
- No fixed rate: ambiguous dates are queued to the semantic pass, and the residual depends on how much history exists for the vendor.
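The ambiguity in 2d is easy to demonstrate. The sketch below parses a date under both day-first and month-first formats, auto-accepts only when the readings agree or when a known vendor format settles the conflict, and otherwise returns both candidates for review. The `vendor_format` parameter and the status strings are hypothetical.

```python
from datetime import date, datetime
from typing import Optional

FORMATS = ("%d/%m/%y", "%m/%d/%y")


def parse_invoice_date(raw: str, vendor_format: Optional[str] = None) -> dict:
    parses: dict[str, date] = {}
    for fmt in FORMATS:
        try:
            parses[fmt] = datetime.strptime(raw, fmt).date()
        except ValueError:
            pass  # this format cannot read the string at all
    distinct = set(parses.values())
    if len(distinct) == 1:
        # Only one reading is possible, or both readings agree.
        return {"status": "ok", "date": distinct.pop()}
    if distinct and vendor_format in parses:
        # Conflict resolved by the vendor's known format.
        return {"status": "ok_vendor_history", "date": parses[vendor_format]}
    # Conflicting or failed parses never auto-post.
    return {"status": "hold", "candidates": sorted(distinct)}
```

`parse_invoice_date("01/02/26")` holds with two candidates (2 January 2026 and 1 February 2026), whereas `"25/02/26"` has only one valid reading and passes.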
Category 3 — Vendor identity failures
The vendor on the page does not map 1:1 to the finance master. Silent auto-match here creates duplicate vendors and broken statements of account (SOA).
- 3a
Trade name vs legal entity
Header shows trade name; remittance or GSTIN corresponds to a different legal name on master.
- Typical generic IDP failure
- Duplicate vendor creation or repeated exception touches every month.
- Handling principle
- GSTIN-first match to legal master; fuzzy name only when GSTIN validates.
- Observed residual
- ~8% on utility invoices; ~12% on sole-proprietor trade-name cases (flagged, not auto-split).
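One way to express the GSTIN-first rule in 3a: look up the legal entity by GSTIN, and use fuzzy name comparison only afterwards, to decide between a clean match and a flag. The master dictionary, the 0.6 threshold, and the use of `difflib.SequenceMatcher` as the fuzzy matcher are assumptions for illustration. The sample GSTIN is a placeholder.

```python
from difflib import SequenceMatcher


def match_vendor(gstin: str, printed_name: str, master: dict[str, str],
                 name_threshold: float = 0.6) -> dict:
    """master maps GSTIN to the legal name on the vendor master."""
    legal_name = master.get(gstin)
    if legal_name is None:
        # No GSTIN hit: never fall back to name-only matching.
        return {"status": "exception", "reason": "gstin_not_in_master"}
    similarity = SequenceMatcher(None, printed_name.lower(), legal_name.lower()).ratio()
    if similarity >= name_threshold:
        return {"status": "matched", "legal_name": legal_name}
    # Same GSTIN, different name: likely a trade name. Flag it, but do not
    # create a second vendor record.
    return {"status": "flagged", "legal_name": legal_name, "reason": "trade_name_differs"}


master = {"27AAAAA0000A1Z5": "Sharma Traders Private Limited"}
print(match_vendor("27AAAAA0000A1Z5", "QuickFix", master)["status"])
```

Because the name is consulted only after the GSTIN resolves, a trade name can never spawn a duplicate vendor by itself.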
- 3b
Vendor assignment inside bulk PDFs
A long PDF contains multiple vendors; each boundary needs its own vendor resolution.
- Typical generic IDP failure
- Vendor from invoice K bleeds into invoice K+1 when boundaries slip.
- Handling principle
- Vendor identity resolved per invoice segment after boundary cut—never carried blindly across boundaries.
- Observed residual
- Correlated with boundary accuracy (~1–3% when boundaries uncertain).
- 3c
GSTIN presentation variants
Spaces, hyphens, lowercase in GSTIN block.
- Typical generic IDP failure
- False mismatch to master despite same legal entity.
- Handling principle
- Canonical normalization before compare; single-digit transpositions route to identity exception (separate from duplicate scoring).
- Observed residual
- Low when GSTIN OCR is clean; otherwise semantic queue.
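Canonical normalization for 3c can be sketched as below. The regular expression encodes the commonly documented 15-character GSTIN layout (state code, PAN, entity code, `Z`, check character) and should be verified against the official specification; the check character is not validated here. The rule that one or two differing positions indicate a near miss, and the status strings, are assumptions.

```python
import re

# Assumed layout: 2-digit state code, 10-character PAN, entity code, 'Z', check character.
GSTIN_SHAPE = re.compile(r"^\d{2}[A-Z]{5}\d{4}[A-Z][1-9A-Z]Z[0-9A-Z]$")


def normalize_gstin(raw: str) -> str:
    """Strip spaces and hyphens, then uppercase."""
    return re.sub(r"[\s\-]", "", raw).upper()


def compare_gstin(raw: str, master_gstin: str) -> str:
    candidate = normalize_gstin(raw)
    if not GSTIN_SHAPE.match(candidate):
        return "identity_exception"
    if candidate == master_gstin:
        return "match"
    differing = sum(a != b for a, b in zip(candidate, master_gstin))
    if differing <= 2:
        # One substituted character or one swapped adjacent pair: probably
        # the same entity misread, so route to a human instead of rejecting.
        return "identity_exception"
    return "no_match"


print(compare_gstin("27 aaaaa-0000a 1z5", "27AAAAA0000A1Z5"))  # match
```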
- 3d
Remit-to vs bill-from divergence
Invoice from entity A; payment or PO references entity B—no single-row signal.
- Typical generic IDP failure
- Wrong bank / wrong intercompany mapping if forced auto-post.
- Handling principle
- Policy-based hold: requires explicit master rule or human choice—never silent default.
- Observed residual
- Exception-only; frequency varies by vendor contract complexity.
Category 4 — Bulk PDF boundary failures
Single-file, multi-invoice logistics: cover letters, duplex scans, and same-vendor batches stress naive page counters.
- 4a
Cover / transmittal misclassified as invoice
Page 1 is a letter or packing list with no numeric invoice skeleton.
- Typical generic IDP failure
- First “invoice” is garbage; all following pages shift—silent structural cascade.
- Handling principle
- Page classification before boundary detection; “cover” pages excluded from invoice start set.
- Observed residual
- Rare once classified; high impact if missed.
- 4b
Duplex / back-page continuation
Back of invoice K continues line items; front of K+1 starts next invoice—boundary sits between unlike faces.
- Typical generic IDP failure
- Continuation lines attached to wrong invoice or split across two records.
- Handling principle
- Two-pass boundary: detect invoice fronts, then classify following pages as continuation vs new front; ambiguous splits show both parses to reviewer.
- Observed residual
- Depends on scan batch quality.
- 4c
Identical vendor, sequential invoices
Same layout and vendor; only invoice number, date, and lines change.
- Typical generic IDP failure
- Models merge adjacent invoices into one inflated total.
- Handling principle
- Invoice-number change is a hard boundary trigger—even if other continuation signals fire.
- Observed residual
- Low when number OCR is reliable; flagged when number read is ambiguous.
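The boundary rules in this category can be combined into a small segmentation sketch. It assumes pages have already been classified (`kind` is `"cover"` or `"invoice"`) and that an invoice number has been read where possible; both fields, plus `looks_like_continuation` and `needs_review`, are hypothetical. Cover pages are excluded from the start set, and an invoice-number change opens a new segment even when a continuation signal is present.

```python
def segment_invoices(pages: list[dict]) -> list[dict]:
    """Group page indexes into invoice segments."""
    segments: list[dict] = []
    for index, page in enumerate(pages):
        if page["kind"] == "cover":
            continue  # cover or transmittal pages never start an invoice
        number = page.get("invoice_no")
        current = segments[-1] if segments else None
        number_changed = (
            current is not None
            and number is not None
            and number != current["invoice_no"]
        )
        if current is None or number_changed:
            # Hard boundary: page.get("looks_like_continuation") is
            # deliberately ignored when the invoice number changes.
            segments.append({"invoice_no": number, "pages": [index],
                             "needs_review": number is None})
        else:
            current["pages"].append(index)
            if number is None:
                # Unread number: attach as a continuation, but flag it.
                current["needs_review"] = True
    return segments
```

Vendor resolution would then run once per returned segment, so identity is never carried across a boundary.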
Observed residual failure rates
Rates from production AP workflows on Indian documents. Residual means after IDP processing — these are the failures that reach the exception queue or (worst case) posting.
| Failure class | Observed residual range | Primary trigger |
|---|---|---|
| Multi-row table continuity | ~1.5–2% | Long descriptions, uneven scan quality |
| Nested table semantics | ~3–4% | 3+ level nested structures |
| Locale / format numeric | <0.5% to ~5% | Missing GSTIN or legacy format artifacts |
| Tax amount inference | ~8% | Rate present but base amount unclear |
| Vendor identity mismatch | ~2–5% | Trade name vs legal entity, GSTIN formatting |
| PDF boundary confusion | ~1–3% | Cover pages, same-vendor adjacency |
How AIdaptIQ routes exceptions
Every document outcome falls into one of three paths — nothing is silently posted if a failure is detected:
Auto-corrected
Known patterns (lakh notation, rupee symbols, GSTIN formatting) normalized and logged in the audit trail with correction reason.
Flagged for review
Low-confidence fields highlighted in the review UI with context. Human corrects and approves before any ERP push.
Rejected with context
Document returned with a specific error explanation — not a generic failure. Reviewer knows exactly what to fix on resubmit.
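The three paths can be modelled as a small routing function. This is a sketch under stated assumptions: the `Issue` type, the issue codes, and the `REJECTABLE` set are invented for illustration, while the auto-correctable patterns mirror the ones named above. A return value of `None` means no failure was detected.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Outcome(Enum):
    AUTO_CORRECTED = "auto_corrected"
    FLAGGED_FOR_REVIEW = "flagged_for_review"
    REJECTED = "rejected_with_context"


@dataclass
class Issue:
    code: str
    field: str
    detail: str


AUTO_CORRECTABLE = {"lakh_notation", "rupee_symbol", "gstin_formatting"}
REJECTABLE = {"unreadable_page", "not_an_invoice"}  # hypothetical codes


def route(issues: list[Issue]) -> tuple[Optional[Outcome], list[str]]:
    """Return the outcome plus audit notes explaining each issue."""
    notes = [f"{issue.field}: {issue.code} ({issue.detail})" for issue in issues]
    if not issues:
        return None, notes
    codes = {issue.code for issue in issues}
    if codes & REJECTABLE:
        return Outcome.REJECTED, notes
    if codes <= AUTO_CORRECTABLE:
        return Outcome.AUTO_CORRECTED, notes
    # Anything not known to be safe to normalize goes to a human.
    return Outcome.FLAGGED_FOR_REVIEW, notes
```

Defaulting unknown issue codes to review keeps the system on the visible side of the silent/visible split.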
Related reading
- Competitor analysis: Which vendors produced which failure types on our real test set.
- Origin story: Why we started building after testing every IDP on messy invoices.
- Benchmarks: STP rate, error volume reduction, and cycle-time metrics.
- OCR vs. IDP: Why detection and validation matter more than raw OCR accuracy.
- Three-pass reconciliation: Where syntactic checks catch failures that OCR scores miss.