Number7AI — Docs
Three-pass reconciliation
Extraction accuracy is a necessary but insufficient condition for AP readiness. Reconciliation is the second gate — the set of checks that converts raw extraction output into finance-grade, export-ready data.
Last updated: April 2026
TL;DR
- Pass 1 (Syntactic): arithmetic and unit coherence; catches math errors and tax miscalculations.
- Pass 2 (Semantic): vendor identity, PO/GRN matching, and duplicate detection.
- Pass 3 (Pattern): anomaly scoring against vendor history and fraud-risk signals.
- Any pass failure routes the document to an exception queue; a failed document never passes silently.
Why extraction alone is not enough
Most IDP platforms stop at extraction. They report field-level confidence scores and call it accuracy. But in AP workflows, an invoice that extracts cleanly can still be wrong — the line items can add up to a different total, the vendor GSTIN can have a digit transposed, or the amount can be a statistical outlier against vendor history. None of these failures are caught by OCR accuracy metrics. Reconciliation is the layer that catches them.
Pass contract (ordering and hard stops)
- Order is fixed: syntactic → semantic → pattern. Semantic checks assume syntactic closure (totals, tax lines, and line math already balance or are explicitly flagged).
- Any failed check blocks release to the accounting export path for that document revision—there is no “warn and post anyway” path for finance-grade mode.
- Operator UX: the queue shows which pass failed, the rule name, extracted value, and expected value or threshold—so resolution time tracks diagnostic quality, not guesswork.
Production cycle-time numbers (e.g. median upload → export-ready in benchmarks) include validation through to ERP master checks where those checks are deployed. Approval SLAs are excluded because approval depth varies by customer.
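The pass contract can be sketched as a small runner. This is an illustration only: `CheckFailure`, `run_passes`, and the dict-shaped document are hypothetical names, not a Number7AI API. The source does not say whether later passes still run after an earlier pass fails; this sketch assumes they do not.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckFailure:
    pass_name: str     # which pass failed
    rule: str          # rule name shown to the operator
    extracted: object  # value found on the document
    expected: object   # expected value or threshold


# A check takes a document (a plain dict here) and returns a failure or None.
Check = Callable[[dict], Optional[CheckFailure]]


def run_passes(doc: dict, passes: list[tuple[str, list[Check]]]) -> list[CheckFailure]:
    """Run passes in the given order and stop at the first pass with failures."""
    for _pass_name, checks in passes:
        failures = []
        for check in checks:
            failure = check(doc)
            if failure is not None:
                failures.append(failure)
        if failures:
            return failures  # assumption: later passes are skipped
    return []


def export_ready(doc: dict, passes: list[tuple[str, list[Check]]]) -> bool:
    """A document is released to export only when no check failed."""
    return not run_passes(doc, passes)
```

Callers would pass the list as syntactic, semantic, pattern, in that order, so that semantic checks only ever see documents whose arithmetic already balances.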
Check matrix by pass
Representative checks—your deployment may enable a subset depending on master data, PO/GRN availability, and policy.
| Pass | Example checks | Typical failure surfaced |
|---|---|---|
| 1 — Syntactic | Σ line = subtotal; CGST/SGST/IGST rates; grand total round-trip; qty × unit vs line amount | Wrong column mapping, dropped tax row, OCR digit transposition in totals |
| 2 — Semantic | GSTIN ↔ vendor master; invoice/due dates; PO line match; GRN qty; duplicate multi-signal score | Trade vs legal name, period mismatch, three-way break, resubmitted duplicate |
| 3 — Pattern | Vendor history z-score on amount; resubmit velocity; cross-vendor amount collision window | Fraud / error spikes, duplicate-by-amount patterns, anomalous round totals |
Syntactic pass — arithmetic and unit coherence
Structural math checks that run on the raw extraction output before any business logic is applied.
Line-item math check
Sum of line item totals must equal invoice subtotal. Any delta — however small — is flagged before extraction output is accepted.
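A minimal sketch of this check, assuming amounts arrive as decimal strings. The function name `line_sum_delta` is illustrative. Decimal arithmetic is used because binary floats can produce spurious deltas (in Python, `0.1 + 0.2 != 0.3`), which matters when any non-zero delta is flagged.

```python
from decimal import Decimal


def line_sum_delta(line_totals: list[str], subtotal: str) -> Decimal:
    """Return subtotal minus the sum of line totals; zero means the check passes."""
    total = sum((Decimal(x) for x in line_totals), Decimal("0"))
    return Decimal(subtotal) - total
```

A non-zero return value is what would be shown to the operator as the gap between extracted and expected subtotal.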
Tax arithmetic validation
CGST + SGST (intra-state) or IGST (inter-state) amounts are verified against the applicable rates. Mismatched rates trigger a hold, not a silent pass-through.
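As an illustration, the expected tax lines can be recomputed from the taxable value and compared with the extracted ones. The sketch relies on the standard GST split (intra-state supplies carry CGST and SGST at half the GST rate each, inter-state supplies carry IGST at the full rate). The function names and the round-half-up rounding to two decimals are assumptions, not documented behaviour.

```python
from decimal import Decimal, ROUND_HALF_UP

PAISA = Decimal("0.01")  # assumption: tax lines are rounded to two decimals


def expected_tax(taxable: str, gst_rate_pct: str, inter_state: bool) -> dict[str, Decimal]:
    """Recompute the tax lines an invoice should carry."""
    base = Decimal(taxable)
    rate = Decimal(gst_rate_pct) / Decimal("100")
    if inter_state:
        return {"IGST": (base * rate).quantize(PAISA, rounding=ROUND_HALF_UP)}
    half = (base * rate / 2).quantize(PAISA, rounding=ROUND_HALF_UP)
    return {"CGST": half, "SGST": half}


def tax_mismatches(extracted: dict[str, str], taxable: str,
                   gst_rate_pct: str, inter_state: bool) -> dict[str, tuple[Decimal, Decimal]]:
    """Map each mismatched tax line to (extracted, expected)."""
    mismatches = {}
    for name, want in expected_tax(taxable, gst_rate_pct, inter_state).items():
        got = Decimal(extracted.get(name, "0"))
        if got != want:
            mismatches[name] = (got, want)
    return mismatches
```

An empty result means the tax lines agree. This sketch does not flag unexpected extra components, such as an IGST line on an intra-state invoice.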
Grand total consistency
Subtotal + tax breakdowns must reconcile to the printed grand total. Round-trip arithmetic errors are the most common silent failure in OCR pipelines.
Currency and unit coherence
Quantity × unit price checked against line totals. Mixed-unit errors (pcs vs kg) caught here before downstream posting.
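A sketch of the per-line check. The `tolerance` parameter is an assumption to absorb per-line rounding on the printed invoice; set it to "0" for strict equality.

```python
from decimal import Decimal


def line_amount_ok(qty: str, unit_price: str, line_total: str,
                   tolerance: str = "0.01") -> bool:
    """True when qty × unit price matches the line total within tolerance."""
    expected = Decimal(qty) * Decimal(unit_price)
    return abs(expected - Decimal(line_total)) <= Decimal(tolerance)
```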
Semantic pass — identity, matching, and deduplication
Cross-reference checks against master data and existing records.
Vendor identity resolution
Extracted vendor name, GSTIN, and bank details cross-checked against the vendor master. Near-matches are flagged, not auto-accepted.
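One way to separate exact matches from near-matches is sketched below. It assumes the GSTIN must match exactly while names are compared with a similarity ratio. `classify_vendor`, the 0.85 threshold, and the dict field names are hypothetical, and bank details are left out for brevity.

```python
from difflib import SequenceMatcher


def classify_vendor(extracted: dict, master: dict, name_threshold: float = 0.85) -> str:
    """Return 'match', 'near_match' (flag for review), or 'mismatch'."""
    gstin_match = extracted["gstin"].strip().upper() == master["gstin"].strip().upper()
    ratio = SequenceMatcher(None, extracted["name"].lower(), master["name"].lower()).ratio()
    if gstin_match and ratio == 1.0:
        return "match"
    if gstin_match or ratio >= name_threshold:
        return "near_match"  # flagged, never auto-accepted
    return "mismatch"
```

With this logic, a GSTIN with two digits transposed under an otherwise identical vendor name comes back as `near_match` rather than being accepted.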
Date and period validation
Invoice date, due date, and tax period checked for logical consistency and against current period windows.
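The logical-consistency part of this check could look like the sketch below. The issue codes and the idea of passing the open period as explicit start and end dates are assumptions.

```python
from datetime import date


def date_issues(invoice_date: date, due_date: date,
                period_start: date, period_end: date, today: date) -> list[str]:
    """Return a list of issue codes; an empty list means the dates are consistent."""
    issues = []
    if invoice_date > today:
        issues.append("invoice_date_in_future")
    if due_date < invoice_date:
        issues.append("due_before_invoice")
    if not (period_start <= invoice_date <= period_end):
        issues.append("invoice_outside_period")
    return issues
```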
PO and GRN matching
Where applicable, invoice line items are matched against open PO lines and GRN quantities. Three-way match discrepancies are queued for review.
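A simplified three-way match, assuming each invoice line already carries the PO line it refers to. The data shapes and issue codes are hypothetical; a real deployment would also need price tolerances and partial-delivery handling.

```python
from decimal import Decimal


def three_way_discrepancies(invoice_lines: list[dict], po_lines: dict[str, dict],
                            grn_qty: dict[str, str]) -> list[tuple[str, str]]:
    """Compare invoice lines with open PO lines and received (GRN) quantities.

    Returns (po_line, issue_code) pairs that would be queued for review.
    """
    issues = []
    for line in invoice_lines:
        key = line["po_line"]
        po = po_lines.get(key)
        if po is None:
            issues.append((key, "no_open_po_line"))
            continue
        if Decimal(line["unit_price"]) != Decimal(po["unit_price"]):
            issues.append((key, "price_differs_from_po"))
        if Decimal(line["qty"]) > Decimal(po["qty"]):
            issues.append((key, "qty_exceeds_po"))
        if Decimal(line["qty"]) > Decimal(grn_qty.get(key, "0")):
            issues.append((key, "qty_exceeds_grn"))
    return issues
```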
Duplicate signal scoring
Multi-signal duplicate check runs in the semantic pass: invoice number, vendor, amount, date, and hash-based similarity.
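The five signals can be combined into one score, as in the sketch below. The weights are invented for illustration. The content hash here is a SHA-256 of whitespace-normalised text, so it only detects identical content; real similarity hashing (for example SimHash) would be needed to catch near-identical scans.

```python
import hashlib
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class InvoiceRecord:
    vendor_id: str
    invoice_no: str
    amount: str  # decimal string
    date: str    # ISO date string
    text: str    # extracted document text


# Hypothetical weights, in points out of 100.
WEIGHTS = {"vendor": 20, "invoice_no": 30, "amount": 20, "date": 10, "content_hash": 20}


def _normalise_no(invoice_no: str) -> str:
    """Drop punctuation and case so 'INV-001' and 'inv001' compare equal."""
    return "".join(ch for ch in invoice_no.upper() if ch.isalnum())


def _content_digest(text: str) -> str:
    collapsed = " ".join(text.lower().split())
    return hashlib.sha256(collapsed.encode("utf-8")).hexdigest()


def duplicate_score(a: InvoiceRecord, b: InvoiceRecord) -> int:
    """Sum the weights of every signal on which the two invoices agree."""
    signals = {
        "vendor": a.vendor_id == b.vendor_id,
        "invoice_no": _normalise_no(a.invoice_no) == _normalise_no(b.invoice_no),
        "amount": Decimal(a.amount) == Decimal(b.amount),
        "date": a.date == b.date,
        "content_hash": _content_digest(a.text) == _content_digest(b.text),
    }
    return sum(WEIGHTS[name] for name, hit in signals.items() if hit)
```

A score at or above a deployment-chosen threshold would route the pair to review. A resubmission with a reformatted invoice number and a new date still scores 70 under these weights.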
Pattern pass — anomaly detection and fraud signals
Statistical checks against historical vendor behaviour — the layer most platforms skip entirely.
Vendor pattern baseline
Invoice compared against this vendor's historical document patterns — typical formats, field positions, and value ranges.
Amount anomaly detection
Grand total z-scored against vendor history. Unusual spikes or round-number anomalies are routed to the exception queue.
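A z-score measures how many standard deviations a value sits from the historical mean: z = (x − μ) / σ. The sketch below uses the sample standard deviation from Python's `statistics` module and returns `None` when the history is too short or has no variance. The function names and the round-number step of 1000 are assumptions.

```python
from decimal import Decimal
from statistics import mean, stdev
from typing import Optional


def amount_zscore(amount: float, history: list[float]) -> Optional[float]:
    """Z-score of amount against a vendor's past grand totals, or None if undefined."""
    if len(history) < 2:
        return None  # stdev needs at least two data points
    sigma = stdev(history)
    if sigma == 0:
        return None  # identical history: z-score is undefined
    return (amount - mean(history)) / sigma


def is_round_total(amount: str, step: str = "1000") -> bool:
    """True when the amount is an exact multiple of step (assumed heuristic)."""
    return Decimal(amount) % Decimal(step) == 0
```

How large a z-score has to be before it counts as a spike is a policy decision; the source does not give a threshold.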
Frequency and resubmission signals
Short-interval resubmissions from the same vendor are flagged as potential duplicate risk even when invoice numbers differ.
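A sketch of the interval check, assuming each submission is a (vendor id, received timestamp) pair. The 24-hour window is an arbitrary placeholder, and invoice numbers are deliberately ignored.

```python
from datetime import datetime, timedelta


def short_interval_resubmissions(
    submissions: list[tuple[str, datetime]],
    window: timedelta = timedelta(hours=24),
) -> list[tuple[str, datetime, datetime]]:
    """Return (vendor, earlier, later) for consecutive submissions within window."""
    by_vendor: dict[str, list[datetime]] = {}
    for vendor_id, received_at in submissions:
        by_vendor.setdefault(vendor_id, []).append(received_at)

    flagged = []
    for vendor_id, stamps in by_vendor.items():
        stamps.sort()
        # Compare each submission with the next one from the same vendor.
        for earlier, later in zip(stamps, stamps[1:]):
            if later - earlier <= window:
                flagged.append((vendor_id, earlier, later))
    return flagged
```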
Cross-vendor correlation
Identical amounts across multiple vendors in a short window are logged as a fraud-risk signal for human review.
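The collision window can be sketched by grouping invoices on exact amount and looking for different vendors close together in time. The three-day window, the tuple layout, and the function name are assumptions.

```python
from collections import defaultdict
from datetime import date, timedelta
from decimal import Decimal


def cross_vendor_collisions(
    invoices: list[tuple[str, str, date]],
    window: timedelta = timedelta(days=3),
) -> dict[Decimal, list[str]]:
    """Map each colliding amount to the vendors involved.

    invoices holds (vendor_id, amount, invoice_date) tuples.
    """
    by_amount = defaultdict(list)
    for vendor_id, amount, day in invoices:
        by_amount[Decimal(amount)].append((day, vendor_id))

    hits = {}
    for amount, rows in by_amount.items():
        rows.sort()
        vendors = set()
        # Date-adjacent invoices from different vendors inside the window collide.
        for (d1, v1), (d2, v2) in zip(rows, rows[1:]):
            if v1 != v2 and d2 - d1 <= window:
                vendors.update((v1, v2))
        if vendors:
            hits[amount] = sorted(vendors)
    return hits
```

The result is a log entry for human review, in line with the text above, rather than an automatic rejection.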
Exception routing after a failed pass
Any failed check — syntactic, semantic, or pattern — routes to a typed exception queue rather than rejecting the document outright or silently passing it. Operators see the exact check that failed, the extracted value, and the expected value or threshold. Resolution is audited and logged as part of the AP audit trail.
Math error
Syntactic hold → operator correction or vendor re-invoice request
Vendor mismatch
Semantic hold → vendor master update or exception approval
Amount anomaly
Pattern hold → manager review with historical context
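The three routes above can be expressed as a lookup table. The keys, queue identifiers, and `route_exception` are hypothetical names. Unknown failure types raise an error so that nothing passes through unrouted.

```python
# Failure type -> (queue, resolution path), taken from the three routes above.
ROUTES = {
    "math_error": ("syntactic_hold", "operator correction or vendor re-invoice request"),
    "vendor_mismatch": ("semantic_hold", "vendor master update or exception approval"),
    "amount_anomaly": ("pattern_hold", "manager review with historical context"),
}


def route_exception(failure_type: str, rule: str, extracted: object, expected: object) -> dict:
    """Build the queue entry an operator would see for a failed check."""
    if failure_type not in ROUTES:
        raise ValueError(f"unrouted failure type: {failure_type!r}")
    queue, resolution = ROUTES[failure_type]
    return {
        "queue": queue,
        "resolution": resolution,
        "rule": rule,
        "extracted": extracted,
        "expected": expected,
    }
```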