Number7AI — Docs

Three-pass reconciliation

Extraction accuracy is a necessary but insufficient condition for AP readiness. Reconciliation is the second gate — the set of checks that converts raw extraction output into finance-grade, export-ready data.

Last updated: April 2026

TL;DR

  • Pass 1 (Syntactic): arithmetic and unit coherence — catches math errors and tax miscalculations.
  • Pass 2 (Semantic): vendor identity, PO/GRN matching, and duplicate detection.
  • Pass 3 (Pattern): anomaly scoring against vendor history and fraud-risk signals.
  • Any pass failure routes to an exception queue — it never silently passes.

Why extraction alone is not enough

Most IDP platforms stop at extraction. They report field-level confidence scores and call it accuracy. But in AP workflows, an invoice that extracts cleanly can still be wrong — the line items can add up to a different total, the vendor GSTIN can have a digit transposed, or the amount can be a statistical outlier against vendor history. None of these failures are caught by OCR accuracy metrics. Reconciliation is the layer that catches them.

1. Syntactic pass: arithmetic and unit coherence

Structural math checks that run on the raw extraction output before any business logic is applied.

  • Line-item math check

    Sum of line item totals must equal invoice subtotal. Any delta — however small — is flagged before extraction output is accepted.

  • Tax arithmetic validation

    CGST + SGST (intra-state) or IGST (inter-state) amounts are verified against the applicable rates. A mismatched rate triggers a hold, not a silent pass-through.

  • Grand total consistency

    Subtotal + tax breakdowns must reconcile to the printed grand total. Errors that only surface when the total is recomputed from its parts are a common silent failure in OCR pipelines.

  • Currency and unit coherence

    Quantity × unit price checked against line totals. Mixed-unit errors (pcs vs kg) caught here before downstream posting.
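
The four syntactic checks above can be sketched in a few lines of Python. This is a minimal illustration, not the product's implementation: the `LineItem` and `Invoice` types, the failure codes, the zero tolerance, and the single combined `tax_rate` field are all assumptions made for the example, and the 18% rate is sample data only.

```python
from __future__ import annotations

from dataclasses import dataclass, replace
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")
# Assumption: zero tolerance, since any delta, however small, is flagged.
TOLERANCE = Decimal("0.00")


@dataclass
class LineItem:
    quantity: Decimal
    unit_price: Decimal
    line_total: Decimal


@dataclass
class Invoice:
    lines: list[LineItem]
    subtotal: Decimal
    tax_rate: Decimal            # combined rate as a fraction (hypothetical field)
    taxes: dict[str, Decimal]    # {"CGST": ..., "SGST": ...} or {"IGST": ...}
    grand_total: Decimal


def _differs(a: Decimal, b: Decimal) -> bool:
    return abs(a - b) > TOLERANCE


def syntactic_failures(inv: Invoice) -> list[str]:
    """Return the codes of every failed check; an empty list means the pass succeeded."""
    failures: list[str] = []

    # Quantity x unit price must match each printed line total.
    for i, line in enumerate(inv.lines):
        expected = (line.quantity * line.unit_price).quantize(CENT, ROUND_HALF_UP)
        if _differs(expected, line.line_total):
            failures.append(f"line_{i}_math")

    # Sum of line totals must equal the invoice subtotal.
    line_sum = sum((l.line_total for l in inv.lines), Decimal("0"))
    if _differs(line_sum, inv.subtotal):
        failures.append("subtotal_mismatch")

    # Tax components must match the amount implied by the applicable rate.
    tax_sum = sum(inv.taxes.values(), Decimal("0"))
    expected_tax = (inv.subtotal * inv.tax_rate).quantize(CENT, ROUND_HALF_UP)
    if _differs(tax_sum, expected_tax):
        failures.append("tax_mismatch")

    # Subtotal + tax must reconcile to the printed grand total.
    if _differs(inv.subtotal + tax_sum, inv.grand_total):
        failures.append("grand_total_mismatch")

    return failures


good = Invoice(
    lines=[
        LineItem(Decimal("2"), Decimal("100.00"), Decimal("200.00")),
        LineItem(Decimal("3"), Decimal("50.00"), Decimal("150.00")),
    ],
    subtotal=Decimal("350.00"),
    tax_rate=Decimal("0.18"),
    taxes={"CGST": Decimal("31.50"), "SGST": Decimal("31.50")},
    grand_total=Decimal("413.00"),
)
bad = replace(good, grand_total=Decimal("414.00"))

print(syntactic_failures(good))  # []
print(syntactic_failures(bad))   # ['grand_total_mismatch']
```

`Decimal` is used rather than `float` so that comparisons against printed amounts are exact. The sketch does not model units (pcs vs kg); a unit mismatch would surface here only indirectly, as a line total that disagrees with quantity × unit price.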

2. Semantic pass: identity, matching, and deduplication

Cross-reference checks against master data and existing records.

  • Vendor identity resolution

    Extracted vendor name, GSTIN, and bank details cross-checked against the vendor master. Near-matches are flagged, not auto-accepted.

  • Date and period validation

    Invoice date, due date, and tax period are checked for logical consistency (for example, a due date earlier than the invoice date is flagged) and against current period windows.

  • PO and GRN matching

    Where applicable, invoice line items are matched against open PO lines and GRN quantities. Three-way match discrepancies are queued for review.

  • Duplicate signal scoring

    Multi-signal duplicate check runs in the semantic pass: invoice number, vendor, amount, date, and hash-based similarity.
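
One way to picture the multi-signal duplicate check is as a weighted score over the five signals listed above. The sketch below is an assumption-laden illustration: the `InvoiceKey` type, the point weights, and the three-day date window are hypothetical, and an exact SHA-256 hash of whitespace-normalised text stands in for the hash-based similarity signal (a real system might use a fuzzy hash instead).

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass
from datetime import date
from decimal import Decimal

# Hypothetical weights in points; they sum to 100.
WEIGHTS = {
    "invoice_number": 30,
    "vendor": 20,
    "amount": 20,
    "date": 10,
    "content_hash": 20,
}
DATE_WINDOW_DAYS = 3  # assumption: dates this close count as a matching signal


@dataclass(frozen=True)
class InvoiceKey:
    invoice_number: str
    vendor_id: str
    amount: Decimal
    invoice_date: date
    raw_text: str


def _normalise_number(value: str) -> str:
    # "INV-001" and "inv 001" should compare equal.
    return "".join(ch for ch in value.upper() if ch.isalnum())


def _content_hash(text: str) -> str:
    normalised = " ".join(text.split()).lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def duplicate_score(a: InvoiceKey, b: InvoiceKey) -> int:
    """Score 0-100: the sum of weights for every signal on which a and b agree."""
    signals = {
        "invoice_number": _normalise_number(a.invoice_number)
        == _normalise_number(b.invoice_number),
        "vendor": a.vendor_id == b.vendor_id,
        "amount": a.amount == b.amount,
        "date": abs((a.invoice_date - b.invoice_date).days) <= DATE_WINDOW_DAYS,
        "content_hash": _content_hash(a.raw_text) == _content_hash(b.raw_text),
    }
    return sum(WEIGHTS[name] for name, hit in signals.items() if hit)


original = InvoiceKey("INV-001", "V1", Decimal("1000.00"), date(2026, 4, 1),
                      "ACME  Invoice 001")
resubmitted = InvoiceKey("inv 001", "V1", Decimal("1000.0"), date(2026, 4, 2),
                         "acme invoice 001")
unrelated = InvoiceKey("INV-002", "V1", Decimal("250.00"), date(2026, 1, 15),
                       "ACME Invoice 002")

print(duplicate_score(original, resubmitted))  # 100
print(duplicate_score(original, unrelated))    # 20
```

Scoring rather than exact-key matching lets a resubmission with a reformatted invoice number still rank as a likely duplicate. Where to set the review threshold is a policy choice this sketch does not make.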

3. Pattern pass: anomaly detection and fraud signals

Statistical checks against historical vendor behaviour — the layer most platforms skip entirely.

  • Vendor pattern baseline

    Invoice compared against this vendor's historical document patterns — typical formats, field positions, and value ranges.

  • Amount anomaly detection

    Grand total z-scored against vendor history. Unusual spikes or round-number anomalies are routed to the exception queue.

  • Frequency and resubmission signals

    Short-interval resubmissions from the same vendor are flagged as potential duplicate risk even when invoice numbers differ.

  • Cross-vendor correlation

    Identical amounts across multiple vendors in a short window are logged as a fraud-risk signal for human review.
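
The amount anomaly check above amounts to a z-score of the grand total against the vendor's past totals: z = (amount − mean) / standard deviation. A minimal sketch follows, using only the standard library. The threshold of 3.0, the minimum history length, and the function names are assumptions for illustration, not documented product settings.

```python
from __future__ import annotations

from statistics import mean, stdev
from typing import Optional

Z_THRESHOLD = 3.0   # hypothetical cut-off
MIN_HISTORY = 5     # hypothetical minimum number of past invoices needed to score


def amount_z_score(history: list[float], amount: float) -> Optional[float]:
    """Z-score of `amount` against past grand totals, or None if it cannot be scored."""
    if len(history) < MIN_HISTORY:
        return None
    sigma = stdev(history)  # sample standard deviation
    if sigma == 0:
        # All past totals identical: the z-score is undefined.
        return None
    return (amount - mean(history)) / sigma


def is_amount_anomaly(history: list[float], amount: float) -> bool:
    z = amount_z_score(history, amount)
    return z is not None and abs(z) > Z_THRESHOLD


vendor_history = [1000.0, 1050.0, 980.0, 1020.0, 990.0, 1010.0]

print(is_amount_anomaly(vendor_history, 1015.0))  # False
print(is_amount_anomaly(vendor_history, 5000.0))  # True
```

The sketch returns `None` when history is too short or has zero variance. How such unscorable invoices should be handled (for instance, whether they go to review by default) is left open here.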

Exception routing after a failed pass

Any failed check — syntactic, semantic, or pattern — routes to a typed exception queue rather than rejecting the document outright or silently passing it. Operators see the exact check that failed, the extracted value, and the expected value or threshold. Resolution is audited and logged as part of the AP audit trail.

  • Math error: syntactic hold → operator correction or vendor re-invoice request

  • Vendor mismatch: semantic hold → vendor master update or exception approval

  • Amount anomaly: pattern hold → manager review with historical context
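
The routing behaviour described above can be sketched as one queue per pass, where each queued item carries the failed check, the extracted value, and the expected value or threshold. The `PassType`, `CheckFailure`, and `route` names, and the `"export_ready"` / `"held"` statuses, are hypothetical and chosen only for this example.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class PassType(Enum):
    SYNTACTIC = "syntactic"
    SEMANTIC = "semantic"
    PATTERN = "pattern"


@dataclass(frozen=True)
class CheckFailure:
    pass_type: PassType
    check: str       # the exact check that failed
    extracted: str   # the value read from the document
    expected: str    # the expected value or threshold


# Resolution paths taken from the mapping above.
RESOLUTION = {
    PassType.SYNTACTIC: "operator correction or vendor re-invoice request",
    PassType.SEMANTIC: "vendor master update or exception approval",
    PassType.PATTERN: "manager review with historical context",
}

Queues = defaultdict  # maps PassType -> list of (invoice_id, CheckFailure)


def route(invoice_id: str, failures: list[CheckFailure], queues: Queues) -> str:
    """Queue every failure by pass type; the document is neither rejected nor passed."""
    if not failures:
        return "export_ready"
    for failure in failures:
        queues[failure.pass_type].append((invoice_id, failure))
    return "held"


queues: Queues = defaultdict(list)
failure = CheckFailure(PassType.SYNTACTIC, "grand_total_mismatch", "414.00", "413.00")

print(route("INV-1", [], queues))         # export_ready
print(route("INV-2", [failure], queues))  # held
print(RESOLUTION[failure.pass_type])      # operator correction or vendor re-invoice request
```

Audit logging of the operator's resolution is not modelled here.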