Number7AI — Docs

Production benchmarks

Benchmarks from real AP deployments on Indian document sets — not controlled demo conditions. We report the metrics that matter for operations: straight-through processing, ERP-safe accuracy, exception volume, and cycle time.

Last updated: April 2026

TL;DR

  • STP improved from the ~60% industry baseline to 90%+ under documented deployment conditions on Indian AP.
  • Field accuracy reached 99.5% in production; 98.7% on mixed pilot datasets with diverse layouts.
  • Error volume dropped ~90% in a high-volume BPO AP workflow.
  • Median upload-to-export-ready time: ~4.2 minutes including validation.

Why we measure this, not OCR accuracy

Headline OCR accuracy tells you how many characters were recognised correctly. It does not tell you how many invoices were posted correctly. A document can have 99.8% character accuracy and still produce a wrong ERP record if unit price and quantity are in the wrong columns.

The metrics below reflect operational reality: how many documents flow end-to-end without human touch, how many errors reach the exception queue, and how fast a document moves from upload to export-ready state.
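The distinction above can be made concrete with a toy example. This is a hypothetical illustration (the field names and values are invented, not from any deployment): every character is read correctly, yet the record is wrong because two values landed in the wrong fields.

```python
# Hypothetical: columns were swapped during extraction.
extracted = {"unit_price": "120", "quantity": "4.50"}
ground_truth = {"unit_price": "4.50", "quantity": "120"}

# Character-level accuracy: the same characters were recognised,
# so a character-accuracy metric scores this as perfect.
chars_ok = sorted("".join(extracted.values())) == sorted("".join(ground_truth.values()))

# Field-level accuracy: what would actually post to the ERP.
fields_ok = all(extracted[k] == ground_truth[k] for k in ground_truth)

print(chars_ok, fields_ok)  # True False
```

Character accuracy is perfect; the ERP record is still wrong. This is why the tables below report field-level and workflow-level numbers.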

Core processing metrics

| Metric | Industry baseline | AIdaptIQ (observed) |
| --- | --- | --- |
| Straight-through processing rate | ~60% | 90%+ |
| AP field accuracy (production) | Varies by template and quality | 99.5% |
| Accuracy on mixed pilot dataset | Manual baseline | 98.7% |
| Error volume at scale (BPO) | ~2,500/month | <250/month |

Speed and throughput

| Stage | Observed time | Notes |
| --- | --- | --- |
| Single invoice extraction | <30 seconds | Including validation pass |
| Bulk PDF (10–40 invoices) | 2–5 minutes | Includes boundary detection |
| Upload → export-ready state | Median ~4.2 min | Extraction + validation; excludes approval SLA |
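As a sketch of how the median cycle time is derived: take per-invoice upload-to-export-ready durations and report the median rather than the mean, so a few slow outliers do not distort the figure. The sample values below are illustrative, not measured data.

```python
from statistics import median

# Hypothetical per-invoice cycle times in minutes (upload -> export-ready).
# Approval time is excluded, matching the scope stated in the table.
cycle_minutes = [3.1, 4.2, 2.8, 9.7, 4.5, 3.9, 5.0]

# The 9.7-minute outlier barely moves the median, unlike a mean.
print(round(median(cycle_minutes), 1))  # 4.2
```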

Residual failure rates by class

Rates after AIdaptIQ processing on Indian production documents. Residual failures are routed to exception queues — they do not silently post.

| Failure class | Residual range | Disposition |
| --- | --- | --- |
| Multi-row table continuity | ~1.5–2% | Flagged for review |
| Locale/format numeric | <0.5% to ~5% | Auto-corrected (known formats) or flagged |
| Tax amount inference | ~8% | Flagged with context |
| Vendor identity mismatch | ~2–5% | Flagged for master match |
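The disposition column implies a simple routing rule: failures either auto-correct (when the fix is unambiguous) or land in an exception queue, and nothing posts silently. The sketch below illustrates that rule only; the class names and the set of auto-correctable classes are assumptions for illustration, not the product's actual API.

```python
# Illustrative routing: only unambiguous, known-format numeric issues
# auto-correct; everything else is flagged rather than silently posted.
AUTO_CORRECTABLE = {"locale_numeric_known_format"}

def disposition(failure_class: str) -> str:
    if failure_class in AUTO_CORRECTABLE:
        return "auto-correct"
    return "flag-for-review"  # default: route to the exception queue

print(disposition("tax_amount_inference"))        # flag-for-review
print(disposition("locale_numeric_known_format"))  # auto-correct
```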

Methodology and caveats

  • What is included

    Metrics combine production deployments and pilot datasets across Indian AP workflows. Accuracy is field-level where possible, not just document-level pass/fail.

  • Cycle time scope

    Cycle-time values include extraction and validation. Approval timing varies by customer policy and is not included in the median.

  • When numbers will be lower

    Mixed-language, multi-currency documents and low-quality scans can lower first-pass confidence and increase exception-queue volume.

  • How to compare fairly

    Use your own production documents, not demo samples. Measure STP, silent error rate, and exception volume — not just headline accuracy.
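The three comparison metrics named above can be computed from a pilot batch with a few lines. This is a minimal sketch under assumed per-document flags (`touched`, `flagged`, `wrong` are invented names for illustration); "silent errors" are wrong values that were never flagged, which is the costliest category.

```python
# Hypothetical pilot results, one record per document.
docs = [
    {"touched": False, "flagged": False, "wrong": False},  # straight-through
    {"touched": False, "flagged": False, "wrong": True},   # silent error
    {"touched": True,  "flagged": True,  "wrong": False},  # exception, fixed
]

n = len(docs)
stp_rate = sum(not d["touched"] for d in docs) / n
exception_volume = sum(d["flagged"] for d in docs)
silent_error_rate = sum(d["wrong"] and not d["flagged"] for d in docs) / n

print(stp_rate, exception_volume, silent_error_rate)
```

A vendor comparison on these three numbers, run over your own production documents, tells you more than any headline accuracy figure.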

FAQ

Do these numbers include validation or only extraction?
They reflect workflow-level performance combining extraction, validation, and exception handling, not raw OCR accuracy.
Is 99.5% guaranteed for every document set?
No. It is observed in specific AP conditions on Indian documents. Complexity and scan quality affect outcomes. Run your own pilot to get numbers for your corpus.
What should I compare first?
Compare STP, residual error volume, and time-to-export-ready on your real documents — not sanitised demo samples.