Number7AI — Docs

Production benchmarks

Benchmarks from real accounts payable (AP) deployments on Indian document sets, not controlled demo conditions. We report the metrics that matter for operations: straight-through processing, ERP-safe accuracy, exception volume, and cycle time.

Last updated: April 2026

TL;DR

Straight-through processing (STP) rose from an industry baseline of ~60% to 90%+ in documented deployments on Indian AP documents.
Field accuracy reached 99.5% in production; 98.7% on mixed pilot datasets with diverse layouts.
Error volume dropped ~90% in a high-volume BPO AP workflow.
Median upload-to-export-ready time: ~4.2 minutes including validation.

Why we measure this, not OCR accuracy

Headline OCR accuracy tells you how many characters were recognised correctly. It does not tell you how many invoices were posted correctly. A document can have 99.8% character accuracy and still produce a wrong ERP record if unit price and quantity are in the wrong columns.
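The column-swap case can be illustrated with a small Python sketch. The field names, the values, and both scoring functions are hypothetical and chosen only for illustration; they are not the scoring used in these benchmarks.

```python
from difflib import SequenceMatcher

# Hypothetical ground truth for one invoice line item.
truth = {"quantity": "12", "unit_price": "450.00", "line_total": "5400.00"}

# Hypothetical extraction: every character is read correctly,
# but quantity and unit price have landed in each other's columns.
extracted = {"quantity": "450.00", "unit_price": "12", "line_total": "5400.00"}


def char_accuracy(truth: dict, extracted: dict) -> float:
    """Similarity of the recognised text, ignoring which field each value was assigned to."""
    a = " ".join(sorted(truth.values()))
    b = " ".join(sorted(extracted.values()))
    return SequenceMatcher(None, a, b).ratio()


def field_accuracy(truth: dict, extracted: dict) -> float:
    """Share of fields whose extracted value matches the ground truth for that same field."""
    correct = sum(1 for name in truth if extracted.get(name) == truth[name])
    return correct / len(truth)


char_acc = char_accuracy(truth, extracted)    # 1.0: no character was misread
field_acc = field_accuracy(truth, extracted)  # 1/3: only line_total is usable
```

Because 12 x 450.00 and 450.00 x 12 both equal 5400.00, a simple line-total arithmetic check would not catch this swap either, which is why field-level comparison matters.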

The metrics below reflect operational reality: how many documents flow end-to-end without human touch, how many errors reach the exception queue, and how fast a document moves from upload to export-ready state.
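As a rough sketch of how these three measures could be computed from per-document logs, assuming a hypothetical `DocRecord` structure (the field names and the sample values are illustrative only and are not benchmark data):

```python
from dataclasses import dataclass
from statistics import median
from typing import List


@dataclass
class DocRecord:
    # All field names are hypothetical; map them to whatever your workflow logs.
    human_touched: bool             # True if anyone edited or approved a field manually
    exceptions_raised: int          # items this document sent to the exception queue
    minutes_to_export_ready: float  # upload -> export-ready, excluding approval


def summarize(records: List[DocRecord]) -> dict:
    """Compute STP rate, exception volume, and median cycle time for a batch."""
    total = len(records)
    stp = sum(1 for r in records if not r.human_touched and r.exceptions_raised == 0)
    return {
        "stp_rate": stp / total,
        "exception_volume": sum(r.exceptions_raised for r in records),
        "median_cycle_min": median(r.minutes_to_export_ready for r in records),
    }


# Made-up sample batch, for illustration only.
records = [
    DocRecord(False, 0, 2.5),
    DocRecord(False, 0, 3.0),
    DocRecord(True, 2, 8.0),
    DocRecord(False, 0, 3.5),
    DocRecord(False, 0, 6.0),
]
summary = summarize(records)
```

A median is used for cycle time so that a few slow documents, such as the one that needed manual review here, do not dominate the figure.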

90%+ straight-through processing (vs ~60% baseline)
99.5% field accuracy in production (on Indian AP documents)
~90% error volume reduction (in BPO AP workflow)
~4.2 min median cycle time (upload → export-ready)

Core processing metrics

Metric	Industry baseline	AIdaptIQ (observed)
Straight-through processing rate	~60%	90%+
AP field accuracy (production)	Varies by template and quality	99.5%
Accuracy on mixed pilot dataset	Manual baseline	98.7%
Error volume at scale (BPO)	~2,500/month	<250/month
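The BPO row can be checked against the ~90% reduction quoted elsewhere on this page. A minimal sketch of the arithmetic, with `reduction_pct` as a hypothetical helper name:

```python
def reduction_pct(before: float, after: float) -> float:
    """Percentage drop from `before` to `after`."""
    return 100 * (before - after) / before


# Table values: ~2,500 errors/month at baseline, <250/month observed.
# Using 250 as the upper bound gives the minimum reduction implied by the table.
min_reduction = reduction_pct(2500, 250)
```

Since the observed volume is below 250 per month, the implied reduction is at least this value.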

Speed and throughput

Stage	Observed time	Notes
Single invoice extraction	<30 seconds	Including validation pass
Bulk PDF (10–40 invoices)	2–5 minutes	Includes boundary detection
Upload → export-ready state	Median ~4.2 min	Extraction + validation; excludes approval SLA

Residual failure rates by class

Rates after AIdaptIQ processing on Indian production documents. Residual failures are routed to exception queues — they do not silently post.

Failure class	Residual range	Disposition
Multi-row table continuity	~1.5–2%	Flagged for review
Locale/format numeric	<0.5% to ~5%	Auto-corrected (known) or flagged
Tax amount inference	~8%	Flagged with context
Vendor identity mismatch	~2–5%	Flagged for master match
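To make the "auto-corrected (known) or flagged" disposition for locale/format numerics concrete, here is a minimal sketch. It is not AIdaptIQ's implementation: the set of "known" formats, the function name `normalize_amount`, and the disposition labels are all assumptions made for illustration.

```python
import re
from decimal import Decimal
from typing import Optional, Tuple

# Grouping styles treated as "known" in this sketch (an assumption):
#   plain digits      125000.50
#   Western grouping  125,000.50
#   Indian grouping   1,25,000.50  (lakh/crore style)
KNOWN_FORMAT = re.compile(
    r"\d+(\.\d+)?"
    r"|\d{1,3}(,\d{3})+(\.\d+)?"
    r"|\d{1,2}(,\d{2})*,\d{3}(\.\d+)?"
)


def normalize_amount(raw: str) -> Tuple[Optional[Decimal], str]:
    """Return (value, disposition); unknown formats are flagged, never guessed."""
    text = raw.strip()
    if KNOWN_FORMAT.fullmatch(text):
        return Decimal(text.replace(",", "")), "auto_corrected"
    return None, "flagged"


indian = normalize_amount("1,25,000.50")   # known Indian grouping
western = normalize_amount("125,000.50")   # known Western grouping
unknown = normalize_amount("1.250,00")     # decimal-comma style: flagged for review
```

A value such as 1,250 is read as one thousand two hundred fifty in this sketch; a real deployment would need vendor or locale context to rule out a decimal comma.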

Methodology and caveats

What is included
Metrics combine production deployments and pilot datasets across Indian AP workflows. Accuracy is field-level where possible, not just document-level pass/fail.
Cycle time scope
Cycle-time values include extraction and validation. Approval timing varies by customer policy and is not included in the median.
When numbers will be lower
Mixed-language documents, multi-currency documents, and low-quality scans can lower first-pass confidence, which in turn raises exception queue volume.
How to compare fairly
Use your own production documents, not demo samples. Measure STP, silent error rate, and exception volume — not just headline accuracy.
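One way to run such a comparison on a labelled pilot set is sketched below. The record layout and the definition of a silent error used here (an unflagged document whose extracted fields differ from ground truth) are assumptions for illustration, and the sample documents are made up.

```python
from typing import Dict, List


def compare_on_pilot(docs: List[Dict]) -> Dict[str, float]:
    """Each doc is a dict with 'flagged' (bool), 'extracted' and 'truth' (field dicts)."""
    total = len(docs)
    flagged = sum(1 for d in docs if d["flagged"])
    # Silent error: wrong output that was NOT flagged, so it would post unnoticed.
    silent = sum(1 for d in docs if not d["flagged"] and d["extracted"] != d["truth"])
    return {
        "stp_rate": (total - flagged) / total,
        "exception_volume": flagged,
        "silent_error_rate": silent / total,
    }


# Made-up pilot documents, for illustration only.
pilot = [
    {"flagged": False, "extracted": {"total": "100.00"}, "truth": {"total": "100.00"}},
    {"flagged": False, "extracted": {"total": "250.00"}, "truth": {"total": "250.00"}},
    {"flagged": True,  "extracted": {"total": "19.00"},  "truth": {"total": "79.00"}},
    {"flagged": False, "extracted": {"total": "61.00"},  "truth": {"total": "67.00"}},
]
result = compare_on_pilot(pilot)
```

In this sample the last document counts toward STP even though it is wrong, which is why the silent error rate should be read alongside STP rather than after it.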

FAQ

Do these numbers include validation or only extraction?: They include validation. The figures describe workflow performance across extraction, validation, and exception handling, not raw OCR output.
Is 99.5% guaranteed for every document set?: No. It was observed under specific AP conditions on Indian documents, and document complexity and scan quality affect outcomes. Run your own pilot to get numbers for your corpus.
What should I compare first?: Compare STP, residual error volume, and time-to-export-ready on your real documents — not sanitised demo samples.