Read Every Document.
Once.
Invoices, contracts, claims, KYC packs, expense receipts, supplier forms. Someone on your team is still opening them, reading them, and retyping the bits that matter into another system. We replace that step with software that reads the document once and posts the data everywhere it needs to go.
$12.88
ALL OTHERS COST PER INVOICE (ARDENT, 2025)
17.4 days
ALL OTHERS INVOICE CYCLE (ARDENT, 2025)
Days
TO FIRST PIPELINE IN PRODUCTION
OCR isn't enough anymore.
The old version of document automation usually meant template rules, supplier-specific workflows and a support contract for every edge case. It worked on tidy invoices. It fell over on mixed packs, bad scans and anything with tables, stamps or handwritten notes.
Vision-language models suit messy documents because they read the page as an image and return structured data. Claude, Gemini, GPT and open-weight models still need testing on your own files, but the engineering problem has shifted from "can we read this page?" to "can we validate and post the result safely?"
That second question is the one worth building properly.
YESTERDAY'S OCR STACK
- A template per supplier, breaks on a logo change
- Confidence scores nobody trusts
- Tables and stamps it can't see
- A human reviewing every line anyway
- Enterprise licences, locked-in contracts
VLM-BASED PIPELINE
- One pipeline, tested layouts, fewer templates
- Field-level checks before anything posts
- Tables, signatures and stamps checked
- Humans only see the edge cases
- Document-level cost and accuracy tracking
Every document pipeline. Same five parts.
Whether it's an invoice from a builder, a referral letter from a GP or a 60-page lease, the work is the same five steps. Once you can see the parts, you can build them once.
Intake
A shared inbox, a Dropbox folder, an upload widget, an EDI feed. Wherever the documents already arrive. We don't ask your suppliers to change.
Read
A vision model extracts the fields you asked for. Totals, dates, parties, line items, VAT, signatures. As structured data with a confidence score per field.
Check
Totals add up. VAT number checks pass. Supplier exists. PO is open. Date is in range. The rules your team already runs in their head, written down once.
Review
The clean ones post through on their own. The edge cases land in a review queue with the document and the extraction side by side. One click to accept, fix or reject.
Post
Into Xero, Sage, NetSuite, Salesforce, your CRM, your warehouse, your bespoke app. Whatever the document was going to become, it becomes it. With a full audit trail.
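As a rough illustration of how Read, Check and Review fit together, here is a minimal sketch. It assumes the model returns a confidence between 0 and 1 for each field; `Field`, `route`, `supplier_on_file` and the 0.90 threshold are hypothetical names and values for illustration, not a fixed design.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Field:
    value: str
    confidence: float  # assumed to be 0.0 to 1.0, as reported by the model


# Hypothetical shapes: an extraction maps field names to Field objects,
# and a check returns an error message or None when the rule passes.
Extraction = dict[str, Field]
Check = Callable[[Extraction], Optional[str]]

MIN_CONFIDENCE = 0.90  # illustrative threshold, tuned per document type


def route(extraction: Extraction, checks: list[Check]) -> tuple[str, list[str]]:
    """Return ("post", []) for clean documents, else ("review", reasons)."""
    reasons = [
        f"low confidence on {name}"
        for name, field in extraction.items()
        if field.confidence < MIN_CONFIDENCE
    ]
    for check in checks:
        problem = check(extraction)
        if problem is not None:
            reasons.append(problem)
    return ("review", reasons) if reasons else ("post", [])


def supplier_on_file(extraction: Extraction) -> Optional[str]:
    known = {"Acme Building Ltd"}  # stand-in for a real supplier lookup
    supplier = extraction["supplier"].value
    return None if supplier in known else f"unknown supplier: {supplier}"
```

A document only posts when every field clears the threshold and every check passes. Anything else arrives in the review queue with its reasons attached.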
The documents that eat your team's week.
Anything that arrives as a PDF, scan, photo or email attachment and ends up in someone's inbox to be typed up. The ones we replace most often:
Supplier invoices & credit notes
PO matching, VAT validation, GL coding, posting to Xero or Sage. Usually the first finance workflow worth inspecting.
Contracts & MSAs
Parties, term, renewal dates, liability caps, payment terms. Pulled into a register your legal and ops teams can actually search.
KYC, KYB & right-to-work
Passports, driving licences, share registers, Companies House filings. Captured, checked and retained against your policy and sector rules.
Insurance claims & FNOL
First notice of loss forms, repair quotes, medical reports, photos. Triaged, summarised, attached to the right claim file.
Customs, freight & PODs
Commercial invoices, packing lists, C88 / CDS entries, proof of delivery. Tied back to shipments in your WMS.
Clinical letters & referrals
NHS referrals, discharge summaries, lab results. Structured into the right record, with the bits that need a clinician's eyes flagged for them.
Where we come in.
Audit your document flow, build the first pipeline, then add the next one. No "platform" to buy. No per-page pricing that scales with your growth.
You own the code, the prompts, the model choice and the data. We can host it on your stack or ours. If your AP team is on Xero today, they're still on Xero tomorrow.
BOOK A DOCUMENT AUDIT
Document audit
Send us a week's worth of the documents that hurt most. We map who touches them, where they go, what gets retyped and why. Scope and price up front, no slide decks.
Pick the model, write the schema
We choose the model that wins on your document type (Claude, Gemini, GPT or open-weight), define the fields you actually need, and write the rules a reviewer would use to accept or reject. You see real extractions on your real documents before we wire anything up.
Launch the first pipeline
Intake, read, check, review queue, post. Running on your real volume, with your team in the loop. We run it alongside the manual process until the numbers match.
Watch, learn, add the next one
Every reviewer correction feeds the next prompt, rule or schema change. We track accuracy per field, cost per document, and where humans still get pulled in. Once it's steady, we move on to the next document type.
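Per-field accuracy can be derived from reviewer corrections along these lines. This is a sketch only: it assumes each review is stored as the extracted fields plus a dictionary of just the fields the reviewer changed, and `field_accuracy` is a hypothetical helper name.

```python
from collections import defaultdict


def field_accuracy(reviews):
    """Share of reviewed documents where each extracted field was left unchanged.

    reviews: iterable of (extracted, corrected) pairs, where `corrected`
    holds only the fields the reviewer changed (assumed storage format).
    """
    seen = defaultdict(int)
    kept = defaultdict(int)
    for extracted, corrected in reviews:
        for name, value in extracted.items():
            seen[name] += 1
            # A field counts as correct when the reviewer did not change it.
            if corrected.get(name, value) == value:
                kept[name] += 1
    return {name: kept[name] / seen[name] for name in seen}
```

Note that this only measures documents a human actually looked at, so a sample of auto-posted documents would need spot checks to give the full picture.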
Built to pass an audit.
Documents carry the most sensitive data your business handles: National Insurance numbers, bank details, medical history, contract terms. Sending that material to a generic scanner without a DPIA, a processor contract and access controls is asking for trouble.
Pipelines log every extraction, reviewer decision and model version. Personal data stays in the regions you approve. Anything sent to a model is redacted where possible, run under agreed processing terms, or routed to a self-hosted model. Subject access requests start from searchable logs, not inbox archaeology.
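A log entry for one document might look something like the sketch below. The field names, the `audit_record` helper and the example model version string are assumptions for illustration; the real shape depends on your retention policy and what the DPIA allows you to store.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(document_bytes, model_version, extraction,
                 reviewer=None, decision="auto_posted"):
    """Build one append-only log entry for a processed document."""
    return {
        # A hash identifies the exact file without copying it into the log.
        "document_sha256": hashlib.sha256(document_bytes).hexdigest(),
        "model_version": model_version,
        "extraction": extraction,
        "reviewer": reviewer,      # None when no human was involved
        "decision": decision,      # e.g. "auto_posted", "accepted", "rejected"
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }


# One JSON line per document keeps the log easy to search later.
line = json.dumps(
    audit_record(b"%PDF-1.7 example", "example-model-v1", {"total": "120.00"}),
    sort_keys=True,
)
```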
UK GDPR & DPA 2018
DPIAs where processing is likely to be high risk, Article 22 checks for solely automated decisions with legal or similarly serious effects, lawful basis recorded per source. We follow the ICO's AI and data protection guidance.
Making Tax Digital
From 6 April 2026, MTD for Income Tax applies to sole traders and landlords with more than £50,000 in qualifying income. HMRC requires compatible software to create digital records of business income and expenses.
Consumer Duty & accounts rules
For financial services firms and law firms: review queues for customer-facing decisions, evidence for Consumer Duty outcome monitoring, and records that match the SRA's accounts-rule expectation for accurate chronological records.
Signatures & seals
Signature detection, advanced and qualified electronic signature support, plus an audit trail aligned with UK eIDAS trust-service concepts for signatures, seals and timestamps.
Why the off-the-shelf tools aren't enough.
There are public examples of what goes wrong when documents and AI meet without proper controls. Three worth knowing.
Samsung's three AI leaks.
Samsung semiconductor staff reportedly pasted source code, test sequences for identifying defects and meeting content into ChatGPT. Media reports put the three incidents inside twenty days. Samsung restricted generative AI on company devices in May 2023.
$12.88 per invoice. Still.
Ardent Partners' AP Metrics That Matter in 2025 puts invoice processing cost at $12.88 for all others and $2.78 for its top 20% group. Processing time is 17.4 days for all others and 3.1 days for that top group.
ICO AI guidance, in writing.
The ICO's AI and data protection guidance says AI projects need DPIA thinking, lawful basis, transparency, fairness and accuracy work. The ICO also flags Article 22 checks when a system makes solely automated decisions with legal or similarly serious effects.
Sources checked: Ardent Partners' AP Metrics That Matter in 2025, ICO guidance on AI and data protection, GOV.UK MTD guidance, FCA Consumer Duty guidance, SRA Accounts Rules, GOV.UK UK eIDAS guidance, CNBC / Bloomberg and trade press coverage of the Samsung incident.
The ones we get asked first.
What about hallucinations? I can't have made-up totals on an invoice.
Right. Two things stop it. First, we extract with a confidence score per field and bounce anything low into the review queue. Second, we run deterministic checks: line totals add up, VAT matches the rate, PO exists, supplier on file. If a number doesn't reconcile, the pipeline holds the document for a human. The AI never gets the last word on money.
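To make the deterministic part concrete, here is a minimal sketch of an invoice reconciliation check. It assumes a single flat VAT rate and amounts already parsed into `Decimal`; `reconcile` and its parameters are hypothetical names, and real invoices with mixed rates or discounts need more rules than this.

```python
from decimal import Decimal, ROUND_HALF_UP

PENNY = Decimal("0.01")


def reconcile(lines, net, vat, gross, vat_rate=Decimal("0.20")):
    """Return a list of problems; an empty list means the figures reconcile.

    lines: list of (quantity, unit_price) Decimal pairs.
    vat_rate: assumed flat rate applied to the whole invoice.
    """
    problems = []
    line_sum = sum((q * p for q, p in lines), Decimal("0"))
    line_sum = line_sum.quantize(PENNY, rounding=ROUND_HALF_UP)
    if line_sum != net:
        problems.append(f"line items sum to {line_sum}, net says {net}")
    expected_vat = (net * vat_rate).quantize(PENNY, rounding=ROUND_HALF_UP)
    # Allow a penny either way for supplier-side rounding.
    if abs(expected_vat - vat) > PENNY:
        problems.append(f"VAT {vat} does not match {expected_vat}")
    if net + vat != gross:
        problems.append(f"net + VAT is {net + vat}, gross says {gross}")
    return problems
```

If the list comes back non-empty, the document is held for a human rather than posted, however confident the model was about the individual numbers.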
Why not just use Xero / Dext / Hubdoc / Rossum / Klippa?
For straight supplier invoices into a single accounting package, they're fine. We've recommended them. We reach for bespoke when the documents are unusual (claims forms, lease packs, NHS letters), the downstream system isn't on the integration list, or you need fields the tool doesn't extract.
Where does the document data actually go? Who sees it?
You decide, document type by document type. Sensitive ones run through model endpoints in the UK or EU under a data processing agreement, or against a self-hosted open-weight model. Less sensitive ones can use a hosted frontier model under agreed processing terms. We write it into the DPIA before we touch your data.
How long until it's live and earning back?
First pipeline in days, depending on document complexity and how clean your downstream integrations are. We run it in parallel with the manual process for a stretch so you can see the numbers match before anyone changes their day job.
What does it cost per document?
We price it after benchmarking your documents. Model cost, review time, failure rate and downstream integration work all matter. Ardent's 2025 accounts payable report puts invoice processing cost at $12.88 for all others and $2.78 for its top 20% group, which is the gap we use for the first business case.
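The arithmetic behind that gap is simple enough to sketch. The two per-invoice figures are Ardent's; the monthly volume is a made-up input, and `annual_gap` is a hypothetical helper. The result is the benchmark gap in US dollars, not a quote or a promised saving.

```python
from decimal import Decimal

ALL_OTHERS = Decimal("12.88")  # Ardent 2025, cost per invoice, all others
TOP_GROUP = Decimal("2.78")    # Ardent 2025, cost per invoice, top 20% group


def annual_gap(invoices_per_month):
    """Yearly cost difference between the two Ardent groups at a given volume."""
    return (ALL_OTHERS - TOP_GROUP) * invoices_per_month * 12


# Illustrative volume only: 1,000 invoices a month.
print(annual_gap(1000))  # 121200.00
```

The per-invoice gap is $10.10, so at 1,000 invoices a month the benchmark difference is $121,200 a year. Your own figure comes from the benchmark on your documents.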
What about handwriting? Stamps? Foreign-language invoices?
Sometimes handled, sometimes not. Handwriting, stamps and foreign-language invoices vary wildly by scan quality and format, so we benchmark on your real inputs before promising a number. The bad ones go to the review queue; nothing posts on a guess.
Does my team need to use a new app?
Almost never. The intake stays where it is (email inbox, shared drive, portal upload). The data lands in your existing systems. The only new thing is a small review screen for the documents that need a human eye.
Who owns it once it's built?
You do. Source code, prompts, evals, extracted data. We can host and run it for you, or hand it to your team. No platform lock-in, no per-document pricing that punishes you for growing.
Got a stack of documents nobody wants to type up?
Send us a week's worth. You'll come away with a clear answer on which to automate first, what's safe to leave alone, and what a real pipeline would look like. Thirty minutes, no slides.