Enterprise Operations · Case Study
$1.2M Recovered by Auditing Every P-Card Transaction
Manual sample audit is the standard at most enterprises. It catches the obvious problems and misses everything underneath. We built an AI pipeline that reads every transaction, classifies the spend, flags exceptions against policy, and surfaces patterns no human team can see at that volume.
Representative example. Client name and some specifics have been generalized for privacy.
$1.2M
Projected annual recoveries
100%
Transaction coverage (vs ~1% manual sample)
−75%
Manual audit review hours
6 wks
Build to live deployment
One of the largest operations organizations in the world processes procurement card transactions at a volume no human audit team can fully review. Millions of line items per year, across thousands of cardholders, across dozens of policy categories. The internal audit function had developed a sampling protocol — pull a representative one percent of transactions per quarter, audit those manually, extrapolate findings, file a report. It was the standard approach for an enterprise of that scale. It also meant ninety-nine percent of the transaction base was effectively unreviewed.
The cost of that gap was opaque. Sampling produces a confidence interval, not a guarantee. Misclassified transactions, policy violations, missed cost-center recoveries, and unallowable spend all sit in the unsampled volume. None of them generate a flag. The auditor general's office knew this was a structural blind spot. The capacity to do anything about it didn't exist.
The Problem Beneath the Problem
P-Card audit isn't hard work — it's structured work at impossible scale. The auditor reads the transaction line, looks up the policy category, checks the merchant against the approved-vendor list, verifies the cost center coding, and flags anything off. That sequence repeats per transaction. A skilled auditor can review maybe two hundred transactions per day, sustained, before fatigue degrades accuracy.
At the volume this organization processes, audit was capped not by human judgment but by human throughput. The judgment was simple. The throughput was the bottleneck. That is exactly the shape of work AI handles better than humans, at a unit cost that doesn't scale linearly with volume.
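The throughput gap is easy to quantify. Taking an illustrative volume of two million transactions per year (consistent with "millions of line items per year" above, not a figure from the engagement) and the sustained rate of roughly 200 reviews per auditor-day, a back-of-envelope calculation:

```python
# Illustrative throughput math. The 2M annual volume and 250 working days
# are assumptions for the sketch; the 200/day rate comes from the text.
TRANSACTIONS_PER_YEAR = 2_000_000
REVIEWS_PER_AUDITOR_DAY = 200   # sustained rate before fatigue degrades accuracy
WORKING_DAYS_PER_YEAR = 250

auditor_days_for_full_review = TRANSACTIONS_PER_YEAR / REVIEWS_PER_AUDITOR_DAY
auditors_needed = auditor_days_for_full_review / WORKING_DAYS_PER_YEAR
sample_reviewed = TRANSACTIONS_PER_YEAR * 0.01  # the ~1% sampling protocol

print(auditor_days_for_full_review)  # 10000.0 auditor-days for full coverage
print(auditors_needed)               # 40.0 full-time auditors, doing nothing else
print(sample_reviewed)               # 20000.0 transactions actually reviewed
```

Forty full-time auditors to achieve what a pipeline does as a daily batch job: that is the shape of the bottleneck.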
What Got Built
A multi-stage AI pipeline that ingested every P-Card transaction from the upstream ledger system on a daily cadence. Each transaction passed through four classification layers: merchant category against the approved-vendor list, spend category against policy taxonomy, cost-center coding against the requestor's department, and anomaly detection against the cardholder's historical pattern. Anything that failed a check was flagged with the specific failure reason, the relevant policy section, and a recommended action — re-code, escalate, recover, or accept with note.
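The first three rule layers can be sketched as a simple check sequence. Everything here — the vendor list, policy sections, cost-center mapping, and field names — is invented for illustration; the production classifiers and reference data are the client's own. The statistical anomaly layer is omitted for brevity.

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    merchant: str
    category: str        # spend category as coded on the card feed
    cost_center: str
    cardholder_dept: str
    amount: float

@dataclass
class Flag:
    reason: str
    policy_section: str
    action: str          # re-code | escalate | recover | accept-with-note

# Hypothetical reference data; the real lists live in upstream systems.
APPROVED_VENDORS = {"OfficeCo", "TravelHub"}
POLICY_TAXONOMY = {"supplies": "POL-4.2", "travel": "POL-7.1"}
DEPT_COST_CENTERS = {"ops": {"CC-100", "CC-101"}, "sales": {"CC-200"}}

def classify(txn: Transaction) -> list[Flag]:
    """Run one transaction through the rule layers; return every failure."""
    flags = []
    # Layer 1: merchant against the approved-vendor list
    if txn.merchant not in APPROVED_VENDORS:
        flags.append(Flag("non-approved vendor", "POL-2.0", "escalate"))
    # Layer 2: spend category against the policy taxonomy
    if txn.category not in POLICY_TAXONOMY:
        flags.append(Flag("unknown spend category", "POL-1.1", "re-code"))
    # Layer 3: cost-center coding against the requestor's department
    allowed = DEPT_COST_CENTERS.get(txn.cardholder_dept, set())
    if txn.cost_center not in allowed:
        flags.append(Flag("cost center outside department", "POL-5.3", "re-code"))
    return flags

txn = Transaction("UnknownMart", "supplies", "CC-999", "ops", 84.50)
for f in classify(txn):
    print(f.reason, "->", f.action)
```

The point of the sketch is the shape, not the rules: each layer is cheap, deterministic, and attaches the specific failure reason and recommended action, so the daily exception report is actionable line by line.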
The pipeline produced a daily exception report and a monthly recovery summary. The audit team's role shifted from "sample and review" to "review the flagged exceptions" — an order of magnitude smaller workload, applied to the transactions where review actually mattered.
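The anomaly layer — flagging a transaction against the cardholder's historical pattern — can be approximated with a simple deviation check. This z-score sketch and its threshold are assumptions for illustration; the source does not specify the production method.

```python
from statistics import mean, stdev

def is_anomalous(history: list[float], amount: float,
                 z_threshold: float = 3.0) -> bool:
    """Flag an amount that sits far outside the cardholder's historical spend."""
    if len(history) < 2:
        return False  # not enough history to judge against
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return amount != mu
    return abs(amount - mu) / sigma > z_threshold

history = [42.0, 55.0, 48.0, 51.0, 44.0]
print(is_anomalous(history, 50.0))   # False — typical spend for this cardholder
print(is_anomalous(history, 900.0))  # True  — far outside the pattern
```

A flag from this layer is a prompt for human review, not a verdict — which is why every flagged item routes to the audit team for adjudication.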
The integration was deliberate. The pipeline ran inside the organization's existing infrastructure (cloud account, identity provider, data warehouse). No data left the environment. The audit team retained final adjudication authority on every flagged item — the AI surfaces, the human decides. That posture matters at enterprise scale; the pipeline is an instrument, not a replacement for the auditor.
The build, including the four classifier layers, the policy taxonomy ingestion, and the integration into the daily audit workflow, took six weeks.
The Twelve-Month Modeled Outcome
Across the first twelve months of operation, the pipeline identified a projected $1.2 million in annual recoveries — a figure built from recovered tax exemptions, identified policy violations, vendor consolidation savings, and cost-center re-allocations that improved the accuracy of internal expense attribution.
The recoveries fell into four categories, each material in its own right.
Recovered tax-exempt purchases: a meaningful share of the transaction base was eligible for sales tax exemption that the manual audit, working from samples, had been missing. The pipeline caught these at the line-item level and flagged them for refund processing.
Identified out-of-policy spend: a long tail of small policy violations — cardholders making purchases that fell outside their approved categories — that individually were below any human audit threshold but collectively represented real leakage.
Vendor consolidation candidates: the pipeline surfaced patterns of similar purchases made through multiple non-preferred vendors when an approved vendor with negotiated pricing existed. Procurement converted these into renegotiated contracts.
Cost-center accuracy: re-coding suggestions that improved the accuracy of internal cost attribution, which directly affected departmental budget reporting and chargeback math.
The audit team's manual review hours dropped by roughly seventy-five percent. The hours that remained were spent on flagged exceptions and on adjudicating edge cases, not on the throughput work the pipeline now handled. Auditor satisfaction went up. The most common feedback was that the work the team was doing now actually required their judgment.
The Lesson
The P-Card audit case is the cleanest expression of what custom AI is good at: structured work, large volume, clear policy, where the cost of incomplete review is invisible until you do complete review and see what was hiding underneath. The math of the engagement was straightforward — the build cost is bounded, the recoveries scale with transaction volume, the payback is measured in months. The qualitative shift was that an audit function which had been operating on sampling-derived confidence intervals was now operating on full coverage.
Most enterprises have a structurally similar gap somewhere in their operations: a high-volume process subject to clear rules, where manual review is the bottleneck and the unreviewed remainder is silently expensive. AI doesn't solve every problem. It is exceptionally good at this specific shape of problem.
Maqro AI Services Used
Every engagement combines the specific services that address your highest-impact opportunities — not a predetermined package.
More Case Studies
Document Operations & Compliance
Enterprise compliance program
AI audit at 100% coverage, not a 1% sample. Two million documents reviewed in days, not quarters.
A multi-year document archive needed a full compliance audit against client policy. Manual sampling at one percent per quarter would have taken years and missed the long tail of policy gaps the sample didn't touch.
Operations & Customer Service
Multi-team operations org
"Where is my order" answered against live data — eighty percent self-served, three days saved per ops lead per week.
Ops leads were spending most of their week answering the same order-status question across email, chat, and voice — repeatedly, for the same set of customers and internal stakeholders.
Ready to be the next case study?
Book a free 45-minute AI audit. We’ll identify the highest-impact opportunity in your business and show you exactly what measurable results look like for your workflows.
Book Your Free AI Audit