The Challenge: Duplicate and Fraudulent Claims

Finance teams are under constant pressure to control spend, yet duplicate and fraudulent expense claims are often hidden in thousands of small invoices, receipts and card transactions. A reused taxi receipt here, a split hotel bill there, a slightly renamed vendor — individually they look harmless. At scale, they erode margins, weaken trust in expense policies and absorb countless hours of manual review.

Traditional controls rely on keyword rules, simple amount thresholds and sample-based audits. These approaches struggle with today’s volume and variety of expense data: scanned receipts in different languages, mixed corporate and personal spend on the same trip, subscription services billed in subtle ways. Human reviewers simply cannot read and cross-check every line item, and classic rule engines can be gamed once employees understand how they work.

The business impact goes beyond the direct financial loss. Weak expense fraud detection undermines your internal control system and exposes you during audits. Budget owners lose confidence in reported numbers, finance spends time firefighting exceptions instead of advising the business, and opportunities to optimise travel, procurement and subscription spend remain unseen. Competitors who automate this layer gain cleaner books, faster closes and better cost visibility.

Yet this challenge is solvable. Modern AI systems like Claude can read long expense reports, invoices and travel logs in context, detect patterns that humans miss, and flag suspicious claims before they are reimbursed. At Reruption, we’ve seen how applying an AI-first lens to document-heavy processes transforms control quality and team productivity. The rest of this page walks through how you can use Claude to bring the same rigour to duplicate and fraudulent claims in your finance function.

Need a sparring partner for this challenge?

Let's have a no-obligation chat and brainstorm together.


Our Assessment

A strategic assessment of the challenge, with high-level tips on how to tackle it.

From Reruption’s work building AI-powered document analysis and internal assistants, we’ve seen that Claude is particularly strong at handling long, messy financial artefacts: expense reports with many attachments, travel itineraries, internal expense policies and card statements. Our perspective is simple: used correctly, Claude for expense fraud detection becomes a tireless reviewer that cross-checks every claim against policies and patterns, while still producing explanations finance controllers can understand and challenge.

Treat AI as a Second Pair of Eyes, Not a Black Box Judge

The first strategic decision is to position Claude as a review assistant, not an autonomous decision-maker. In finance, control quality and auditability matter as much as speed. Claude should pre-screen expense reports, highlight potential duplicates and fraudulent claims, and surface structured reasons like “same receipt image used on these three dates” or “vendor not found in supplier master”. Final approval remains with a human controller.

This framing reduces organisational resistance and audit risk. Controllers stay in charge, but their attention is focused on the 5–10% of claims that Claude ranks as most suspicious. Over time, you can gradually increase automation for low-risk, repetitive cases once your team trusts the system’s behaviour and understands its limits.

Design Around Your Policies and Risk Appetite

Claude is most effective when it is configured around your actual expense policies, risk thresholds and approval workflows. Strategically, that means translating policy PDFs and scattered guidelines into clear, machine-readable rules and examples: what counts as a duplicate, how per-diem limits work, which vendors are considered high-risk, what documentation is required for each expense type.

Use Claude to interpret and normalise these policies, but define the “red lines” centrally: which violations automatically block reimbursement, which only trigger a comment, and which data should be written back into your ERP or expense management tool. This ensures AI-powered checks reinforce your internal control framework rather than creating a parallel system with conflicting logic.

Start with High-Volume, High-Ambiguity Categories

Not every expense category needs AI on day one. Strategically, focus Claude on high-volume, high-ambiguity spend where traditional rules underperform: travel and entertainment, subscriptions, miscellaneous reimbursements and vendor invoices from long-tail suppliers. These are exactly the areas where duplicate and fraudulent claims tend to hide.

By narrowing scope, you reduce implementation complexity and can demonstrate value quickly. Once you have proven detection improvements and controller acceptance in these categories, extend coverage to other areas such as mileage claims, training budgets or marketing expenses.

Prepare Teams for a Shift from Data Entry to Investigation

Introducing Claude in finance workflows changes the role of your people. Less time is spent on manual checks (e.g. “does this receipt match this line item?”) and more on investigative work: reviewing AI-flagged anomalies, asking follow-up questions, and refining policies. Strategically, you need to prepare your team for this shift in skill profile and mindset.

Invest in basic AI literacy, transparent training data and examples, and clear escalation paths for disputed cases. Make it explicit that the goal is to reduce low-value manual work, not headcount. When controllers see that AI helps them catch patterns they would have missed — like systematic receipt reuse by a specific cost centre — adoption becomes much easier.

Engineer for Traceability and Compliance from Day One

Finance leaders need assurance that any AI-based fraud detection is explainable, auditable and compliant with data protection rules. Strategically, that means designing your Claude integration to store the prompts, model outputs and key decision signals for each checked claim. This creates a traceable trail you can show to internal audit or external auditors.

Work with legal, compliance and IT security early to define data boundaries: which fields are sent to Claude, how PII is handled, where logs are stored, and who can access what. Reruption’s work on secure AI document analysis has shown that involving these stakeholders at the outset dramatically accelerates later approvals and reduces the risk of having to redesign the solution under regulatory pressure.
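As a sketch of such a traceable trail, each checked claim can be written to an append-only audit store. This is a minimal Python example under our own assumptions: the record shape, field names and the idea of hashing the prompt for tamper-evidence are illustrative, not a prescribed format.

```python
import datetime
import hashlib
import json

def audit_record(claim_id: str, prompt: str, model_output: str,
                 decision: str) -> dict:
    """Build one append-only audit entry for a checked claim.

    The prompt is stored as a SHA-256 digest so auditors can verify
    that the text on file was not altered after the decision.
    """
    return {
        "claim_id": claim_id,
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "model_output": model_output,
        "decision": decision,
    }

# Hypothetical claim ID and verdict, for illustration only.
rec = audit_record("EXP-2024-0042", "Check this expense report ...",
                   '{"overall_risk": "low"}', "approved")
print(json.dumps(rec, indent=2))
```

In practice the full prompt text would also be archived (subject to your retention and PII rules); the digest simply lets reviewers prove the archived copy matches what the model actually saw.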

Used with the right strategy, Claude becomes a scalable defence against duplicate and fraudulent expense claims — reading every document, cross-checking every pattern, and surfacing clear explanations your finance team can act on. Reruption specialises in turning that potential into concrete, secure workflows that match your policies, systems and risk appetite. If you want to test this in your own environment, our AI PoC format makes it straightforward to validate detection quality and effort before you invest in a full rollout.

Need help implementing these ideas?

Feel free to reach out to us with no obligation.

Real-World Case Studies

From Investment Banking to Banking: learn how companies successfully put AI to work.

Goldman Sachs

Investment Banking

In the fast-paced investment banking sector, Goldman Sachs employees grapple with overwhelming volumes of repetitive tasks. Daily routines like processing hundreds of emails, writing and debugging complex financial code, and poring over lengthy documents for insights consume up to 40% of work time, diverting focus from high-value activities like client advisory and deal-making. Regulatory constraints exacerbate these issues, as sensitive financial data demands ironclad security, limiting off-the-shelf AI use. Traditional tools fail to scale with the need for rapid, accurate analysis amid market volatility, risking delays in response times and competitive edge.

Solution

Goldman Sachs countered with a proprietary generative AI assistant, fine-tuned on internal datasets in a secure, private environment. This tool summarizes emails by extracting action items and priorities, generates production-ready code for models like risk assessments, and analyzes documents to highlight key trends and anomalies. Built from early 2023 proofs-of-concept, it leverages custom LLMs to ensure compliance and accuracy, enabling natural language interactions without external data risks. The firm prioritized employee augmentation over replacement, training staff for optimal use.

Results

  • Rollout Scale: 10,000 employees in 2024
  • Timeline: PoCs 2023; initial rollout 2024; firmwide 2025
  • Productivity Boost: Routine tasks streamlined, est. 25-40% time savings on emails/coding/docs
  • Adoption: Rapid uptake across tech and front-office teams
  • Strategic Impact: Core to 10-year AI playbook for structural gains
Read case study →

NYU Langone Health

Healthcare

NYU Langone Health, a leading academic medical center, faced significant hurdles in leveraging the vast amounts of unstructured clinical notes generated daily across its network. Traditional clinical predictive models relied heavily on structured data like lab results and vitals, but these required complex ETL processes that were time-consuming and limited in scope. Unstructured notes, rich with nuanced physician insights, were underutilized due to challenges in natural language processing, hindering accurate predictions of critical outcomes such as in-hospital mortality, length of stay (LOS), readmissions, and operational events like insurance denials. Clinicians needed real-time, scalable tools to identify at-risk patients early, but existing models struggled with the volume and variability of EHR data—over 4 million notes spanning a decade. This gap led to reactive care, increased costs, and suboptimal patient outcomes, prompting the need for an innovative approach to transform raw text into actionable foresight.

Solution

To address these challenges, NYU Langone's Division of Applied AI Technologies at the Center for Healthcare Innovation and Delivery Science developed NYUTron, a proprietary large language model (LLM) specifically trained on internal clinical notes. Unlike off-the-shelf models, NYUTron was fine-tuned on unstructured EHR text from millions of encounters, enabling it to serve as an all-purpose prediction engine for diverse tasks. The solution involved pre-training a 13-billion-parameter LLM on over 10 years of de-identified notes (approximately 4.8 million inpatient notes), followed by task-specific fine-tuning. This allowed seamless integration into clinical workflows, automating risk flagging directly from physician documentation without manual data structuring. Collaborative efforts, including AI 'Prompt-a-Thons,' accelerated adoption by engaging clinicians in model refinement.

Results

  • AUROC: 0.961 for 48-hour mortality prediction (vs. 0.938 benchmark)
  • 92% accuracy in identifying high-risk patients from notes
  • LOS prediction AUROC: 0.891 (5.6% improvement over prior models)
  • Readmission prediction: AUROC 0.812, outperforming clinicians in some tasks
  • Operational predictions (e.g., insurance denial): AUROC up to 0.85
  • 24 clinical tasks with superior performance across mortality, LOS, and comorbidities
Read case study →

Klarna

Fintech

Klarna, a leading fintech BNPL provider, faced enormous pressure from millions of customer service inquiries across multiple languages for its 150 million users worldwide. Queries spanned complex fintech issues like refunds, returns, order tracking, and payments, requiring high accuracy, regulatory compliance, and 24/7 availability. Traditional human agents couldn't scale efficiently, leading to long wait times averaging 11 minutes per resolution and rising costs. Additionally, providing personalized shopping advice at scale was challenging, as customers expected conversational, context-aware guidance across retail partners. Multilingual support was critical in markets like the US and Europe, but hiring multilingual agents was costly and slow. This bottleneck hindered growth and customer satisfaction in a competitive BNPL sector.

Solution

Klarna partnered with OpenAI to deploy a generative AI chatbot powered by GPT-4, customized as a multilingual customer service assistant. The bot handles refunds, returns, order issues, and acts as a conversational shopping advisor, integrated seamlessly into Klarna's app and website. Key innovations included fine-tuning on Klarna's data, retrieval-augmented generation (RAG) for real-time policy access, and safeguards for fintech compliance. It supports dozens of languages, escalating complex cases to humans while learning from interactions. This AI-native approach enabled rapid scaling without proportional headcount growth.

Results

  • 2/3 of all customer service chats handled by AI
  • 2.3 million conversations in first month alone
  • Resolution time: 11 minutes → 2 minutes (82% reduction)
  • CSAT: 4.4/5 (AI) vs. 4.2/5 (humans)
  • $40 million annual cost savings
  • Equivalent to 700 full-time human agents
  • 80%+ queries resolved without human intervention
Read case study →

NVIDIA

Manufacturing

In semiconductor manufacturing, chip floorplanning—the task of arranging macros and circuitry on a die—is notoriously complex and NP-hard. Even expert engineers spend months iteratively refining layouts to balance power, performance, and area (PPA), navigating trade-offs like wirelength minimization, density constraints, and routability. Traditional tools struggle with the explosive combinatorial search space, especially for modern chips with millions of cells and hundreds of macros, leading to suboptimal designs and delayed time-to-market. NVIDIA faced this acutely while designing high-performance GPUs, where poor floorplans amplify power consumption and hinder AI accelerator efficiency. Manual processes limited scalability for 2.7 million cell designs with 320 macros, risking bottlenecks in their accelerated computing roadmap. Overcoming human-intensive trial-and-error was critical to sustain leadership in AI chips.

Solution

NVIDIA deployed deep reinforcement learning (DRL) to model floorplanning as a sequential decision process: an agent places macros one-by-one, learning optimal policies via trial and error. Graph neural networks (GNNs) encode the chip as a graph, capturing spatial relationships and predicting placement impacts. The agent uses a policy network trained on benchmarks like MCNC and GSRC, with rewards penalizing half-perimeter wirelength (HPWL), congestion, and overlap. Proximal Policy Optimization (PPO) enables efficient exploration, transferable across designs. This AI-driven approach automates what humans do manually but explores vastly more configurations.

Results

  • Design Time: 3 hours for 2.7M cells vs. months manually
  • Chip Scale: 2.7 million cells, 320 macros optimized
  • PPA Improvement: Superior or comparable to human designs
  • Training Efficiency: Under 6 hours total for production layouts
  • Benchmark Success: Outperforms on MCNC/GSRC suites
  • Speedup: 10-30% faster circuits in related RL designs
Read case study →

JPMorgan Chase

Banking

In the high-stakes world of asset management and wealth management at JPMorgan Chase, advisors faced significant time burdens from manual research, document summarization, and report drafting. Generating investment ideas, market insights, and personalized client reports often took hours or days, limiting time for client interactions and strategic advising. This inefficiency was exacerbated post-ChatGPT, as the bank recognized the need for secure, internal AI to handle vast proprietary data without risking compliance or security breaches. The Private Bank advisors specifically struggled with preparing for client meetings, sifting through research reports, and creating tailored recommendations amid regulatory scrutiny and data silos, hindering productivity and client responsiveness in a competitive landscape.

Solution

JPMorgan addressed these challenges by developing the LLM Suite, an internal suite of seven fine-tuned large language models (LLMs) powered by generative AI, integrated with secure data infrastructure. This platform enables advisors to draft reports, generate investment ideas, and summarize documents rapidly using proprietary data. A specialized tool, Connect Coach, was created for Private Bank advisors to assist in client preparation, idea generation, and research synthesis. The implementation emphasized governance, risk management, and employee training through AI competitions and 'learn-by-doing' approaches, ensuring safe scaling across the firm. LLM Suite rolled out progressively, starting with proofs-of-concept and expanding firm-wide.

Results

  • Users reached: 140,000 employees
  • Use cases developed: 450+ proofs-of-concept
  • Financial upside: Up to $2 billion in AI value
  • Deployment speed: From pilot to 60K users in months
  • Advisor tools: Connect Coach for Private Bank
  • Firm-wide PoCs: Rigorous ROI measurement across 450 initiatives
Read case study →

Best Practices

Successful implementations follow proven patterns. Have a look at our tactical advice to get started.

Build a Claude-Powered Pre-Check for Every Expense Report

Start by inserting a Claude pre-check step before human approval in your existing expense workflow. Export the full report (line items, employee, cost centre, trip details) plus images or PDFs of receipts, and feed them into Claude as a single structured prompt. Ask Claude to evaluate policy compliance, flag potential duplicates and highlight missing documentation.

System role (example):
You are an internal finance control assistant for ACME Corp. 
You know and apply ACME's expense policy precisely.
You must:
- Check each line item against the policy
- Detect possible duplicate claims across the provided data
- Flag suspicious vendors or descriptions
- Rate overall risk: low / medium / high
- Explain every flag in plain business English.

User content (example structure):
{
  "employee": {...},
  "trip": {...},
  "expense_policy": "<full policy text or summary>",
  "line_items": [...],
  "receipts": [base64 or links]
}

Return a structured JSON summary that your workflow engine or expense tool can consume. This enables automatic routing: low-risk reports can be fast-tracked, while high-risk ones go to senior controllers with Claude’s comments attached.
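The routing step can be sketched in a few lines of Python. This is a minimal example under our own assumptions: the verdict fields (`overall_risk`, `flags`) and the queue names are illustrative, not a fixed Claude schema or a prescribed workflow API.

```python
import json

# Hypothetical shape of Claude's JSON verdict for one expense report.
SAMPLE_VERDICT = json.dumps({
    "overall_risk": "high",
    "flags": [
        {"type": "duplicate_receipt",
         "detail": "Same receipt image used on 2024-03-01 and 2024-03-08"},
        {"type": "unknown_vendor",
         "detail": "Vendor not found in supplier master"},
    ],
})

def route_report(verdict_json: str) -> str:
    """Map Claude's risk rating to a review queue in the workflow tool."""
    verdict = json.loads(verdict_json)
    # Fail closed: missing or unparseable risk data goes to a human.
    risk = verdict.get("overall_risk", "high")
    if risk == "low" and not verdict.get("flags"):
        return "fast_track"
    if risk == "low":
        return "standard_review"
    return "senior_controller_review"

print(route_report(SAMPLE_VERDICT))  # senior_controller_review
```

The key design choice is failing closed: any report whose verdict is missing, malformed or high-risk lands with a human controller, never in the fast-track lane.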

Use Image and Text Comparison to Catch Reused Receipts

Duplicate and fraudulent claims often rely on reused or manipulated receipts. To detect them, combine Claude’s text understanding with image fingerprinting from your internal stack. First, create a hash or similarity score for each uploaded receipt image and compare it against historical receipts to find likely duplicates or close matches.

Then pass the suspect pairs into Claude with explicit instructions to compare dates, amounts, vendors, line items and visual cues (such as logos or layout). Ask Claude to classify the pair as "likely duplicate", "possibly duplicate" or "different" and to explain its reasoning. This layered approach catches cases where employees slightly edit or crop receipts to circumvent naïve duplicate checks.

User prompt (example):
Compare the following two receipts and decide if they represent
(1) the same expense claimed twice,
(2) separate expenses with similar details, or
(3) unclear from the evidence.

Explain your reasoning in max 5 bullet points.

Receipt A OCR text:
...

Receipt B OCR text:
...
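The pre-filter that selects suspect pairs for Claude can start very simply. The sketch below uses a SHA-256 digest for exact file duplicates and a `difflib` ratio on OCR text as a lightweight stand-in for a real perceptual image hash; the threshold and normalisation are illustrative starting points, not tuned values.

```python
import hashlib
from difflib import SequenceMatcher

def receipt_fingerprint(image_bytes: bytes) -> str:
    """Exact-duplicate check: identical files share a digest."""
    return hashlib.sha256(image_bytes).hexdigest()

def ocr_similarity(text_a: str, text_b: str) -> float:
    """Near-duplicate check on OCR text; 1.0 means identical after
    whitespace and case normalisation."""
    def norm(t: str) -> str:
        return " ".join(t.lower().split())
    return SequenceMatcher(None, norm(text_a), norm(text_b)).ratio()

# Pairs above this threshold get escalated to Claude for a reasoned
# comparison; tune it against your own false-positive rate.
SUSPECT_THRESHOLD = 0.85

a = "Taxi Stuttgart  14.03.2024  EUR 23.50  Receipt #4711"
b = "Taxi Stuttgart 14.03.2024 EUR 23.50 Receipt #4711"
print(ocr_similarity(a, b) >= SUSPECT_THRESHOLD)  # True
```

Text similarity alone misses visually edited receipts, which is exactly why the suspect pairs are then handed to Claude with the visual cues described above.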

Cross-Check Expenses Against Internal Master Data

Many fake vendors and suspicious descriptions can be spotted by cross-checking claims against your internal master data. Create a service layer that exposes canonical lists of approved vendors, cost centres, projects and GL accounts. When sending data to Claude, include both the expense details and a snapshot of the relevant master data.

Prompt Claude to reconcile each claim: does the vendor appear in the list, is the cost centre consistent with the employee’s department, does the description plausibly match the GL account? Ask for a confidence score and short justification. This turns static master data into an active control mechanism without requiring a heavy rules engine implementation.

User prompt (example excerpt):
Here is our current vendor master list and cost centre structure.
Here are the expenses we want you to assess.

For each expense, answer:
- Is the vendor known? If not, why might that be risky?
- Is the cost centre plausible for this type of expense?
- Overall: OK, needs clarification, or likely policy breach.

Automate Policy Reasoning and Explanations

Claude excels at reading long policy documents and applying them consistently. Use this to convert your expense handbook and travel policy into an AI-enforced rulebook. Include the full policy text (or a curated summary) with each evaluation request, and ask Claude to cite specific sections or paragraphs when flagging an issue.

This not only improves control quality but also makes employee communication easier. When a claim is challenged, Claude can generate a short, polite explanation for the employee, referencing the relevant policy section. Controllers can then review and send, instead of drafting from scratch.

User prompt (example excerpt):
Based on the following ACME Expense Policy, review the expenses.

Policy:
<paste policy text>

For each violation, output:
- short_title
- explanation_for_controller
- explanation_for_employee (polite, reference policy section)

Score and Prioritise Anomalies for Human Review

To avoid overwhelming controllers with too many flags, design Claude outputs around risk scoring and prioritisation. For each claim or report, ask Claude to assign a risk level and to identify the 1–3 most critical anomalies to investigate first. Combine this with quantitative metrics (e.g. amount, frequency, employee history) in your own scoring logic.

In your workflow tool, use this combined score to drive SLAs and routing: high-risk claims must be reviewed within 24 hours by senior staff, while low-risk issues can wait or be sampled. Over time, analyse which Claude-flagged issues actually resulted in adjustments or rejections, and fine-tune prompts based on this feedback.

User prompt (example excerpt):
For the full report, give:
- overall_risk_score (1-100)
- top_3_risks: [ {type, severity, explanation} ]
- recommendation: approve / approve_with_comment / reject
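Combining Claude's rating with your own quantitative signals can look like the sketch below. The weights, thresholds and queue names are illustrative starting points we have made up for this example, not calibrated values; they should be tuned against your own historical outcomes.

```python
def combined_score(claude_score: int, amount: float,
                   flags_last_90d: int) -> int:
    """Blend Claude's 1-100 risk rating with simple quantitative signals."""
    score = 0.6 * claude_score
    score += 20 if amount > 1000 else 0   # large single claims
    score += min(flags_last_90d * 5, 20)  # repeat flags for this employee
    return min(int(score), 100)

def sla_queue(score: int) -> str:
    """Route by combined score, as described above."""
    if score >= 70:
        return "review_within_24h"   # senior staff, hard SLA
    if score >= 40:
        return "standard_queue"
    return "sampled_review"          # low risk: wait or sample

s = combined_score(claude_score=80, amount=1500.0, flags_last_90d=2)
print(s, sla_queue(s))  # 78 review_within_24h
```

Keeping the quantitative blend outside the prompt has a practical benefit: you can retune weights from audit data without touching the model interaction at all.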

Track KPIs and Iterate on Prompts Like a Product

Treat your Claude-based expense control as a product that needs ongoing optimisation. Define a few practical KPIs: percentage of reports automatically cleared as low risk, number of duplicates detected per 1,000 claims, average time saved per controller, and number of disputes where the employee successfully overturns a flag (false positives).

Review these metrics monthly. When you see too many false positives in a category, adjust prompts to be more conservative. When fraud cases slip through, add those as negative examples in your prompt or fine-tuning setup. Reruption’s AI PoC approach is built exactly around this loop: get a working prototype in production-like conditions, measure, refine, and then scale.
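One of these KPIs, flag precision (the share of Claude flags that led to an adjustment or rejection), can be computed straight from a log of controller decisions. A sketch with an assumed record shape and outcome labels, for illustration only:

```python
from collections import Counter

# One record per reviewed report; field names and labels are illustrative.
decisions = [
    {"claude_flagged": True,  "outcome": "rejected"},
    {"claude_flagged": True,  "outcome": "approved"},   # false positive
    {"claude_flagged": True,  "outcome": "adjusted"},
    {"claude_flagged": False, "outcome": "approved"},
]

flagged = [d for d in decisions if d["claude_flagged"]]
outcomes = Counter(d["outcome"] for d in flagged)
confirmed = outcomes["rejected"] + outcomes["adjusted"]

# Share of flags that actually led to a correction.
precision = confirmed / len(flagged)
print(f"flag precision: {precision:.0%}")  # flag precision: 67%
```

A falling precision in one category is the monthly signal to make prompts more conservative there; confirmed fraud that was never flagged becomes a negative example for the next prompt iteration.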

With a disciplined setup, finance teams typically see realistic outcomes such as a 30–50% reduction in manual review time for standard expense reports, a significant increase in detected duplicates and suspicious claims, and faster, better-documented approvals — all without changing their core ERP or expense platform.

Need implementation expertise now?

Let's talk about your ideas!

Frequently Asked Questions

How does Claude detect duplicate and fraudulent expense claims?

Claude analyses the full context of an expense report: line items, receipts, travel details, card transactions and your own expense policy. It looks for patterns such as identical or very similar receipt content used multiple times, inconsistent dates between trips and invoices, unusual vendor names, and spend that doesn’t fit typical behaviour for a given role or cost centre.

Technically, it combines natural language understanding of descriptions and policies with structured comparisons of amounts, dates and vendors. When integrated with your systems, it can also use master data (e.g. approved vendors) and historical claims to spot anomalies that a rules engine or manual spot checks would miss.

What do we need in-house to get started?

At minimum, you need access to your expense data (reports, receipts, card feeds), an integration layer (often a small internal API or automation tool) and someone who can own the process from the finance side. On the technical side, skills in backend development and basic cloud infrastructure are helpful to securely connect Claude to your existing systems.

Finance teams do not need to become AI experts. Your main contribution is to clearly define policies, edge cases and decision rules. Reruption typically partners technical staff with finance stakeholders, using our own engineering team to handle prompt design, data pipelines and security so your controllers can focus on validating outputs and refining policies.

How quickly can we expect results?

In our experience, a focused pilot for duplicate and fraudulent claims detection can be up and running in a few weeks, not months. Within the first 2–4 weeks, you can usually have a prototype that ingests real historical expense data, flags potential anomalies and provides explanations for controller review.

Meaningful results — such as a measurable increase in detected duplicates or a reduction in manual review time — often emerge within one or two accounting cycles. Reruption’s structured AI PoC format is designed exactly for this timeframe: we define the use case, build a working prototype, measure performance and outline a production plan, all within a compact project.

What ROI can we expect?

ROI comes from three main sources: prevented losses, saved time and improved control quality. Even in mid-sized organisations, low-level expense fraud and duplicate claims can quietly add up to significant annual amounts. Catching a fraction of these systematically often covers the AI running costs multiple times.

On the efficiency side, automating pre-checks and anomaly detection reduces manual review time per report, freeing controllers to focus on complex cases and analysis. There is also qualitative ROI: stronger internal controls, better audit readiness and more reliable spend data for strategic decisions. During a PoC, we typically quantify ROI with simple metrics like fraud/duplicate value detected, hours saved and manual checks reduced.

How does Reruption support implementation?

Reruption works as a Co-Preneur, embedding with your team to build real AI-powered finance workflows instead of just writing slide decks. We start with our 9.900€ AI PoC, where we scope your specific expense control challenges, design and build a Claude-based prototype that plugs into your data, and measure detection quality, speed and cost per run in your environment.

Beyond the PoC, we support you in hardening the solution: integrating with your ERP or expense tool, designing secure data flows, setting up monitoring and KPIs, and training controllers to work effectively with AI. Throughout, we bring deep engineering capabilities, an AI-first lens on your processes and a shared ownership mindset — acting like a co-founder of your internal AI product until it reliably catches duplicate and fraudulent claims at scale.

Contact Us!


Contact Directly

Your Contact

Philipp M. W. Hoffmann

Founder & Partner

Address

Reruption GmbH

Falkertstraße 2

70176 Stuttgart

Social Media