The Challenge: Uncategorized Expense Entries

In many organisations, uncategorized expense entries have become a persistent headache for Finance. Employees submit expenses with vague descriptions, missing cost centers, or simply select “Other” to get their claims through. Finance teams then spend days chasing missing information, decoding receipts, and manually assigning GL accounts and project codes. Month-end closes stretch out, and no one fully trusts the spend reports.

Traditional approaches—policy PDFs, training sessions, and manual review—are no longer enough. As transaction volumes grow across travel, procurement, and subscriptions, Finance can’t scale headcount just to read receipts. ERP and expense tools help enforce some rules, but they struggle with free-text descriptions, mixed-language entries, and messy receipt photos. Static rules-based engines break whenever vendors change formats or employees get creative with descriptions.

The business impact is significant. Misposted costs distort margins by product, customer, and project. Budget owners see spend data too late to act. Controllers lose days of productive time on low-value classification work instead of analysis. Poor expense visibility undermines expense control, hides policy violations, and slows decision-making—especially critical when cash and profitability are under pressure.

The good news: this problem is very solvable with today’s AI. By combining Gemini’s multimodal understanding (text + images) with your financial coding logic, you can dramatically reduce uncategorized entries and manual touchpoints. At Reruption, we’ve helped organisations build AI-first workflows around complex document and data processing. Below, we’ll walk through practical, finance-ready ways to apply Gemini to expense classification—without disrupting your core ERP landscape.

Need a sparring partner for this challenge?

Let's have a no-obligation chat and brainstorm together.


Our Assessment

A strategic assessment of the challenge, with high-level tips on how to tackle it.

From Reruption’s work building AI-powered document and data workflows, we’ve seen that uncategorized expenses are almost never a tooling issue alone—they’re a process and design issue. Gemini is a powerful engine to interpret merchant names, texts, and receipt images, but its real value comes when you embed it into a clear expense control strategy, with the right guardrails around GL logic, cost centers, and approval flows.

Anchor Gemini in Your Finance Data Model, Not Just in the Expense Tool

Before you start sending receipts to Gemini, Finance needs a clear, documented data model: which GL accounts are allowed for which cost centers and projects, what typical spend patterns look like, and which categories are “high-risk” from a policy perspective. Without this backbone, even a strong AI model will produce inconsistent classifications.

Treat Gemini as an interpreter between messy real-world inputs and your structured chart of accounts. Define explicit mappings, priorities, and override rules that reflect how your controllers already think. This keeps AI outputs aligned with your accounting logic and reduces pushback from auditors and local finance teams.
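To make this concrete, the mappings and override rules can live as plain data that both the prompt and the post-processing layer reference, so accounting logic always wins over a plausible-sounding model suggestion. A minimal Python sketch; all account codes, cost centers, and merchant patterns here are invented for illustration:

```python
# Hypothetical encoding of Finance coding rules as data, shared between the
# Gemini prompt and the post-processing guardrails. Codes are illustrative.

ALLOWED_ACCOUNTS = {
    "CC_102_MKT_DE": {"6130_TRAVEL_HOTEL", "6140_INTERNAL_TRAVEL", "6200_SOFTWARE"},
    "CC_201_ENG_US": {"6140_INTERNAL_TRAVEL", "6200_SOFTWARE"},
}

# Override rules run AFTER the model's suggestion, mirroring how controllers
# already think: explicit exceptions first, then the allowed-account check.
OVERRIDE_RULES = [
    # (predicate on the expense dict, forced account code)
    (lambda e: "uber" in e["merchant"].lower() and "client" in e["description"].lower(),
     "6145_CLIENT_VISIT_TRAVEL"),
]

def apply_guardrails(expense, suggested_account):
    """Return the account to post, enforcing mappings and override rules."""
    for predicate, forced_account in OVERRIDE_RULES:
        if predicate(expense):
            return forced_account
    allowed = ALLOWED_ACCOUNTS.get(expense["cost_center"], set())
    if suggested_account in allowed:
        return suggested_account
    return "UNCATEGORIZED"  # fall back to manual review rather than mispost

expense = {"merchant": "Uber", "description": "Ride to client meeting",
           "cost_center": "CC_102_MKT_DE"}
print(apply_guardrails(expense, "6140_INTERNAL_TRAVEL"))  # → 6145_CLIENT_VISIT_TRAVEL
```

Keeping the rules as data (rather than burying them in prompt text) also makes them reviewable by auditors and local finance teams.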

Start with a Narrow, High-Volume Use Case

Rather than “AI for all expenses”, start with one narrow but high-volume area: for example, travel expenses (flights, hotels, taxis) or software subscriptions. In these domains, merchant patterns and descriptions are relatively consistent, making it easier for Gemini to learn and for Finance to validate results.

This focus lets you set clear success metrics (e.g. “reduce uncategorized travel expenses from 25% to <5% in three months”) and gather feedback from a smaller group of employees and approvers. Once the pilot is stable and trusted, you can extend the same patterns to other spend types.

Design for Human-in-the-Loop, Not Full Automation on Day One

For Finance leaders, the biggest risk is not that Gemini misclassifies a taxi ride—it’s losing control over the process. To manage adoption and risk, design a human-in-the-loop workflow first: Gemini proposes categories, cost centers, and policy flags; Finance (or managers) review and accept, edit, or reject.

This approach builds trust and gives you labelled feedback data to improve Gemini’s prompts and fine-tuning. Over time, you can set confidence thresholds—e.g. auto-accept when confidence > 0.9, route to review between 0.6–0.9, and block or escalate below 0.6—so automation grows where the model has proven reliable.

Prepare Your Team and Policies for AI-Augmented Expense Management

Introducing AI-driven expense classification changes who does what in the process. Controllers spend less time coding expenses and more time defining rules, testing samples, and monitoring anomalies. Employees get faster feedback on policy breaches. To avoid resistance, make these shifts explicit.

Update your expense policy with a short section on AI support: what Gemini does, how categories are suggested, what overrides are allowed, and how data is used. Train Finance staff on reading AI outputs, understanding confidence scores, and interpreting edge cases. The more comfortable they are, the faster they will lean into automation instead of bypassing it.

Build for Auditability, Traceability, and Compliance from Day One

For Finance, a powerful model is useless if it’s a black box. From the beginning, ensure that your Gemini integration stores the input (sanitised where needed), the suggested classification, the reasoning (where technically possible via prompts), the confidence score, and the final human decision.

This traceability gives auditors comfort and lets you run periodic back-testing: How often does Gemini match final postings? In which categories does it struggle? Combined with clear data protection and access controls, this helps you meet internal controls, compliance, and works council requirements while still reaping the efficiency gains.

Used deliberately, Gemini can turn uncategorized expense entries from a chronic irritant into a largely automated, auditable workflow. The key is not just calling an API, but aligning the model with your chart of accounts, policies, and people. At Reruption, we specialise in building these AI-first finance processes end-to-end—from proof-of-concept to production-grade integrations. If you want to see what Gemini could do on your real expense data, we’re ready to help you test it in a focused, low-risk way.

Need help implementing these ideas?

Feel free to reach out to us with no obligation.

Real-World Case Studies

From Human Resources to Banking: Learn how companies successfully use AI.

Unilever

Human Resources

Unilever, a consumer goods giant handling 1.8 million job applications annually, struggled with a manual recruitment process that was extremely time-consuming and inefficient. Traditional methods took up to four months to fill positions, overburdening recruiters and delaying talent acquisition across its global operations. The process also risked unconscious biases in CV screening and interviews, limiting workforce diversity and potentially overlooking qualified candidates from underrepresented groups. High volumes made it impossible to assess every applicant thoroughly, leading to costs estimated in the millions annually and inconsistent hiring quality. Unilever needed a scalable, fair system to streamline early-stage screening while maintaining psychometric rigor.

Solution

Unilever adopted an AI-powered recruitment funnel, partnering with Pymetrics for neuroscience-based gamified assessments that measure cognitive, emotional, and behavioral traits via ML algorithms trained on diverse global data. This was followed by AI-analyzed video interviews using computer vision and NLP to evaluate body language, facial expressions, tone of voice, and word choice objectively. Applications were anonymized to minimize bias, with AI shortlisting the top 10–20% of candidates for human review, integrating psychometric ML models for personality profiling. The system was piloted in high-volume entry-level roles before global rollout.

Results

  • Time-to-hire: 90% reduction (4 months to 4 weeks)
  • Recruiter time saved: 50,000 hours
  • Annual cost savings: £1 million
  • Diversity hires increase: 16% (incl. neuro-atypical candidates)
  • Candidates shortlisted for humans: 90% reduction
  • Applications processed: 1.8 million/year
Read case study →

Mastercard

Payments

In the high-stakes world of digital payments, card-testing attacks emerged as a critical threat to Mastercard's ecosystem. Fraudsters deploy automated bots to probe stolen card details through micro-transactions across thousands of merchants, validating credentials for larger fraud schemes. Traditional rule-based and machine learning systems often detected these only after initial tests succeeded, allowing billions in annual losses and disrupting legitimate commerce. The subtlety of these attacks—low-value, high-volume probes mimicking normal behavior—overwhelmed legacy models, exacerbated by fraudsters' use of AI to evade patterns. As transaction volumes exploded post-pandemic, Mastercard faced mounting pressure to shift from reactive to proactive fraud prevention. False positives from overzealous alerts led to declined legitimate transactions, eroding customer trust, while sophisticated attacks like card-testing evaded detection in real-time. The company needed a solution to identify compromised cards preemptively, analyzing vast networks of interconnected transactions without compromising speed or accuracy.

Solution

Mastercard's Decision Intelligence (DI) platform integrated generative AI with graph-based machine learning to revolutionize fraud detection. Generative AI simulates fraud scenarios and generates synthetic transaction data, accelerating model training and anomaly detection by mimicking rare attack patterns that real data lacks. Graph technology maps entities like cards, merchants, IPs, and devices as interconnected nodes, revealing hidden fraud rings and propagation paths in transaction graphs. This hybrid approach processes signals at unprecedented scale, using gen AI to prioritize high-risk patterns and graphs to contextualize relationships. Implemented via Mastercard's AI Garage, it enables real-time scoring of card compromise risk, alerting issuers before fraud escalates. The system combats card-testing by flagging anomalous testing clusters early. Deployment involved iterative testing with financial institutions, leveraging Mastercard's global network for robust validation while ensuring explainability to build issuer confidence.

Results

  • 2x faster detection of potentially compromised cards
  • Up to 300% boost in fraud detection effectiveness
  • Doubled rate of proactive compromised card notifications
  • Significant reduction in fraudulent transactions post-detection
  • Minimized false declines on legitimate transactions
  • Real-time processing of billions of transactions
Read case study →

Khan Academy

Education

Khan Academy faced the monumental task of providing personalized tutoring at scale to its 100 million+ annual users, many in under-resourced areas. Traditional online courses, while effective, lacked the interactive, one-on-one guidance of human tutors, leading to high dropout rates and uneven mastery. Teachers were overwhelmed with planning, grading, and differentiation for diverse classrooms. In 2023, as AI advanced, educators grappled with hallucinations and over-reliance risks in tools like ChatGPT, which often gave direct answers instead of fostering learning. Khan Academy needed an AI that promoted step-by-step reasoning without cheating, while ensuring equitable access as a nonprofit. Scaling safely across subjects and languages posed technical and ethical hurdles.

Solution

Khan Academy developed Khanmigo, an AI-powered tutor and teaching assistant built on GPT-4, piloted in March 2023 for teachers and expanded to students. Unlike generic chatbots, Khanmigo uses custom prompts to guide learners Socratically—prompting questions, hints, and feedback without direct answers—across math, science, humanities, and more. The nonprofit approach emphasized safety guardrails, integration with Khan's content library, and iterative improvements via teacher feedback. A partnership with Microsoft enabled free access for teachers worldwide by 2024, and Khanmigo is now available in 34+ languages. Ongoing updates, such as 2025 math computation enhancements, address accuracy challenges.

Results

  • User Growth: 68,000 (2023-24 pilot) to 700,000+ (2024-25 school year)
  • Teacher Adoption: Free for teachers in most countries, millions using Khan Academy tools
  • Languages Supported: 34+ for Khanmigo
  • Engagement: Improved student persistence and mastery in pilots
  • Time Savings: Teachers save hours on lesson planning and prep
  • Scale: Integrated with 429+ free courses in 43 languages
Read case study →

PayPal

Fintech

PayPal processes millions of transactions hourly, facing rapidly evolving fraud tactics from cybercriminals using sophisticated methods like account takeovers, synthetic identities, and real-time attacks. Traditional rules-based systems struggle with false positives and fail to adapt quickly, leading to financial losses exceeding billions annually and eroding customer trust if legitimate payments are blocked. The scale amplifies challenges: with 10+ million transactions per hour, detecting anomalies in real-time requires analyzing hundreds of behavioral, device, and contextual signals without disrupting user experience. Evolving threats like AI-generated fraud demand continuous model retraining, while regulatory compliance adds complexity to balancing security and speed.

Solution

PayPal implemented deep learning models for anomaly and fraud detection, leveraging machine learning to score transactions in milliseconds by processing over 500 signals including user behavior, IP geolocation, device fingerprinting, and transaction velocity. Models use supervised and unsupervised learning for pattern recognition and outlier detection, continuously retrained on fresh data to counter new fraud vectors. Integration with H2O.ai's Driverless AI accelerated model development, enabling automated feature engineering and deployment. This hybrid AI approach combines deep neural networks for complex pattern learning with ensemble methods, reducing manual intervention and improving adaptability. Real-time inference blocks high-risk payments pre-authorization, while low-risk ones proceed seamlessly.

Results

  • 10% improvement in fraud detection accuracy on AI hardware
  • $500M fraudulent transactions blocked per quarter (~$2B annually)
  • AUROC score of 0.94 in fraud models (H2O.ai implementation)
  • 50% reduction in manual review queue
  • Processes 10M+ transactions per hour with <0.4ms latency
  • <0.32% fraud rate on $1.5T+ processed volume
Read case study →

JPMorgan Chase

Banking

In the high-stakes world of asset management and wealth management at JPMorgan Chase, advisors faced significant time burdens from manual research, document summarization, and report drafting. Generating investment ideas, market insights, and personalized client reports often took hours or days, limiting time for client interactions and strategic advising. This inefficiency was exacerbated post-ChatGPT, as the bank recognized the need for secure, internal AI to handle vast proprietary data without risking compliance or security breaches. The Private Bank advisors specifically struggled with preparing for client meetings, sifting through research reports, and creating tailored recommendations amid regulatory scrutiny and data silos, hindering productivity and client responsiveness in a competitive landscape.

Solution

JPMorgan addressed these challenges by developing the LLM Suite, an internal suite of seven fine-tuned large language models (LLMs) powered by generative AI, integrated with secure data infrastructure. This platform enables advisors to draft reports, generate investment ideas, and summarize documents rapidly using proprietary data. A specialized tool, Connect Coach, was created for Private Bank advisors to assist in client preparation, idea generation, and research synthesis. The implementation emphasized governance, risk management, and employee training through AI competitions and 'learn-by-doing' approaches, ensuring safe scaling across the firm. LLM Suite rolled out progressively, starting with proofs-of-concept and expanding firm-wide.

Results

  • Users reached: 140,000 employees
  • Use cases developed: 450+ proofs-of-concept
  • Financial upside: Up to $2 billion in AI value
  • Deployment speed: From pilot to 60K users in months
  • Advisor tools: Connect Coach for Private Bank
  • Firm-wide PoCs: Rigorous ROI measurement across 450 initiatives
Read case study →

Best Practices

Successful implementations follow proven patterns. Have a look at our tactical advice to get started.

Use Multimodal Inputs: Combine Merchant, Text, and Receipt Images

Gemini’s strength for Finance is its ability to interpret multiple input types together. For each expense line, send merchant name, transaction amount, currency, employee free-text description, and a receipt image (where available). This helps Gemini distinguish, for example, between a hotel’s restaurant and room charges, or between personal and business items on the same receipt.

Structure your API payload to keep financial metadata separate from raw text, so prompts can explicitly reference them. Where your expense tool already extracts some OCR data from receipts, pass both the original image and the OCR text so Gemini can correct or enrich it.

Example Gemini prompt template (conceptual):
You are an expense classification assistant for the Finance department.
You receive: merchant, amount, date, employee description, and a receipt image.

Tasks:
1. Propose the most likely expense category (GL account) from this list: [...].
2. Propose cost center and project, if clearly inferable.
3. Flag potential policy violations (e.g. weekend spend, first-class travel).
4. Return a JSON object with fields: category_code, cost_center, project,
   policy_flags[], confidence_score, reasoning.

Now classify the following expense:
MERCHANT: {{merchant}}
AMOUNT: {{amount}} {{currency}}
DATE: {{date}}
DESCRIPTION: {{description}}
RECEIPT_IMAGE: <binary or URL>
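To make the template above concrete, here is a hedged Python sketch of how the request parts could be assembled, keeping financial metadata separate from free text as described. The part ordering, field names, and the SDK call in the closing comment are assumptions for illustration, not a fixed Gemini API contract:

```python
# Assemble multimodal content parts for a classification request.
# Part ordering (prompt, metadata, OCR text, image) is an illustrative choice.
import base64

PROMPT_HEADER = (
    "You are an expense classification assistant for the Finance department.\n"
    "Classify the expense below and return a JSON object with fields:\n"
    "category_code, cost_center, project, policy_flags, confidence_score, reasoning."
)

def build_parts(expense, receipt_bytes=None, ocr_text=None):
    """Return a list of content parts: prompt text first, receipt image last."""
    metadata = "\n".join([
        f"MERCHANT: {expense['merchant']}",
        f"AMOUNT: {expense['amount']} {expense['currency']}",
        f"DATE: {expense['date']}",
        f"DESCRIPTION: {expense['description']}",
    ])
    parts = [PROMPT_HEADER, metadata]
    if ocr_text:
        # Pass existing OCR output alongside the image so the model can correct it.
        parts.append(f"OCR_TEXT (may contain errors, verify against the image):\n{ocr_text}")
    if receipt_bytes:
        parts.append({"mime_type": "image/jpeg",
                      "data": base64.b64encode(receipt_bytes).decode()})
    return parts

# With Google's SDK, a list like this could then be passed to a call such as
#   genai.GenerativeModel("gemini-1.5-pro").generate_content(parts)
# (left as a comment because the exact client setup depends on your stack).
```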

Standardise Outputs with JSON Schemas and Confidence Thresholds

For reliable downstream posting, enforce a strict JSON schema for Gemini responses. Define required fields (e.g. category_code, confidence_score), permissible values (e.g. list of GL accounts), and default behaviours when the model is unsure.

On the consuming side (your middleware or expense tool), implement confidence thresholds. For example, if confidence_score >= 0.9, auto-apply the classification; if 0.7–0.9, route to a controller’s review queue with Gemini’s reasoning pre-filled; if < 0.7, keep the line as “uncategorized” but attach the suggestion for faster manual handling.

Expected Gemini response structure:
{
  "category_code": "6130_TRAVEL_HOTEL",
  "cost_center": "CC_102_MKT_DE",
  "project": "PRJ_4567_CAMPAIGN_Q4",
  "policy_flags": ["WEEKEND", "NO_APPROVAL_FOUND"],
  "confidence_score": 0.92,
  "reasoning": "Merchant is a hotel chain, date matches conference…"
}
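The thresholds described above can be enforced in a few lines of middleware. This is a minimal Python sketch assuming a JSON response of the structure shown; the 0.9 and 0.7 cut-offs are the illustrative values from the text, to be tuned against your own back-testing:

```python
# Validate the model response and route the expense line by confidence.
import json

REQUIRED_FIELDS = {"category_code", "confidence_score"}

def route(response_json):
    """Return 'auto_post', 'review_queue', or 'uncategorized'."""
    data = json.loads(response_json)
    if not REQUIRED_FIELDS <= data.keys():
        return "uncategorized"      # schema violation: never auto-post
    score = float(data["confidence_score"])
    if score >= 0.9:
        return "auto_post"
    if score >= 0.7:
        return "review_queue"       # controller sees the reasoning pre-filled
    return "uncategorized"          # suggestion attached for faster manual handling

print(route('{"category_code": "6130_TRAVEL_HOTEL", "confidence_score": 0.92}'))  # prints "auto_post"
```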

Integrate Gemini into Existing Expense and ERP Workflows

Instead of building a separate tool, embed Gemini classification into the workflows your employees and controllers already use. Typical patterns include a middleware service between the expense app and ERP, or an API extension inside the expense management system.

Implementation sequence could look like this: (1) Employee submits an expense as usual; (2) Webhook triggers a Gemini API call with all inputs; (3) Middleware writes back suggested category, cost center, and flags into custom fields; (4) Approver or controller sees the pre-filled suggestions and can accept or adjust; (5) Final data posts to the ERP. This minimises change management and speeds up adoption.
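As a sketch of that sequence, the middleware's core handler might look like the following; `expense_tool` and `gemini` are placeholder objects standing in for your actual integrations, and only the orchestration logic is shown:

```python
# Orchestration for steps (1)-(3); steps (4)-(5) stay inside the tools your
# approvers and controllers already use.

def handle_expense_submitted(event, expense_tool, gemini):
    """Webhook handler: fetch the expense, classify it, write back suggestions."""
    expense = expense_tool.fetch(event["expense_id"])        # (1) submitted as usual
    suggestion = gemini.classify(expense)                    # (2) Gemini call with all inputs
    expense_tool.write_custom_fields(event["expense_id"], {  # (3) write back to custom fields
        "suggested_category": suggestion["category_code"],
        "suggested_cost_center": suggestion.get("cost_center"),
        "policy_flags": suggestion.get("policy_flags", []),
    })
    return suggestion  # (4)-(5): approver reviews, then the tool posts to the ERP
```

Keeping the handler this thin means the expense tool and ERP remain the systems of record, which is what makes the pattern low-risk to adopt.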

Continuously Retrain Prompts with Feedback from Finance

Set up a simple loop where Finance feedback directly improves Gemini’s performance. Whenever a controller overrides a category or cost center, log both Gemini’s suggestion and the final choice. Regularly sample these overrides and update your prompt instructions (e.g. “for <merchant> in Germany, prefer cost center X unless description mentions Y”).

Even without model fine-tuning, iterative prompt refinement can move accuracy from “roughly helpful” to “good enough to automate most cases”. Schedule monthly prompt reviews with Finance and your technical team to adjust rules, add new merchant patterns, and refine policy checks.

Prompt refinement snippet:
Previous errors: Taxi rides for client visits were coded as "office travel".
Update instruction:
- If description contains words like "client", "customer", "meeting" and
  merchant type is taxi/ride-sharing, classify under
  6145_CLIENT_VISIT_TRAVEL instead of 6140_INTERNAL_TRAVEL.
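One lightweight way to run this feedback loop is to log every override as a (suggested, final) pair and surface the most frequent corrections for the monthly prompt review. A minimal Python sketch, with a CSV file standing in for whatever store your middleware already uses:

```python
# Log controller overrides and summarise the most frequent corrections.
import csv
from collections import Counter

def log_override(path, merchant, suggested, final):
    """Append one (merchant, suggested, final) row per controller decision."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([merchant, suggested, final])

def top_corrections(path, n=10):
    """Most common (merchant, suggested -> final) patterns, i.e. prompt-rule candidates."""
    counts = Counter()
    with open(path, newline="") as f:
        for merchant, suggested, final in csv.reader(f):
            if suggested != final:  # only actual overrides count
                counts[(merchant, suggested, final)] += 1
    return counts.most_common(n)
```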

Use Gemini for Policy Violation Detection and Anomaly Flags

Beyond basic categorisation, use Gemini to flag policy violations and anomalies in real time. For example, detect first-class or business-class fares from the ticket image, weekend or holiday expenses, or duplicate receipts submitted across different reports.

Design prompts that explicitly ask Gemini to reason about context: time of day, location vs. employee office, unusual amounts relative to typical spend for that merchant. Route flagged items into a specific queue with clear labels so controllers can quickly decide whether to approve, reject, or request more information.

Example policy check snippet:
In addition to classification, check for:
- Travel class (economy vs business/first) from the ticket or receipt.
- Weekend or public holiday dates.
- Multiple similar receipts in a short time window.
Return a "policy_flags" array with reasons, e.g. [
  "BUSINESS_CLASS_FLIGHT",
  "WEEKEND_EXPENSE"
]

Define KPIs and Dashboards to Track Impact

To prove value, track a small set of expense automation KPIs before and after Gemini deployment. Common metrics include: percentage of uncategorized lines, average time from submission to posting, manual touch rate per expense, and reclassification rate after monthly close.

Visualise these KPIs in your existing BI or ERP reporting, ideally by department and country. This helps you see where the model performs well, where additional training is needed, and where process issues (not AI) are causing delays. Use these insights to prioritise future improvements and justify further investment in AI-enabled Finance workflows.

Implemented pragmatically, Finance teams typically see 30–60% fewer uncategorized entries within the first 2–3 months, a 20–40% reduction in manual expense coding effort, and materially faster visibility into spend by cost center and project—enough to make a noticeable difference in month-end closing and budget steering.

Need implementation expertise now?

Let's talk about your ideas!

Frequently Asked Questions

How does Gemini help with uncategorized expense entries?

Gemini combines text and image understanding to interpret what an expense really represents. It reads merchant names, employee descriptions, transaction amounts, and even receipt photos, then maps them to your predefined expense categories, cost centers, and projects.

Instead of employees guessing a category or choosing “Other”, Gemini proposes a concrete GL account and coding suggestion in real time. Finance can review and adjust these suggestions, and over time you can automate high-confidence cases, dramatically reducing the number of uncategorized lines that reach controllers.

What team and skills do we need to get started?

You typically need three capabilities: (1) a Finance lead who understands your chart of accounts and expense policies, (2) a technical owner who can work with APIs or your expense tool’s integration layer, and (3) someone to handle security, data protection, and access control.

You do not need a large data science team to get started. A small cross-functional squad—Finance, IT, and one engineer—can set up a first Gemini-based classification pilot in a few weeks, especially if you use existing middleware or iPaaS tools as the integration backbone.

How quickly will we see results?

For a focused use case (e.g. travel expenses), organisations usually see tangible results within 4–8 weeks. In the first weeks, you’ll configure prompts, wire up the API, and run Gemini in suggestion-only mode alongside your current process.

Once Finance is comfortable with the quality of suggestions, you can start auto-applying high-confidence classifications. At that point, you should see a clear reduction in uncategorized entries and manual coding time. Expanding to additional spend categories is typically faster, because you can reuse the same technical foundation and only adapt the financial logic.

What does it cost, and what ROI can we expect?

Cost has two components: (1) Gemini usage costs (API calls, which are typically low per transaction) and (2) your one-time integration and change effort. For most mid-sized and larger organisations, the main driver is internal or partner implementation time, not the model usage cost.

ROI comes from reduced manual classification work, faster month-end closing, fewer reclassifications, and better visibility into spend (enabling concrete savings actions in travel, procurement, and subscriptions). In practice, even a modest reduction of a few FTE-days per month plus avoided mispostings and better spend steering usually outweighs the ongoing cost of running Gemini.

How can Reruption help us implement this?

Reruption supports you end-to-end, from idea to working solution. With our AI PoC offering (9,900€), we first validate that Gemini can reliably classify your real expense data: we define the use case, build a prototype that connects to sample receipts and ledger data, and measure accuracy, speed, and cost per run.

Beyond the PoC, our Co-Preneur approach means we embed with your Finance and IT teams, design the target workflow, implement the Gemini integration, and help set up KPIs, monitoring, and governance. We don’t stop at slides—we build and ship the actual automation inside your P&L so that uncategorized expenses become the exception, not the norm.

Contact Us!


Contact Directly

Your Contact

Philipp M. W. Hoffmann

Founder & Partner

Address

Reruption GmbH

Falkertstraße 2

70176 Stuttgart

Social Media