The Challenge: Manual Narrative Commentary

Every reporting cycle, finance teams are pulled into the same grind: extracting numbers from ERP and spreadsheets, hunting down drivers, and then manually drafting pages of variance explanations and management commentary. Analysts copy last quarter’s text, tweak a few numbers, and hope they haven’t missed a material change hidden in a pivot table. The result is a slow, fragile process that depends heavily on individual heroes and late nights.

Traditional approaches – Excel comments, Word templates, and email chains – simply don’t scale with today’s reporting complexity. As data volumes grow and stakeholders demand more granular insights, manually stitching together commentary from multiple sources breaks down. Even with business intelligence tools, the final mile of reporting – turning data into coherent narrative – is still handled in PowerPoint and Word by highly qualified finance staff doing copy-paste work.

The business impact is significant. Reporting cycles stretch from days into weeks, delaying decisions and frustrating leadership. Analysts spend more time writing around the numbers than analysing them, which means root causes and risks can be missed. Leaders challenge the commentary in meetings because it feels generic, lacks clear drivers, or is inconsistent across units and periods. Over time, this erodes confidence in finance and keeps the function stuck in a reporting role instead of becoming a strategic partner.

The good news: this is a highly solvable problem. Modern large language models like Claude can read complex tables, compare periods, and generate precise narrative commentary that reflects your own policies and tone. At Reruption, we’ve seen how the right AI setup can turn days of manual writing into a structured workflow that runs in hours – without losing control or quality. The rest of this page walks through how to approach this transformation and what to watch out for when you bring AI into your financial reporting stack.

Need a sparring partner for this challenge?

Let's have a no-obligation chat and brainstorm together.


Our Assessment

A strategic assessment of the challenge and high-level tips on how to tackle it.

From Reruption’s perspective, Claude for financial narrative commentary is one of the most effective entry points for automating finance work. Its strength with long documents and tables makes it well suited to turn ERP exports, management reports and spreadsheet tabs into draft variance explanations at scale. Based on our hands-on experience building AI solutions for complex, document-heavy processes, the key is not just using Claude, but embedding it into a clear reporting workflow with the right guardrails, prompts and review steps.

Frame Narrative Automation as an Extension of Your Control Framework

Automating commentary is not just a writing shortcut; it touches your financial controls, materiality thresholds and sign-off processes. Before you roll out Claude, define where AI is allowed to operate: which reports, which sections, and what level of judgement it can apply. For example, Claude can draft descriptions of variances, but final interpretation and tone should remain with a human reviewer in the first phase.

Work with Controlling, Accounting and Internal Audit to codify these boundaries. Treat Claude as part of your reporting control environment: define input data sources, review steps, and evidence you will keep (e.g. prompts and outputs stored alongside the report). This framing calms stakeholder concerns and ensures you don’t trade speed for governance.

Start with a Narrow, High-Repetition Reporting Use Case

To build confidence, begin with a specific, repetitive area where manual narrative commentary clearly slows you down: for example, monthly P&L commentary by cost center, or revenue variance explanations for a single business unit. These are areas where your team already follows a de facto template, even if it lives in people’s heads and old PowerPoints.

Use this pilot to learn how Claude handles your chart of accounts, typical variance drivers, and preferred wording. Keep the first scope deliberately narrow, but end-to-end: from data extraction to final human approval. Once that loop is working reliably, scale to additional entities, periods, and report types.

Invest Early in Data Preparation, Not Just Prompt Design

Claude is powerful, but it cannot fix messy inputs. If your financial tables, ERP exports and spreadsheets are inconsistent, the model will struggle to produce reliable commentary. Strategically, it’s worth investing in a thin data preparation layer that standardises column names, measures, and structures across entities and periods before anything reaches Claude.

This doesn’t require a full data warehouse project. Simple steps like defining a common layout for P&L and balance sheet exports, consistent naming for business units, and standard variance thresholds will dramatically improve output quality. Reruption typically designs this layer alongside prompt logic so that finance does not become dependent on IT backlogs.

Prepare Your Finance Team to Think in “Tasks for AI”, Not “Jobs for Humans”

Introducing Claude into reporting changes how your finance team thinks about their work. Strategically, you should help analysts break their jobs into discrete tasks that AI can support: summarise variances, normalise one-offs, compare against budget, suggest narrative structure, etc. This mindset shift turns AI from a black box into an assistant they can orchestrate.

Invest a few focused sessions to show team members how prompts work, how to critique AI outputs, and how to turn their own heuristics into instructions for Claude. When analysts learn to decompose “write commentary” into a chain of smaller AI-supported tasks, adoption increases and resistance drops.

Define Clear Success Metrics and a Risk Playbook Upfront

From a strategic perspective, using Claude for automated financial reporting should be measured like any other investment. Define concrete KPIs before you start: reduction in cycle time, analyst hours saved per reporting round, share of commentary first-drafted by AI, and error rates in narrative vs numbers. Track these against a baseline to build an evidence-based business case, not just anecdotes.

At the same time, agree on a simple risk playbook: in which cases do you revert to manual commentary? What types of errors are acceptable (stylistic) vs critical (misstated drivers)? How will you monitor the system? Having these boundaries written down gives leadership comfort and allows you to experiment with Claude without jeopardising trust in your numbers.

Used with the right governance and data preparation, Claude can turn manual narrative commentary from a bottleneck into a fast, controlled workflow. Finance teams stay in charge of judgement while the model handles the heavy lifting of reading tables, comparing periods and drafting well-structured explanations. Reruption combines deep engineering with a Co-Preneur mindset to design and embed these workflows directly into your reporting cycle; if you want to explore what this could look like in your organisation, we’re ready to help you test it in a focused, low-risk setup.

Need help implementing these ideas?

Feel free to reach out to us with no obligation.

Real-World Case Studies

From Healthcare to Banking: Learn how companies successfully use AI.

Kaiser Permanente

Healthcare

In hospital settings, adult patients on general wards often experience clinical deterioration without adequate warning, leading to emergency transfers to intensive care, increased mortality, and preventable readmissions. Kaiser Permanente Northern California faced this issue across its network, where subtle changes in vital signs and lab results went unnoticed amid high patient volumes and busy clinician workflows. This resulted in elevated adverse outcomes, including higher-than-necessary death rates and 30-day readmissions. Traditional early warning scores like MEWS (Modified Early Warning Score) were limited by manual scoring and poor predictive accuracy for deterioration within 12 hours, failing to leverage the full potential of electronic health record (EHR) data. The challenge was compounded by alert fatigue from less precise systems and the need for a scalable solution across 21 hospitals serving millions.

Solution

Kaiser Permanente developed the Advance Alert Monitor (AAM), an AI-powered early warning system using predictive analytics to analyze real-time EHR data—including vital signs, labs, and demographics—to identify patients at high risk of deterioration within the next 12 hours. The model generates a risk score and automated alerts integrated into clinicians' workflows, prompting timely interventions like physician reviews or rapid response teams. Implemented since 2013 in Northern California, AAM employs machine learning algorithms trained on historical data to outperform traditional scores, with explainable predictions to build clinician trust. It was rolled out hospital-wide, addressing integration challenges through Epic EHR compatibility and clinician training to minimize fatigue.

Results

  • 16% lower mortality rate in AAM intervention cohort
  • 500+ deaths prevented annually across network
  • 10% reduction in 30-day readmissions
  • Identifies deterioration risk within 12 hours with high reliability
  • Deployed in 21 Northern California hospitals
Read case study →

Pfizer

Healthcare

The COVID-19 pandemic created an unprecedented urgent need for new antiviral treatments, as traditional drug discovery timelines span 10-15 years with success rates below 10%. Pfizer faced immense pressure to identify potent, oral inhibitors targeting the SARS-CoV-2 3CL protease (Mpro), a key viral enzyme, while ensuring safety and efficacy in humans. Structure-based drug design (SBDD) required analyzing complex protein structures and generating millions of potential molecules, but conventional computational methods were too slow, consuming vast resources and time. Challenges included limited structural data early in the pandemic, high failure risks in hit identification, and the need to run processes in parallel amid global uncertainty. Pfizer's teams had to overcome data scarcity, integrate disparate datasets, and scale simulations without compromising accuracy, all while traditional wet-lab validation lagged behind.

Solution

Pfizer deployed AI-driven pipelines leveraging machine learning (ML) for SBDD, using models to predict protein-ligand interactions and generate novel molecules via generative AI. Tools analyzed cryo-EM and X-ray structures of the SARS-CoV-2 protease, enabling virtual screening of billions of compounds and de novo design optimized for binding affinity, pharmacokinetics, and synthesizability. By integrating supercomputing with ML algorithms, Pfizer streamlined hit-to-lead optimization, running parallel simulations that identified PF-07321332 (nirmatrelvir) as the lead candidate. This lightspeed approach combined ML with human expertise, reducing iterative cycles and accelerating from target validation to preclinical nomination.

Results

  • Drug candidate nomination: 4 months vs. typical 2-5 years
  • Computational chemistry processes reduced: 80-90%
  • Drug discovery timeline cut: From years to 30 days for key phases
  • Clinical trial success rate boost: Up to 12% (vs. industry ~5-10%)
  • Virtual screening scale: Billions of compounds screened rapidly
  • Paxlovid efficacy: 89% reduction in hospitalization/death
Read case study →

Associated Press (AP)

News Media

In the mid-2010s, the Associated Press (AP) faced significant constraints in its business newsroom due to limited staff resources. With only a handful of journalists dedicated to earnings coverage, AP could produce only around 300 earnings reports per quarter, primarily focusing on major S&P 500 companies. This manual process was labor-intensive: reporters had to extract data from financial filings, analyze key metrics like revenue, profits, and growth rates, and craft concise narratives under tight deadlines. As the number of publicly traded companies grew, AP struggled to cover smaller firms, leaving vast amounts of market-relevant information unreported. This limitation not only reduced AP's comprehensive market coverage but also tied up journalists on rote tasks, preventing them from pursuing investigative stories or deeper analysis. The pressure of quarterly earnings seasons amplified these issues, with deadlines coinciding across thousands of companies, making scalable reporting impossible without innovation.

Solution

To address this, AP partnered with Automated Insights in 2014, implementing their Wordsmith NLG platform. Wordsmith uses templated algorithms to transform structured financial data—such as earnings per share, revenue figures, and year-over-year changes—into readable, journalistic prose. Reporters input verified data from sources like Zacks Investment Research, and the AI generates draft stories in seconds, which humans then lightly edit for accuracy and style. The solution involved creating custom NLG templates tailored to AP's style, ensuring stories sounded human-written while adhering to journalistic standards. This hybrid approach—AI for volume, humans for oversight—overcame quality concerns. By 2015, AP announced it would automate the majority of U.S. corporate earnings stories, scaling coverage dramatically without proportional staff increases.

Results

  • 14x increase in quarterly earnings stories: 300 to 4,200
  • Coverage expanded to 4,000+ U.S. public companies per quarter
  • Equivalent to freeing time of 20 full-time reporters
  • Stories published in seconds vs. hours manually
  • Zero reported errors in automated stories post-implementation
  • Sustained use expanded to sports, weather, and lottery reports
Read case study →

Upstart

Banking

Traditional credit scoring relies heavily on FICO scores, which evaluate only a narrow set of factors like payment history and debt utilization, often rejecting creditworthy borrowers with thin credit files, non-traditional employment, or education histories that signal repayment ability. This results in up to 50% of potential applicants being denied despite low default risk, limiting lenders' ability to expand portfolios safely. Fintech lenders and banks faced the dual challenge of regulatory compliance under fair lending laws while seeking growth. Legacy models struggled with inaccurate risk prediction amid economic shifts, leading to higher defaults or conservative lending that missed opportunities in underserved markets. Upstart recognized that incorporating alternative data could unlock lending to millions previously excluded.

Solution

Upstart developed an AI-powered lending platform using machine learning models that analyze over 1,600 variables, including education, job history, and bank transaction data, far beyond FICO's 20-30 inputs. Their gradient boosting algorithms predict default probability with higher precision, enabling safer approvals. The platform integrates via API with partner banks and credit unions, providing real-time decisions and fully automated underwriting for most loans. This shift from rule-based to data-driven scoring ensures fairness through explainable AI techniques like feature importance analysis. Implementation involved training models on billions of repayment events, continuously retraining to adapt to new data patterns.

Results

  • 44% more loans approved vs. traditional models
  • 36% lower average interest rates for borrowers
  • 80% of loans fully automated
  • 73% fewer losses at equivalent approval rates
  • Adopted by 500+ banks and credit unions by 2024
  • 157% increase in approvals at same risk level
Read case study →

Duke Health

Healthcare

Sepsis is a leading cause of hospital mortality, affecting over 1.7 million Americans annually with a 20-30% mortality rate when recognized late. At Duke Health, clinicians faced the challenge of early detection amid subtle, non-specific symptoms mimicking other conditions, leading to delayed interventions like antibiotics and fluids. Traditional scoring systems like qSOFA or NEWS suffered from low sensitivity (around 50-60%) and high false alarms, causing alert fatigue in busy wards and EDs. Additionally, integrating AI into real-time clinical workflows posed risks: ensuring model accuracy on diverse patient data, gaining clinician trust, and complying with regulations without disrupting care. Duke needed a custom, explainable model trained on its own EHR data to avoid vendor biases and enable seamless adoption across its three hospitals.

Solution

Duke's Sepsis Watch is a deep learning model leveraging real-time EHR data (vitals, labs, demographics) to continuously monitor hospitalized patients and predict sepsis onset 6 hours in advance with high precision. Developed by the Duke Institute for Health Innovation (DIHI), it triggers nurse-facing alerts (Best Practice Advisories) only when risk exceeds thresholds, minimizing fatigue. The model was trained on Duke-specific data from 250,000+ encounters, achieving AUROC of 0.935 at 3 hours prior and 88% sensitivity at low false positive rates. Integration via Epic EHR used a human-centered design, involving clinicians in iterations to refine alerts and workflows, ensuring safe deployment without overriding clinical judgment.

Results

  • AUROC: 0.935 for sepsis prediction 3 hours prior
  • Sensitivity: 88% at 3 hours early detection
  • Reduced time to antibiotics: 1.2 hours faster
  • Alert override rate: <10% (high clinician trust)
  • Sepsis bundle compliance: Improved by 20%
  • Mortality reduction: Associated with 12% drop in sepsis deaths
Read case study →

Best Practices

Successful implementations follow proven patterns. Have a look at our tactical advice to get started.

Standardise Your Reporting Inputs for Claude

Before you ask Claude to write commentary, ensure your inputs are consistent. Export P&L, balance sheet and cash flow statements from your ERP in a standardised, table-based layout: the same column order, metric names and sign conventions every period. Where possible, enrich these exports with additional fields (e.g. cost center owner, segment, region) so Claude has context for explanations.

Then, wrap these exports with a short textual header that explains the report type, period, and comparison baseline (e.g. vs. budget, vs. last year). This gives Claude both structured and unstructured cues to work with, improving the accuracy of its variance analysis.
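As an illustration, a thin preparation layer can be a short script run before each cycle. The sketch below is a minimal example assuming pandas and CSV exports; the ERP column names in the mapping are invented placeholders to replace with your own.

import pandas as pd

# Hypothetical mapping from entity-specific ERP headers to one common layout
COLUMN_MAP = {
    "GL Acct Desc": "Line Item",
    "Amt LC": "Amount",
    "Per": "Period",
    "Scen": "Scenario",
}

def standardise_export(path: str, entity: str) -> pd.DataFrame:
    # Load one ERP export and normalise it to the shared layout.
    df = pd.read_csv(path).rename(columns=COLUMN_MAP)
    df["Entity"] = entity  # enrich with context Claude can use in explanations
    return df[["Entity", "Line Item", "Period", "Scenario", "Amount"]]

def with_header(df: pd.DataFrame, report_type: str, period: str, baseline: str) -> str:
    # Prepend the short textual header described above to the table text.
    header = (
        f"Report type: {report_type}\n"
        f"Period: {period}\n"
        f"Comparison baseline: {baseline}\n\n"
    )
    return header + df.to_csv(index=False)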

Use a Modular Prompt Template for Variance Commentary

Instead of writing a new prompt every month, define a reusable template for variance analysis and narrative generation. Structure the prompt in modules: role, task, data description, style guidelines, and output format. Here is an example pattern for monthly P&L commentary:

You are a senior finance analyst preparing monthly management commentary.

Task:
- Analyse the provided P&L tables for the current period vs. comparison period.
- Identify the top 5 positive and top 5 negative variances by absolute value and by %.
- For each, explain the main driver based on line item, segment and region.
- Distinguish between structural effects (e.g. headcount changes) and one-offs.

Data:
- You will receive P&L data as tables exported from our ERP.
- Column "Period" indicates current vs comparison.
- Column "Scenario" indicates Actual, Budget, Forecast.

Style:
- Write in concise, neutral management language.
- Avoid speculation; only use drivers that can be inferred from the data.
- Use bullet points for variance lists, then a short narrative summary (max 300 words).

Output format:
1) Short executive summary (max 5 sentences).
2) Bullet list of key variances with numbers.
3) Narrative commentary section suitable for our monthly report.

Here is the data:
[PASTE TABLES HERE]

Store this template in your reporting playbook or internal tool so analysts use a consistent approach each period. Over time, refine it with your own terminology and recurring drivers.
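If you run this through the API rather than copy-paste, the same modular structure can live in code so every analyst assembles an identical prompt. A small sketch, with the module texts abbreviated and the names our own:

# Hypothetical store of prompt modules; keep the full texts in your playbook
MODULES = {
    "role": "You are a senior finance analyst preparing monthly management commentary.",
    "task": "Analyse the provided P&L tables for the current period vs. comparison period. ...",
    "data": "You will receive P&L data as tables exported from our ERP. ...",
    "style": "Write in concise, neutral management language. ...",
    "output": "1) Short executive summary. 2) Bullet list of key variances. 3) Narrative commentary.",
}

def build_prompt(tables_csv: str) -> str:
    # Assemble the modules in a fixed order and append the period's data.
    parts = [MODULES[key] for key in ("role", "task", "data", "style", "output")]
    return "\n\n".join(parts + ["Here is the data:", tables_csv])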

Chain Tasks: From Raw Data to Final Commentary

For reliable results, break the workflow into distinct steps instead of asking Claude to “do everything at once”. A robust chain for automated financial reporting narratives could look like this:

Step 1 – Data check: Ask Claude to verify that periods, currencies, and totals reconcile, and to flag obvious inconsistencies.

First, review the tables and check:
- Do total revenue and total expenses add up correctly?
- Are the periods and currencies consistent?
- Are there any missing or duplicate lines?

Respond with a short diagnostic summary before you start any commentary.

Step 2 – Variance extraction: Have Claude produce a structured list of significant variances above a defined threshold (e.g. >5% and >€50k).

Step 3 – Narrative drafting: Feed this structured variance list back into Claude with a second prompt focused purely on writing, not analysis.

Using ONLY the validated variance list below, write management commentary as described earlier. Do not invent new variances or drivers.

This chaining approach reduces hallucinations and makes it easier for analysts to review each step.
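To make the chain concrete, here is a minimal Python sketch of Steps 2 and 3, assuming pandas and the Anthropic Python SDK. Note one deliberate swap: the variance extraction is done deterministically in pandas rather than by Claude, which keeps the numbers exact. The model name, column names and thresholds are placeholders to adapt to your own setup.

import anthropic
import pandas as pd

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
MODEL = "claude-sonnet-4-20250514"  # placeholder; pin your approved model version

def extract_variances(df: pd.DataFrame) -> pd.DataFrame:
    # Step 2: structured variance list above the agreed thresholds.
    # Assumes the table is already filtered to the current period and has
    # "Line Item", "Scenario" (Actual/Budget) and "Amount" columns.
    pivot = df.pivot_table(index="Line Item", columns="Scenario",
                           values="Amount", aggfunc="sum")
    pivot["Variance"] = pivot["Actual"] - pivot["Budget"]
    pivot["Variance %"] = pivot["Variance"] / pivot["Budget"].abs()
    significant = (pivot["Variance"].abs() > 50_000) & (pivot["Variance %"].abs() > 0.05)
    return pivot[significant].sort_values("Variance")

def draft_commentary(variances: pd.DataFrame) -> str:
    # Step 3: writing-only prompt fed with the validated variance list.
    prompt = (
        "Using ONLY the validated variance list below, write management "
        "commentary as described earlier. Do not invent new variances or drivers.\n\n"
        + variances.to_csv()
    )
    response = client.messages.create(
        model=MODEL,
        max_tokens=1500,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text

Analysts then review the drafted text against the variance table before it enters the report.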

Implement Human-in-the-Loop Review with Checklists

Claude should draft, not decide. Build a simple review checklist for analysts so the human-in-the-loop step is systematic, not ad hoc. The checklist might include: verify numbers vs source report, confirm that all material variances are covered, check that one-offs are clearly labelled, and align tone with corporate guidelines.

You can even ask Claude to generate a suggested checklist from your reporting policies:

You are a reporting quality controller. Based on the following internal reporting guidelines, create a 10-point checklist to review monthly variance commentary.

[PASTE YOUR GUIDELINES]

Embed this checklist into your reporting workflow tool or close process so commentary is always signed off against the same criteria.

Fine-Tune Language and Tone with Style Snippets

Many finance teams want commentary that “sounds like us”. Collect a few examples of high-quality narratives from previous reports and turn them into style snippets. Feed these to Claude as examples so it can mimic your preferred tone, structure and phrasing.

You are writing in the style of our existing management reports.
Here are 3 examples of good commentary. Learn the tone, structure and phrasing:

[EXAMPLE 1]
[EXAMPLE 2]
[EXAMPLE 3]

Now, using the variance list below, write new commentary in the same style.

Update these examples periodically to reflect new leadership preferences or changes in reporting focus (e.g. stronger emphasis on cash or ESG).
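In an API workflow, the style examples can be concatenated into the prompt programmatically so they are never forgotten. A minimal sketch (the function and variable names are ours):

def build_style_prompt(examples: list[str], variance_list: str) -> str:
    # Few-shot prompt: prior commentary as style examples, then the new task.
    shots = "\n\n".join(f"[EXAMPLE {i + 1}]\n{text}" for i, text in enumerate(examples))
    return (
        "You are writing in the style of our existing management reports.\n"
        f"Here are {len(examples)} examples of good commentary. "
        "Learn the tone, structure and phrasing:\n\n"
        f"{shots}\n\n"
        "Now, using the variance list below, write new commentary in the same style.\n\n"
        + variance_list
    )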

Log Prompts and Outputs for Auditability and Continuous Improvement

For finance, auditability and traceability are essential. Implement a simple logging mechanism that stores prompts, inputs (tables), and Claude’s outputs together with the final, human-approved version. This can be as lightweight as a dedicated SharePoint/Drive structure or as integrated as a custom internal tool.

Review these logs quarterly to identify common edits analysts make to AI drafts. Feed those patterns back into your prompt templates or data preparation rules. Over time, this continuous improvement loop will increase the share of commentary that is “right first time” and reduce the review burden.
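At its simplest, such a log can be a few lines of Python writing one JSON file per run to a shared folder. A sketch with invented field names – extend it to whatever evidence your auditors expect:

import datetime
import hashlib
import json
import pathlib

LOG_DIR = pathlib.Path("reporting_logs")  # e.g. a synced SharePoint/Drive folder

def log_run(period: str, prompt: str, input_tables: str,
            ai_output: str, approved_text: str) -> pathlib.Path:
    # Store prompt, inputs and outputs next to the human-approved version.
    LOG_DIR.mkdir(exist_ok=True)
    stamp = datetime.datetime.now(datetime.timezone.utc).isoformat(timespec="seconds")
    entry = {
        "timestamp": stamp,
        "period": period,
        "input_sha256": hashlib.sha256(input_tables.encode("utf-8")).hexdigest(),
        "prompt": prompt,
        "ai_output": ai_output,
        "approved_text": approved_text,
    }
    path = LOG_DIR / f"{period}_{stamp.replace(':', '-')}.json"
    path.write_text(json.dumps(entry, indent=2, ensure_ascii=False), encoding="utf-8")
    return path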

Executed in this way, using Claude for automated narrative commentary in finance can realistically cut drafting time by 50–70%, shorten reporting cycles by 1–3 days, and free up analyst capacity for deeper analysis – while keeping control, compliance and auditability firmly in place.

Need implementation expertise now?

Let's talk about your ideas!

Frequently Asked Questions

How reliable is Claude at drafting financial narrative commentary?

Claude is very strong at reading tables, comparing periods and producing coherent commentary, but it must work within a controlled process. In our experience, when it receives clean, standardised P&L and balance sheet exports plus a clear prompt, 70–80% of the draft narrative is usable with only light edits. The remaining 20–30% typically requires adjustment for nuances, internal language and business context.

Crucially, Claude should not be treated as a replacement for finance judgement. It should draft explanations based on data you provide, and your analysts remain responsible for validating numbers, drivers and tone. With a human-in-the-loop review and clear thresholds for materiality, teams can gain speed without compromising quality or control.

What team and skills do we need to get started?

You don’t need a full data science team to start. The core requirements are: a finance lead who understands your reporting process, someone who can access and standardise ERP/spreadsheet exports, and basic technical support to integrate Claude into your existing tools (e.g. via API, internal web app, or even structured copy-paste workflows).

On the skills side, your analysts should learn how to write and refine prompts, how to review AI outputs critically, and how to break the job of “writing commentary” into smaller AI-supported tasks. Reruption typically supports clients by designing the prompts, the data preparation layer and the workflow, so finance can operate the system without depending on a large IT project.

How quickly can we expect to see results?

For a focused use case such as monthly P&L commentary for a single business unit, you can usually see tangible benefits within one or two reporting cycles. A well-scoped pilot can be designed, prototyped and tested in a matter of weeks, not months, especially if your ERP exports are already available in a consistent format.

The first cycle is typically used to set up prompts, refine inputs and validate outputs alongside your existing manual process. By the second or third cycle, many teams are comfortable letting Claude draft the first version of commentary, with analysts focusing on review and deeper analysis. Broader rollout across entities and report types can then follow based on these early learnings.

What is the ROI of automating narrative commentary?

The direct ROI comes from reducing the time your finance team spends on low-value writing work. For many organisations, analysts and controllers spend several person-days per month drafting and updating commentary. With Claude handling the initial draft, teams often recover 50–70% of that time, which can be redirected to scenario analysis, forecasting and business partnering.

There are also indirect benefits: shorter reporting cycles, more consistent commentary across units, fewer last-minute corrections, and better insight quality for management. When you factor in these gains, the cost of running Claude – whether via API or an integrated tool – is typically small compared to the value of the hours and decision quality you gain back.

How can Reruption help us implement this?

Reruption works as a Co-Preneur alongside your finance team to turn the idea of AI-generated narrative commentary into a working solution. Our AI PoC offering (9,900€) is designed to test your specific use case quickly: we define the reporting scope, design prompts and workflows, connect to your existing data exports, and deliver a functioning prototype that generates commentary for real periods.

Beyond the PoC, we help you embed the solution into your close and reporting cycle: designing the data preparation layer, integrating Claude into your existing tools, and setting up governance, logging and review processes. Because we operate with entrepreneurial ownership and deep engineering capability, we don’t stop at decks – we stay until your finance team has a reliable, repeatable AI-supported reporting workflow in production.

Contact Us!


Contact Directly

Your Contact

Philipp M. W. Hoffmann

Founder & Partner

Address

Reruption GmbH

Falkertstraße 2

70176 Stuttgart
