Key Facts

  • Company: NYU Langone Health
  • Company Size: 51,000+ employees, 370+ care sites, $8B+ annual revenue
  • Location: New York City, NY
  • AI Tool Used: NYUTron (custom 13B-parameter clinical LLM)
  • Outcome Achieved: AUROC improvements of 5-15% over benchmarks on 24+ predictive tasks; deployed at health system scale

The Challenge

NYU Langone Health, a leading academic medical center, faced significant hurdles in leveraging the vast amounts of unstructured clinical notes generated daily across its network. Traditional clinical predictive models relied heavily on structured data like lab results and vitals, but these required complex ETL processes that were time-consuming and limited in scope.[1] Unstructured notes, rich with nuanced physician insights, were underutilized due to challenges in natural language processing, hindering accurate predictions of critical outcomes such as in-hospital mortality, length of stay (LOS), readmissions, and operational events like insurance denials.[6]

Clinicians needed real-time, scalable tools to identify at-risk patients early, but existing models struggled with the volume and variability of EHR data: roughly 4.8 million inpatient notes spanning a decade. This gap led to reactive care, increased costs, and suboptimal patient outcomes, prompting the need for an innovative approach to transform raw text into actionable foresight.[2]

The Solution

To address these challenges, NYU Langone's Division of Applied AI Technologies at the Center for Healthcare Innovation and Delivery Science developed NYUTron, a proprietary large language model (LLM) trained specifically on internal clinical notes. Unlike off-the-shelf models, NYUTron was pre-trained on unstructured EHR text from millions of encounters and then fine-tuned per task, enabling it to serve as an all-purpose prediction engine for diverse tasks.[6]

The solution involved pre-training a 13-billion-parameter LLM on over 10 years of de-identified notes (approximately 4.8 million inpatient notes), followed by task-specific fine-tuning. This allowed seamless integration into clinical workflows, automating risk flagging directly from physician documentation without manual data structuring.[3] Collaborative efforts, including AI 'Prompt-a-Thons,' accelerated adoption by engaging clinicians in model refinement.[4]

Quantitative Results

  • AUROC: 0.961 for 48-hour mortality prediction (vs. 0.938 benchmark)
  • 92% accuracy in identifying high-risk patients from notes
  • LOS prediction AUROC: 0.891 (5.6% improvement over prior models)
  • Readmission prediction: AUROC 0.812, outperforming clinicians in some tasks
  • Operational predictions (e.g., insurance denial): AUROC up to 0.85
  • 24 clinical tasks with superior performance across mortality, LOS, and comorbidities
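The percentage improvements quoted in this case study can be read either as absolute AUROC points or as gains relative to the baseline; the source does not say which convention it uses. As a rough sketch of the arithmetic, using the 48-hour mortality figures above (the function names are illustrative and not part of NYUTron):

```python
def absolute_gain(model_auroc: float, baseline_auroc: float) -> float:
    """Difference between two AUROC values, in percentage points."""
    return (model_auroc - baseline_auroc) * 100


def relative_gain(model_auroc: float, baseline_auroc: float) -> float:
    """Improvement expressed as a percentage of the baseline AUROC."""
    return (model_auroc - baseline_auroc) / baseline_auroc * 100


# Figures from the 48-hour mortality bullet: 0.961 vs. a 0.938 benchmark.
model, baseline = 0.961, 0.938
print(f"absolute: {absolute_gain(model, baseline):.1f} points")
print(f"relative: {relative_gain(model, baseline):.2f}%")
```

Under either reading the mortality gain is a little over two percent, so the two conventions only diverge noticeably when the baseline AUROC is much lower.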

Implementation Details

Data Preparation and Model Pre-Training

Implementation began with curating a massive dataset of de-identified clinical notes from NYU Langone's EHR system, encompassing over 10 years of data and nearly 4.8 million inpatient notes. This dataset, totaling billions of tokens, was processed to train the foundational NYUTron LLM, a 13-billion-parameter model adapted from advanced architectures like GPT-J. Rigorous preprocessing ensured compliance with HIPAA and removed PII, enabling safe domain-specific pre-training on medical language.[6]

The pre-training phase focused on next-token prediction, allowing the model to internalize clinical narratives, jargon, and patterns unique to NYU Langone's workflows. This step was computationally intensive, leveraging high-performance GPUs at the health system's data center.[1]
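To make the next-token objective concrete, here is a minimal, framework-free sketch of the loss it minimizes: at each position the model scores every vocabulary entry, and the loss is the negative log-probability assigned to the token that actually follows. The toy vocabulary, the logits, and the function names are assumptions for illustration; they are not taken from NYUTron's training code.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_loss(logits_per_position, token_ids):
    """Mean cross-entropy of predicting token t+1 from the scores at position t."""
    losses = []
    for t in range(len(token_ids) - 1):
        probs = softmax(logits_per_position[t])
        target = token_ids[t + 1]
        losses.append(-math.log(probs[target]))
    return sum(losses) / len(losses)


# Hypothetical 4-word vocabulary and one tokenized note fragment.
vocab = ["patient", "denies", "chest", "pain"]
tokens = [0, 1, 2, 3]  # "patient denies chest pain"

# An untrained model scores every word equally at every position.
uniform_logits = [[0.0] * len(vocab) for _ in tokens]
loss_uniform = next_token_loss(uniform_logits, tokens)

# A better model puts a high score on the word that really comes next.
confident_logits = [
    [5.0 if i == tokens[t + 1] else 0.0 for i in range(len(vocab))]
    for t in range(len(tokens) - 1)
]
loss_confident = next_token_loss(confident_logits, tokens)

print(loss_uniform, loss_confident)
```

With uniform scores the loss equals the natural log of the vocabulary size; training pushes it down by shifting probability toward the words that actually follow in the notes.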

Fine-Tuning for Predictive Tasks

After pre-training, NYUTron was fine-tuned on 24 benchmarked clinical prediction tasks, including in-hospital mortality, prolonged LOS, 30-day readmission, and operational metrics such as payer-specific denial rates. Supervised fine-tuning used labeled outcomes from historical data, achieving AUROC scores 5-15% higher than traditional ML baselines and clinician judgments in blinded tests. For instance, 48-hour mortality prediction reached 0.961 AUROC.[6][2]
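Since AUROC is the headline metric throughout, a short sketch of what it measures may help: it is the probability that a randomly chosen positive case (for example, a patient who was readmitted) receives a higher risk score than a randomly chosen negative case, with ties counted as half. The labels and scores below are made-up illustrative values, not NYU Langone data.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive case ranked above the negative one
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical outcomes (1 = event occurred) and model risk scores.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.5, 0.1]
print(round(auroc(labels, scores), 3))
```

This pairwise form is quadratic in the number of cases, which is fine for illustration; production evaluation would normally use a sorted-rank implementation from an established library.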

Challenges like data imbalance were overcome via techniques such as synthetic data augmentation and ensemble methods, ensuring robustness across patient demographics.[3]
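The source does not describe which ensemble scheme was used. One common and simple option, shown here purely as an assumption, is to average the predicted probabilities of several independently fine-tuned models for each patient; the model names and values are hypothetical.

```python
from statistics import fmean


def ensemble_average(per_model_probs):
    """Average risk probabilities across models, patient by patient.

    per_model_probs holds one list per model, all in the same patient order.
    """
    expected = len(per_model_probs[0])
    if any(len(probs) != expected for probs in per_model_probs):
        raise ValueError("every model must score the same number of patients")
    return [fmean(column) for column in zip(*per_model_probs)]


# Hypothetical risk scores from three fine-tuned models for three patients.
model_a = [0.9, 0.2, 0.4]
model_b = [0.7, 0.4, 0.2]
model_c = [0.8, 0.3, 0.3]

combined = ensemble_average([model_a, model_b, model_c])
print([round(p, 2) for p in combined])
```

Averaging tends to dampen the variance of any single model, which can matter when the positive class (for example, in-hospital deaths) is rare and individual models overfit to a handful of cases.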

Deployment and Integration

NYUTron was deployed at health system scale via APIs integrated into Epic EHR, providing real-time risk scores embedded in clinician dashboards. A phased rollout began with pilot testing on high-volume units, followed by full production in 2023. The model's interpretability was enhanced with attention visualizations, aiding clinician trust.[5]

Ethical safeguards, including bias audits and human oversight loops, were implemented. Ongoing monitoring via the Division of Applied AI tracks performance drift, with retraining cycles planned quarterly.[1]
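A drift check of this kind can be as simple as comparing recent performance against the value measured at validation time. The sketch below is an assumption about how such a monitor might look, not a description of NYU Langone's tooling; the function name, the 0.02 tolerance, and the monthly figures are all hypothetical.

```python
def drift_alert(baseline_auroc, recent_aurocs, tolerance=0.02):
    """Return True when mean recent AUROC falls more than `tolerance` below baseline."""
    if not recent_aurocs:
        raise ValueError("need at least one recent AUROC measurement")
    recent_mean = sum(recent_aurocs) / len(recent_aurocs)
    return (baseline_auroc - recent_mean) > tolerance


# Baseline taken from the readmission figure quoted in this case study.
baseline = 0.812

# Hypothetical monthly AUROC measurements on newly labeled encounters.
stable_months = [0.810, 0.815, 0.800]
degrading_months = [0.800, 0.780, 0.760]

print(drift_alert(baseline, stable_months))     # small dip, within tolerance
print(drift_alert(baseline, degrading_months))  # sustained drop, flags for review
```

A flag from a check like this would typically trigger human review and, if confirmed, bring the next retraining cycle forward rather than retrain automatically.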

Overcoming Challenges

Key hurdles like computational costs and regulatory approval were addressed through internal cloud infrastructure and IRB approvals. Clinician buy-in was boosted via Prompt-a-Thons, where staff co-developed prompts for tasks like note summarization, expanding NYUTron's utility beyond predictions to documentation automation.[4] This phased approach—from research prototype to live deployment—took under 18 months, setting a benchmark for academic health systems.

Results

The deployment of NYUTron has transformed predictive analytics at NYU Langone, delivering superior performance across a spectrum of tasks. In blinded evaluations, it achieved an AUROC of 0.961 for 48-hour mortality, outpacing clinician accuracy by 4.3%, and 0.891 for prolonged LOS, a 5.6% uplift over prior state-of-the-art models. For 30-day readmissions, scores reached 0.812 AUROC, enabling proactive interventions that reduced unexpected returns.[6][2]

Operationally, NYUTron flags insurance denials with an AUROC of up to 0.85 and forecasts comorbidities, optimizing resource allocation across NYU Langone's 370+ sites. Early results show potential for 10-20% reductions in LOS for high-risk cohorts, translating to millions in savings and improved patient safety. Clinicians report 30% time savings on risk assessments, freeing more time for patient care.[3]

As of 2025, NYUTron is fully operational, powering daily decisions and inspiring extensions like precision education tools. Its success underscores the power of custom clinical LLMs, with ongoing expansions to outpatient notes and multimodal data, positioning NYU Langone as a leader in AI-driven healthcare.[1][5]
