Medicare Rate Analysis

THE PROBLEM

Healthcare pricing is opaque by design

Hospitals set their own chargemasters. Medicare pays a fraction of what providers bill. But the gap between charges and payments varies wildly by state, procedure, and provider. For payers benchmarking commercial rates against Medicare, understanding this spread is the foundation of contracting strategy. The question is: where is the markup highest, and what drives it?

APPROACH

Two datasets, four notebooks, one story

I combined the CMS Medicare Inpatient Hospital PUF (145,742 provider-DRG records) with the Physician and Other Practitioners PUF (9.76M provider-HCPCS records) to build a complete picture of Medicare payment variation.

Inpatient

145,742 records

3,015 hospitals, 533 DRGs, 51 states. Provider-level charges, payments, and discharges.

Physician

9.76M records

1.15M NPIs, 6,326 HCPCS codes, 103 specialties. Submitted charges, allowed amounts, and payments.

Python, pandas, matplotlib, CMS Public Data, DRG Analysis, HCPCS Variation, Rate Benchmarking

FINDING 1

Septicemia drives 11% of all inpatient volume

The top 5 DRGs account for over 25% of all Medicare inpatient discharges. Septicemia alone represents 550,000 stays. Any value-based contract that doesn't address these conditions is leaving the highest-volume diagnoses unmanaged.

Top 15 DRGs by discharge volume, CY 2022
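A minimal sketch of how a volume share like this could be computed with pandas. The column names (`drg`, `discharges`) and the toy rows are assumptions for illustration, not the actual PUF schema or data.

```python
import pandas as pd

# Toy stand-in for provider-DRG records; labels and counts are made up.
inpatient = pd.DataFrame({
    "drg": ["A", "A", "B", "C", "D", "E"],
    "discharges": [300, 250, 200, 150, 60, 40],
})

def drg_volume_share(df: pd.DataFrame) -> pd.DataFrame:
    """Total discharges per DRG with share and cumulative share of all volume."""
    out = (
        df.groupby("drg", as_index=False)["discharges"].sum()
          .sort_values("discharges", ascending=False)
          .reset_index(drop=True)
    )
    out["share"] = out["discharges"] / out["discharges"].sum()
    out["cum_share"] = out["share"].cumsum()
    return out

shares = drg_volume_share(inpatient)
print(shares.head(5))
```

Reading `cum_share` at the fifth row gives the "top 5 DRGs" figure directly; the first row's `share` is the single-DRG figure.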

FINDING 2

Heart transplants cost $295K per case

The costliest DRGs are surgical and transplant-related. For commercial payers benchmarking against Medicare, these are the procedures where a 10% rate difference translates to $20K-$30K per case.

Top 15 costliest DRGs by average Medicare payment

FINDING 3

The 5.4x markup gap

Nationally, hospitals charge a median of 5.4x what Medicare pays. The scatter plot shows a nonlinear relationship: lower-cost procedures have the widest markup spread, while high-cost DRGs compress toward smaller ratios. The distribution is right-skewed, with a minority of hospitals charging 8-12x Medicare rates.

Submitted charges vs. Medicare payment

Distribution of inpatient markup ratios

Why this matters for payers

Chargemaster-based contracts systematically overpay relative to Medicare. Percent-of-Medicare benchmarks are more defensible and transparent for rate negotiations.
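A short worked example of why the two contract bases differ. If a contract pays a fraction of billed charges and the hospital's markup is charges divided by Medicare payment, the contract's Medicare-equivalent rate is the product of the two. The 50% discount below is a hypothetical figure; only the 5.4x median comes from the analysis.

```python
def percent_of_medicare(pct_of_charges: float, markup: float) -> float:
    """Convert a percent-of-charges rate into a multiple of the Medicare payment.

    pct_of_charges: contract rate as a fraction of billed charges (0.5 = 50%).
    markup: billed charges divided by Medicare payment for the same service.
    """
    if pct_of_charges < 0 or markup <= 0:
        raise ValueError("pct_of_charges must be >= 0 and markup must be > 0")
    return pct_of_charges * markup

national_median_markup = 5.4   # median markup reported in this analysis
hypothetical_discount = 0.50   # assumed contract rate: 50% of charges

equivalent = percent_of_medicare(hypothetical_discount, national_median_markup)
print(f"{equivalent:.0%} of Medicare")
```

Under these assumptions, a contract that looks like a deep discount off charges still pays well above the Medicare rate, which is the overpayment the percent-of-Medicare framing exposes.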

FINDING 4

Maryland 1.2x vs. Nevada 10.7x

The gap in median markup between Maryland (1.2x) and Nevada (10.7x) is 9.5 points, nearly a ninefold difference, and it is not random. Maryland's all-payer rate-setting system compresses the charge-to-payment gap by regulation. States without rate controls show far higher markups. California, Texas, and Florida combine high volume with above-median markups, making them priority markets for rate renegotiation.

Inpatient markup ratio by state with IQR whiskers

Discharge volume and average payment by state
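The state comparison can be sketched as a grouped median with an interquartile range. The column names (`state`, `markup`) are assumptions, and the six toy rows are invented so that the medians match the reported 1.2x and 10.7x; they are not real records.

```python
import pandas as pd

# Invented record-level markup ratios, three per state.
records = pd.DataFrame({
    "state": ["MD", "MD", "MD", "NV", "NV", "NV"],
    "markup": [1.1, 1.2, 1.3, 8.0, 10.7, 12.0],
})

def state_markup_summary(df: pd.DataFrame) -> pd.DataFrame:
    """Median markup per state plus the quartiles used for IQR whiskers."""
    grouped = df.groupby("state")["markup"]
    summary = pd.DataFrame({
        "median": grouped.median(),
        "q1": grouped.quantile(0.25),
        "q3": grouped.quantile(0.75),
        "n": grouped.size(),
    })
    summary["iqr"] = summary["q3"] - summary["q1"]
    return summary.sort_values("median", ascending=False)

summary = state_markup_summary(records)
print(summary)
```

Using the median rather than the mean keeps a few extreme chargemasters from dominating a state's figure, which matters given the right-skewed distribution.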

FINDING 5

Specialty markup varies 2x to 5x

On the physician side, volume and unit cost do not line up by specialty, and that mismatch is a strategic signal. Lab and radiology drive spend through utilization; surgery drives it through unit price. Different levers apply: utilization management for the former, rate negotiation for the latter.

Markup ratio by specialty (top 20)

Highest payment variation by procedure
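One way to separate the two spend drivers is to compute, per specialty, total services alongside a volume-weighted payment per service. This is a sketch under assumed column names (`specialty`, `services`, `avg_payment`); the rows are invented to show the pattern, not drawn from the PUF.

```python
import pandas as pd

# Invented provider-HCPCS rows: service counts and average payment per service.
physician = pd.DataFrame({
    "specialty": ["Clinical Laboratory", "Clinical Laboratory",
                  "Diagnostic Radiology", "Cardiac Surgery"],
    "services": [90_000, 60_000, 40_000, 500],
    "avg_payment": [10.0, 20.0, 50.0, 1500.0],
})

def specialty_profile(df: pd.DataFrame) -> pd.DataFrame:
    """Total services, total spend, and volume-weighted payment per service."""
    work = df.assign(spend=df["services"] * df["avg_payment"])
    out = work.groupby("specialty")[["services", "spend"]].sum()
    # Dividing summed spend by summed services weights each row by its volume.
    out["payment_per_service"] = out["spend"] / out["services"]
    return out.sort_values("spend", ascending=False)

profile = specialty_profile(physician)
print(profile)
```

In the toy output the laboratory rows lead on spend through sheer volume at a low unit price, while the surgical row has a far higher unit price on very few services.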

SO WHAT

Strategic implications

Finding	Implication
Septicemia = 11% of volume	Bundled payment and readmission reduction programs should prioritize sepsis pathways
Median markup = 5.4x	Chargemaster-based contracts overpay; percent-of-Medicare benchmarks are more defensible
MD vs NV gap = 9.5 points (1.2x vs 10.7x)	State regulatory environment is a first-order variable; multi-state payers need state-specific playbooks
High-variation HCPCS codes	Procedure-level benchmarking can surface outlier providers billing 3-5x peers
Volume-cost specialty mismatch	Utilization management for high-volume specialties; rate negotiation for high-cost ones

METHODS

How I built this

Data was downloaded directly from CMS public endpoints using a custom Python downloader. The inpatient file required Latin-1 encoding (not UTF-8). Both files were deduplicated on natural keys (CCN+DRG for inpatient, NPI+HCPCS+Place of Service for physician) with zero duplicates found. Physician data was filtered to US-only providers. A derived markup ratio (submitted charges / Medicare payment) was computed for both datasets. All analysis was done in four self-contained Jupyter notebooks with 14 figures saved at 150 DPI.
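The cleaning steps above can be sketched roughly as follows for the inpatient file. The column names (`ccn`, `drg`, `charges`, `payment`) and the in-memory CSV are assumptions standing in for the real download, and the zero-payment guard is my addition rather than a step described above. The toy data includes one duplicate row purely to exercise the check; the real files had none.

```python
import io
import pandas as pd

# Tiny stand-in for the downloaded CSV, encoded as Latin-1. The byte for "ñ"
# (0xF1) is not valid UTF-8 on its own, so a UTF-8 read of this would fail.
lines = [
    "ccn,drg,hospital,charges,payment",
    "010001,871,Hospital Espa\xf1ol,54000,10000",
    "010001,871,Hospital Espa\xf1ol,54000,10000",  # duplicate on ccn + drg
    "010002,470,General Hospital,30000,0",
]
raw = io.BytesIO("\n".join(lines).encode("latin-1"))

def load_inpatient(buf) -> pd.DataFrame:
    """Read, deduplicate on the natural key, and derive the markup ratio."""
    # Read IDs as strings so leading zeros in provider numbers survive.
    df = pd.read_csv(buf, encoding="latin-1", dtype={"ccn": str, "drg": str})
    n_before = len(df)
    df = df.drop_duplicates(subset=["ccn", "drg"])
    print(f"duplicates dropped: {n_before - len(df)}")
    # Markup = submitted charges / Medicare payment; NaN where payment is 0.
    df["markup"] = df["charges"] / df["payment"].where(df["payment"] > 0)
    return df

inpatient = load_inpatient(raw)
print(inpatient[["ccn", "drg", "markup"]])
```

The physician file would follow the same shape with `subset` set to the NPI, HCPCS, and place-of-service columns.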

Data Cleaning, Deduplication, Geographic Analysis, Rate Benchmarking, DRG/HCPCS Coding, Two-Dataset Merge

Tech Stack

Built with

CMS Inpatient and Physician PUFs, 10M+ records. Pandas‑driven aggregation, volume‑weighted state and procedure metrics, and a state‑drilldown Plotly.js choropleth.

Python, pandas, NumPy, Plotly.js, SQL, Jupyter, Claude

Medicare Rate & Utilization Analysis