atls№ 01 — May 2026

Agentic Trauma Life Support

Qwen2.5-VL-72B
in full BF16,
on a single AMD MI300X.

Agentic trauma-triage decision-support, in English or Bahasa Indonesia. The agentic AI realization of the Advanced Trauma Life Support primary survey, served from a single AMD Instinct MI300X.

AMD Developer HackathonBuilt by an emergency physician
§01The double meaning

A double meaning was deliberate. ATLS is the global trauma protocol from the American College of Surgeons — Advanced Trauma Life Support. This project — Agentic Trauma Life Support — is the agentic AI realization of that protocol’s primary survey.

§02Why this exists

Trauma is the leading cause of death between the ages of 1 and 44 worldwide. The Advanced Trauma Life Support primary survey — a structured walk through Airway, Breathing, Circulation, Disability, Exposure — works because someone trained walks it.

In rural emergency rooms and resource-limited casualty departments, that someone trained is often a junior clinician, a nurse on a phone link, or a referral chain that takes hours. ATLS — the project — is a structured, citation-backed triage assistant that walks the protocol on a chest X-ray and a brief clinical vignette, and produces a Pydantic-validated primary-survey JSON plus an SBAR handoff in the local language, in seconds.

§03Why a single MI300X

The 192 GB-class shortlist, drawn to scale.

Qwen2.5-VL-72B in full BF16 occupies roughly 145 GB of weights — the dashed line. Anything to the left of it does not fit on a single device.

1 × NVIDIA H10080GB
does not fit
1 × NVIDIA H200141GB
does not fit
2 × NVIDIA H100 NVLink160GB
fits · tensor-parallel≈ $4–5 / hr
1 × NVIDIA B200192GB
fits≈ $5–8 / hr
1 × AMD MI300X192GB
← chosen
fits≈ $1.99 / hr

The only sub-$2/hr option that fits 72B in BF16 on a single GPU.

§04The agentic pipeline

Three agents on the same model, on the same vLLM server.

01

Drafter

  • Reads X-ray, vitals, retrieved excerpts
  • Writes strict TriageOutput JSON
  • vLLM guided JSON · no prose
02

Verifier

  • Re-sees the original X-ray
  • Compares against the draft
  • Returns notes + path-walker patches
03

Renderer

  • Validated JSON in
  • SBAR markdown out
  • English or Bahasa Indonesia

Both model calls hit the same vLLM server. No model router. No second GPU.

§05Real benchmarks

Numbers from a real chest X-ray, against the live vLLM server.

TTFT (median)
1981ms

Single image · short prompt

Throughput
21.5tok / s

Single image · 3 k-token retrieved context

Peak VRAM
183.95/ 191.69GiB

Concurrent batch of four · 96 % of budget

n = 5 per scenario · real chest X-ray (case_01_tension_ptx) · vLLM 0.17.1 ROCm · single MI300X

§06Six demo cases

Five English. One Indonesian. One that has to refuse to invent.

case_01_tension_ptxEN

Tension pneumothorax

Drafter calls the right action — drama case

case_02_massive_htxEN

Massive hemothorax + shock

Verifier catches drafter laterality error

case_03_flail_chestEN

Flail chest

Verifier downgrades drafter over-call

case_04_pulm_contusionEN

Bilateral pulmonary contusion

Multi-panel image · ICU disposition

case_05_normal_polytraumaEN

Normal CXR · abdominal injury

The credibility test — model declined to invent

case_06_pediatric_idID

Pediatric blunt thoracic trauma

Multilingual rendering · Bahasa Indonesia

Imaging unremarkable. Recommend FAST and CT abdomen.

Case 05 · Normal CXR with hypotension · Model declined to invent

§07What this ships with