TRIAGE-4B-P12-SFT-RL
SFT+RL checkpoints released as part of TRIAGE, a framework that casts clinical risk prediction over irregularly sampled medical time series (ISMTS) as a reasoning task.
The model was presented in the paper TRIAGE: Dialectical Reasoning for Explainable Risk Prediction on Irregularly Sampled Medical Time Series with LLMs.
TRIAGE trains an LLM to generate dialectical reasoning over competing clinical outcomes by eliciting outcome-specific rationales. This approach mitigates risk polarization and enables a single LLM to yield continuous risk scores grounded in explicit clinical reasoning. The checkpoints in this repository target the P12 dataset (PhysioNet Mortality Prediction Challenge 2012).
Code, data, & training pipeline: https://github.com/HyeongWon-Jang/TRIAGE
Model Description
The checkpoint for each data split is RL-trained on top of an SFT warm-start and stored in its own split_N/ subfolder. For each split, the released RL checkpoint is the one with the best validation AUPRC along the RL training trajectory.
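Selecting a checkpoint by validation AUPRC can be sketched as follows. This is a minimal illustration rather than the repository's actual selection code: the helper names, the step numbers, and the toy labels and scores are all made up, and AUPRC is computed here as step-wise average precision with no special handling of tied scores.

```python
from typing import Dict, Sequence, Tuple


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise AUPRC: mean of the precision at the rank of each positive."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("AUPRC is undefined without positive labels")
    # Rank examples from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this positive's rank
    return total / n_pos


def select_checkpoint(
    val_labels: Sequence[int],
    val_scores_by_step: Dict[int, Sequence[float]],
) -> Tuple[int, float]:
    """Return (step, AUPRC) of the checkpoint with the highest validation AUPRC."""
    auprc = {
        step: average_precision(val_labels, scores)
        for step, scores in val_scores_by_step.items()
    }
    best_step = max(auprc, key=auprc.get)
    return best_step, auprc[best_step]


# Toy validation data (illustrative only, not results from the paper).
toy_labels = [0, 0, 1, 0, 1]
toy_scores_by_step = {
    100: [0.2, 0.6, 0.5, 0.1, 0.4],
    200: [0.1, 0.3, 0.8, 0.2, 0.7],
    300: [0.1, 0.7, 0.8, 0.2, 0.3],
}
best_step, best_auprc = select_checkpoint(toy_labels, toy_scores_by_step)
print(best_step, best_auprc)
```

In this toy run, the checkpoint at step 200 ranks both positives above every negative, so it is the one selected.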
The SFT warm-start used to initialize this RL run lives on the rl_init branch of this same repo.
Quick start
from transformers import AutoModelForCausalLM, AutoTokenizer
split = "split_1" # one of split_1 ... split_5
repo = "Hyeongwon/TRIAGE-4B-P12-SFT-RL"
tokenizer = AutoTokenizer.from_pretrained(repo, subfolder=split)
model = AutoModelForCausalLM.from_pretrained(repo, subfolder=split, device_map="auto")
The model expects a task-specific input/output template; for the full inference pipeline, see the linked GitHub repository.
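The SFT warm-start on the rl_init branch can presumably be loaded the same way by passing the branch name through the `revision` argument of `from_pretrained`. The sketch below assumes that rl_init uses the same split_N/ subfolder layout as the default branch, which this card does not confirm. `load_kwargs` and `load_checkpoint` are hypothetical helpers introduced only for illustration.

```python
from typing import Dict, Optional

REPO = "Hyeongwon/TRIAGE-4B-P12-SFT-RL"


def load_kwargs(split: Optional[str] = None, revision: str = "main") -> Dict[str, str]:
    """Build from_pretrained keyword arguments (hypothetical helper)."""
    kwargs = {"revision": revision}  # branch, tag, or commit on the Hub
    if split is not None:
        # Assumption: checkpoints sit in split_N/ subfolders on this revision.
        kwargs["subfolder"] = split
    return kwargs


def load_checkpoint(split: Optional[str] = None, revision: str = "main"):
    """Load tokenizer and model for one split from the given revision."""
    # Imported lazily so that load_kwargs can be used without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    kwargs = load_kwargs(split, revision)
    tokenizer = AutoTokenizer.from_pretrained(REPO, **kwargs)
    model = AutoModelForCausalLM.from_pretrained(REPO, device_map="auto", **kwargs)
    return tokenizer, model


# Example (downloads weights, so it is left commented out):
# tokenizer, model = load_checkpoint(split="split_1", revision="rl_init")
```

If the rl_init branch turns out to store its checkpoints differently, adjust or drop the `subfolder` argument accordingly.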
Data
- Raw: PhysioNet Mortality Prediction Challenge 2012
- Processed (Raindrop, CC BY 4.0): figshare DOI
Further preprocessing and split-construction details are in the linked GitHub repository.
License
Model checkpoints are released under CC BY-NC 4.0 (non-commercial). Datasets remain under their respective licenses.
Base model
Qwen/Qwen3-4B-Base