URL: https://huggingface.co/madrisight/MadriMed-VL-2B-enc


MadriMed-VL-2B-enc

A 2B-parameter medical vision-language model, trained for medical image understanding, radiology report assistance, clinical visual question answering, and medical text reasoning.

This release introduces Dynamic LoRA Scaling, a lightweight calibration technique that reduces adapter dominance while preserving the medical knowledge learned during fine-tuning. The objective is to improve reliability, reduce hallucinations, and mitigate common diagnostic confusion patterns observed in earlier releases of madrisight/MadriMed-VL-2B.
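The card does not spell out how Dynamic LoRA Scaling is computed. As a rough illustration of the general idea of damping an adapter's contribution, the sketch below merges a standard LoRA update, W' = W + s * (alpha / r) * B A, with an extra factor s between 0 and 1. The `scale` parameter, the helper names, and all numbers are hypothetical and are not taken from the MadriMed training code.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_a, lora_b, alpha, rank, scale=1.0):
    """Return w + scale * (alpha / rank) * (lora_b @ lora_a)."""
    delta = matmul(lora_b, lora_a)
    factor = scale * alpha / rank
    return [
        [w_ij + factor * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy 2x2 base weight with a rank-1 adapter (all values made up).
w = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]        # shape (rank, in_features)
lora_b = [[0.5], [0.25]]     # shape (out_features, rank)

full = merge_lora(w, lora_a, lora_b, alpha=2, rank=1, scale=1.0)
damped = merge_lora(w, lora_a, lora_b, alpha=2, rank=1, scale=0.5)
print(full)    # adapter applied at full strength
print(damped)  # adapter contribution halved
```

With `scale=0.5` the merged weights sit halfway between the base model and the fully adapted model, which is one simple way an adapter's dominance could be reduced.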

Important: Use bfloat16 for Inference

This model was trained and calibrated in bfloat16 (BF16) precision on PyTorch MPS. Run inference in bfloat16 as well for best performance and reproducibility.
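One way to request bfloat16 at load time is sketched below. The helper names `select_device` and `bf16_load_kwargs` are hypothetical, and whether bfloat16 is supported on a given MPS or CPU setup depends on the installed PyTorch and OS versions, so treat this as a starting point rather than a verified configuration.

```python
def select_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA, then Apple MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def bf16_load_kwargs() -> dict:
    """Build keyword arguments for from_pretrained that request bfloat16 weights."""
    import torch  # imported lazily so select_device stays usable without torch

    device = select_device(
        torch.cuda.is_available(),
        torch.backends.mps.is_available(),
    )
    # Recent transformers releases accept `dtype`; older ones name it `torch_dtype`.
    return {"dtype": torch.bfloat16, "device_map": device}


# Intended use (not executed here):
# model = Qwen3VLForConditionalGeneration.from_pretrained(MODEL_ID, **bf16_load_kwargs())
```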

🚀 Quick Start

Installation

pip install transformers torch Pillow
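The example further down imports `Qwen3VLForConditionalGeneration`, which only exists in sufficiently recent `transformers` releases. A small check like the following (the helper `has_attr` is hypothetical) can confirm the class is importable before downloading any weights.

```python
import importlib


def has_attr(module_name: str, attr: str) -> bool:
    """Return True if the module imports and exposes the given attribute."""
    try:
        module = importlib.import_module(module_name)
        getattr(module, attr)
    except Exception:
        # Covers a missing package as well as a lazily imported class that fails to load.
        return False
    return True


if __name__ == "__main__":
    if not has_attr("transformers", "Qwen3VLForConditionalGeneration"):
        print("Qwen3VLForConditionalGeneration not found; try: pip install -U transformers")
```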

Example image

๐Ÿ‘ MM-1-a

Run the model directly

import torch
from transformers import AutoProcessor, Qwen3VLForConditionalGeneration
from PIL import Image

BASE_MODEL_ID = "madrisight/MadriMed-VL-2B-enc"

# Load the weights in bfloat16, as recommended above.
# Older transformers releases name this argument torch_dtype.
model = Qwen3VLForConditionalGeneration.from_pretrained(
    BASE_MODEL_ID,
    dtype=torch.bfloat16,
    device_map="cuda",
    trust_remote_code=True,
)

model.eval()

processor = AutoProcessor.from_pretrained(BASE_MODEL_ID, trust_remote_code=True)

def load_direct_image(path: str) -> Image.Image:
    """Open an image file and return it as an RGB PIL image."""
    with Image.open(path) as raw:
        img = raw.convert("RGB")
    return img


# Formulate the query
prompt = """Choose the correct option for the question

Instructions:
- Analyze ONLY the provided image.
- Do NOT use external medical knowledge.
- Briefly explain the visual evidence relevant to the question.

Question: 
Examine the mammogram image shown above. Which of the following findings is most evident?

Options
A. Well-circumscribed round mass with benign features
B. Clustered microcalcifications within an area of irregular density
C. Fat-containing lesion consistent with lipoma
D. Diffuse bilateral breast edema
"""

img = load_direct_image("/content/MM-1-a.png")

messages = [
    {
        "role": "system",
        "content": "You are an expert medical AI. You must deeply analyze the question and provide the final answer."
    },
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": prompt}
        ]
    }
]

# Stop generation at the end-of-turn token.
stop_token_id = processor.tokenizer.convert_tokens_to_ids("<|im_end|>")

with torch.inference_mode():
    # Render the chat template to a prompt string, then tokenize text and image together.
    text = processor.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )

    inputs = processor(
        text=text,
        images=[img],
        return_tensors="pt",
    ).to("cuda")

    generated_ids = model.generate(
        **inputs,
        max_new_tokens=1024,  # cap on generated tokens (limits drift)
        do_sample=False,  # greedy decoding, deterministic output
        pad_token_id=processor.tokenizer.pad_token_id,
        eos_token_id=stop_token_id
    )

    # Decode only the newly generated tokens, skipping the prompt.
    output_text = processor.batch_decode(
        generated_ids[:, inputs.input_ids.shape[1]:],
        skip_special_tokens=True
    )[0]

print(output_text.strip())
Example output

So, let's analyze the mammogram. The image shows a breast with some irregularities.
Looking at the options: A is about a well-circumscribed round mass, but the image doesn't show a clear mass.
B mentions clustered microcalcifications in an irregular density area. In mammograms, microcalcifications are often seen as small white spots, and irregular density might be a pattern.
C is a fat-containing lesion like a lipoma, but the image doesn't show fat density.
D is bilateral breast edema, which isn't visible here.
So B seems to fit because microcalcifications are a key finding in mammograms, especially when clustered.
</think>

B. Clustered microcalcifications within an area of irregular density
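The generated text contains the model's reasoning, a closing `</think>` tag, and then the final answer. A minimal sketch for separating the two and pulling out the chosen option letter follows. The helper names are hypothetical, and the regular expression assumes answers start with a letter from A to D followed by punctuation, as in the output shown here.

```python
import re


def split_reasoning(output_text: str):
    """Split generated text into (reasoning, answer) at the closing </think> tag."""
    reasoning, sep, answer = output_text.partition("</think>")
    if not sep:  # no tag found: treat the whole text as the answer
        return "", output_text.strip()
    return reasoning.strip(), answer.strip()


def extract_option(answer: str):
    """Return the leading option letter (A-D) of the answer, or None."""
    match = re.match(r"\s*([A-D])[.):]", answer)
    return match.group(1) if match else None


sample = (
    "So B seems to fit because microcalcifications are a key finding.\n"
    "</think>\n\n"
    "B. Clustered microcalcifications within an area of irregular density"
)
reasoning, answer = split_reasoning(sample)
print(extract_option(answer))  # B
```

Keeping the reasoning and the answer separate makes it easier to score multiple-choice accuracy without the reasoning text interfering with the match.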

🔬 Technical Details

Training Configuration

| Parameter | Value |
| --- | --- |
| Base model | Qwen/Qwen3-VL-2B-Thinking |
| Training data | medmax |
| Fine-tuning type | LoRA SFT + GRPO |
| Precision | bfloat16 |
| Hardware | Single Mac Mini (M4 Pro) with TrlMPS (https://github.com/krrish-v/trlmps) |
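Assuming the second fine-tuning stage is Group Relative Policy Optimization (GRPO), its core step is to score each sampled completion relative to the other completions for the same prompt. The sketch below shows only that normalization. The reward values are made up, the function name is hypothetical, and the population standard deviation is one choice among several used by implementations; none of this is taken from the MadriMed training setup.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one prompt's group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one question: 1.0 = correct option, 0.0 = wrong option.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that beat the group average receive positive advantages and the rest receive negative ones, so no separate value model is needed to form a baseline.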

๐Ÿ™ Acknowledgments


📄 Citation

If you use this model in research, please cite:

@software{madrimedvl2b,
  title = {MadriMed-VL-2B: A Compact Multimodal Medical Vision-Language Model},
  author = {Madrisight},
  year = {2026},
  url = {https://huggingface.co/madrisight/MadriMed-VL-2B}
}

Disclaimer: This model is provided for research and educational purposes only. It is not FDA-approved, not clinically validated, and must not be used for patient care without expert human oversight. The authors assume no liability for clinical use.
