MedGemma-4B Full LoRA (Layers 0-33): Multi-task n=12K

LoRA adapter for google/medgemma-4b-it, released as part of "Mechanistically Guided LoRA Improves Paraphrase Consistency in Medical Vision-Language Models" (Sadanadan & Behzadan, CHIL 2026).

This is the full arm of the paper: rank-16 adapters applied to all 34 layers of the language model. It serves as the high-capacity contrast point to the targeted-layer (L15-19) adapter, isolating the question of whether mechanistically motivated layer selection matters versus distributing adaptation across the whole stack.
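To make the contrast between the two arms concrete, the sketch below builds a regular expression that selects the adapted projection modules in a chosen range of decoder layers. The module path layout (`language_model.layers.<i>.self_attn.q_proj` and so on) and the helper name `layer_pattern` are assumptions modeled on common Gemma-style naming, not taken from the paper's code; check them against the output of `model.named_modules()` before relying on them.

```python
import re

# Assumed Hugging Face-style names for the adapted attention and MLP projections.
PROJECTIONS = ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]


def layer_pattern(first: int, last: int) -> re.Pattern:
    """Compile a regex matching the projections in decoder layers first..last (inclusive)."""
    layers = "|".join(str(i) for i in range(first, last + 1))
    projections = "|".join(PROJECTIONS)
    # Anchoring on "language_model.layers" keeps vision-tower modules out of the match.
    return re.compile(
        rf".*language_model\.layers\.({layers})\.(self_attn|mlp)\.({projections})"
    )


full_arm = layer_pattern(0, 33)       # all 34 layers, as in this release
targeted_arm = layer_pattern(15, 19)  # the targeted-layer contrast arm

name = "model.language_model.layers.7.self_attn.q_proj"  # hypothetical module path
print(bool(full_arm.fullmatch(name)), bool(targeted_arm.fullmatch(name)))
```

A pattern of this kind could be passed as a string-valued `target_modules` to a LoRA configuration, which is one way to restrict adaptation to the language model; whether the authors did it this way is not stated here.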

This release corresponds to the multi-task n=12K scale-up of the n=500 binary checkpoint reported in the submitted CHIL paper. It uses a sequence-level cross-entropy + symmetric KL loss compatible with all MIMIC-CXR question types.
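As a rough illustration of how a cross-entropy term and a symmetric KL term can be combined, here is a minimal pure-Python sketch. It scores a single answer distribution per paraphrase for brevity. The function names, the `kl_weight` parameter, and the 0.5 averaging are illustrative assumptions, not the paper's exact formulation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """KL(p || q) for two distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def paraphrase_loss(logits_a, logits_b, target, kl_weight=1.0):
    """Cross-entropy on both paraphrases plus a symmetric KL consistency term.

    kl_weight and the 0.5 averaging are assumptions made for this sketch.
    """
    p, q = softmax(logits_a), softmax(logits_b)
    # Supervised term: both paraphrases should predict the same target answer.
    ce = -0.5 * (math.log(p[target]) + math.log(q[target]))
    # Consistency term: penalize disagreement in either direction.
    sym_kl = 0.5 * (kl(p, q) + kl(q, p))
    return ce + kl_weight * sym_kl


# Toy logits over {yes, no} for a question and one paraphrase of it.
print(paraphrase_loss([2.0, 0.0], [0.5, 1.0], target=0))
```

When the two paraphrases yield identical logits the KL term vanishes and the loss reduces to plain cross-entropy, so the extra term only penalizes inconsistency.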

Training

| Setting | Value |
|---|---|
| Base model | google/medgemma-4b-it |
| Adapter rank (r) | 16 |
| Alpha | 32 |
| Dropout | 0.05 |
| Learning rate | 2e-4 |
| Effective batch size | 8 (batch 1, grad-accum 8) |
| Epochs | 3 |
| Target layers | 0-33 (all) |
| Target modules | Q, K, V, O attention projections + gate, up, down MLP projections |
| Training data | MIMIC-CXR train split, all question types, ~2,865 unique questions × 3 epochs of random paraphrase sampling ≈ 8,600 paraphrase pairs |
| Loss | Sequence-level cross-entropy on first answer token + symmetric KL divergence between paraphrase predictions |
| Trainable parameters | 29.8M (0.69% of base) |
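The trainable-parameter figure can be sanity-checked with a short calculation: a rank-r LoRA adapter on a linear layer adds r × (in_features + out_features) weights. The decoder dimensions below are assumptions about the Gemma 3 4B text model and should be verified against the base model's `config.json`; the helper name `lora_params` is illustrative.

```python
# Assumed Gemma 3 4B text-decoder dimensions; verify against the base model's config.json.
HIDDEN = 2560
INTERMEDIATE = 10240
NUM_HEADS, NUM_KV_HEADS, HEAD_DIM = 8, 4, 256
NUM_LAYERS = 34
RANK = 16

# (in_features, out_features) of each adapted linear layer.
SHAPES = {
    "q_proj": (HIDDEN, NUM_HEADS * HEAD_DIM),
    "k_proj": (HIDDEN, NUM_KV_HEADS * HEAD_DIM),
    "v_proj": (HIDDEN, NUM_KV_HEADS * HEAD_DIM),
    "o_proj": (NUM_HEADS * HEAD_DIM, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank=RANK):
    """LoRA adds A (rank x in_features) and B (out_features x rank)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o) for i, o in SHAPES.values())
total = per_layer * NUM_LAYERS
print(f"{per_layer:,} per layer, {total:,} total")
```

Under these assumed dimensions the total comes to 29,802,496, which agrees with the 29.8M reported above.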

Usage

from transformers import AutoProcessor, AutoModelForImageTextToText
from peft import PeftModel
import torch

# Load the base model in bfloat16 on the GPU.
base = AutoModelForImageTextToText.from_pretrained(
    "google/medgemma-4b-it",
    dtype=torch.bfloat16,
    device_map="cuda",
)
# Attach the LoRA adapter weights from this repository to the base model.
model = PeftModel.from_pretrained(base, "saillab/medgemma-4b-full-lora-mimic-mt-12k")
# The processor handles image preprocessing and text tokenization.
processor = AutoProcessor.from_pretrained("saillab/medgemma-4b-full-lora-mimic-mt-12k")

Intended use

Research on medical-VLM paraphrase robustness and LoRA-based fine-tuning. Not for clinical use. The CHIL paper documents that this full (all-layer) LoRA variant achieves the lowest flip rate, but at the cost of higher text-only agreement: relative to the targeted-layer variant, the model relies more on language priors than on image evidence.
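For readers reproducing a consistency measurement, a flip rate can be computed along the following lines. The definition used here (the fraction of question/paraphrase pairs whose normalized answers differ) is an assumption made for illustration; the paper's exact metric may differ, and the answer strings are made up.

```python
def normalize(answer: str) -> str:
    """Lowercase and trim an answer so trivial formatting differences do not count as flips."""
    return answer.strip().lower()


def flip_rate(pairs):
    """Fraction of (original, paraphrase) answer pairs whose normalized answers differ."""
    if not pairs:
        raise ValueError("need at least one answer pair")
    flips = sum(normalize(a) != normalize(b) for a, b in pairs)
    return flips / len(pairs)


# Hypothetical model answers to a question and to one paraphrase of it.
answers = [("yes", "Yes"), ("no", "yes"), ("no", "no "), ("yes", "no")]
print(flip_rate(answers))
```

A lower value means the model's answers are more stable under rewording, which by itself says nothing about whether the answers are grounded in the image.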

Citation (primary, CHIL 2026)

@inproceedings{sadanadan2026mechanistic,
  title     = {Mechanistically Guided LoRA Improves Paraphrase Consistency in Medical Vision-Language Models},
  author    = {Sadanadan, Binesh and Behzadan, Vahid},
  booktitle = {Conference on Health, Inference, and Learning (CHIL)},
  year      = {2026}
}

Companion evaluation work

@misc{sadanadan2026heatmap,
  title  = {Attention Without Grounding: Causal Evaluation of Visual Explanations in Medical Vision-Language Models},
  author = {Sadanadan, Binesh and Behzadan, Vahid},
  year   = {2026},
  note   = {Pre-print, SAIL Lab, University of New Haven}
}

License

Distributed under the Gemma Terms of Use, inheriting the licensing terms of the base model.
