Instructions for using Huzayfah-Patel/mindbridge-phq9-hindi-merged with libraries, inference providers, notebooks, and local apps.

Libraries

How to use Huzayfah-Patel/mindbridge-phq9-hindi-merged with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="Huzayfah-Patel/mindbridge-phq9-hindi-merged")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("Huzayfah-Patel/mindbridge-phq9-hindi-merged")
model = AutoModelForImageTextToText.from_pretrained("Huzayfah-Patel/mindbridge-phq9-hindi-merged")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Huzayfah-Patel/mindbridge-phq9-hindi-merged with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Huzayfah-Patel/mindbridge-phq9-hindi-merged"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Huzayfah-Patel/mindbridge-phq9-hindi-merged",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/Huzayfah-Patel/mindbridge-phq9-hindi-merged

SGLang

How to use Huzayfah-Patel/mindbridge-phq9-hindi-merged with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Huzayfah-Patel/mindbridge-phq9-hindi-merged" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Huzayfah-Patel/mindbridge-phq9-hindi-merged",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Huzayfah-Patel/mindbridge-phq9-hindi-merged" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Huzayfah-Patel/mindbridge-phq9-hindi-merged",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Unsloth Studio

How to use Huzayfah-Patel/mindbridge-phq9-hindi-merged with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Huzayfah-Patel/mindbridge-phq9-hindi-merged to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Huzayfah-Patel/mindbridge-phq9-hindi-merged to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Huzayfah-Patel/mindbridge-phq9-hindi-merged to start chatting

Load model with FastModel

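# Install first (shell): pip install unsloth
from unsloth import FastModel

# Load the merged weights; max_seq_length caps the context length used by Unsloth
model, tokenizer = FastModel.from_pretrained(
    model_name="Huzayfah-Patel/mindbridge-phq9-hindi-merged",
    max_seq_length=2048,
)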

Docker Model Runner
How to use Huzayfah-Patel/mindbridge-phq9-hindi-merged with Docker Model Runner:
```
docker model run hf.co/Huzayfah-Patel/mindbridge-phq9-hindi-merged
```

MindBridge Hindi PHQ-9/GAD-7 — Gemma 4 E2B Merged (fp16, ~5 GB)

This repo holds a fine-tune of google/gemma-4-E2B-it with the LoRA adapter merged into fp16 weights for direct inference. For the standalone LoRA adapter (~80-120 MB), suitable for adapter-merge workflows, see the companion repo Huzayfah-Patel/mindbridge-phq9-hindi-LoRA.

For on-device iOS deployment, the merged weights are quantized to INT8-apple via the Cactus CLI v1.14 cactus convert pipeline (~1 GB .cact bundle, ANE encoder routing preserved on Apple Silicon).

Use case

Hindi-first offline PHQ-9 + GAD-7 mental-health screening for India's ~1 million ASHA (Accredited Social Health Activist) community-health workers. The model is deployed on iPhone via the Cactus React Native SDK (INT8-apple variant on Apple Neural Engine + ARM SMMLA CPU acceleration). Given a PHQ-9 or GAD-7 item context plus a patient's Hindi utterance, the fine-tune teaches the base model to emit a Gemma 4 native interpret_response tool call that returns {score: int 0-3, rationale_english: str, confidence: float in {0.6, 0.8, 0.95}}.
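
The tool-call schema described above can be checked mechanically before a score is used. The following is a minimal sketch, assuming the tool-call arguments have already been parsed into a Python dict; the function name `validate_interpret_response` and the error messages are illustrative and not part of the project.

```python
from typing import Any

# Confidence values the card says the model is trained to emit.
ALLOWED_CONFIDENCES = (0.6, 0.8, 0.95)


def validate_interpret_response(args: dict[str, Any]) -> list[str]:
    """Return a list of schema problems; an empty list means the payload is valid."""
    problems = []

    score = args.get("score")
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(score, int) or isinstance(score, bool) or not 0 <= score <= 3:
        problems.append("score must be an integer in 0..3")

    if not isinstance(args.get("rationale_english"), str):
        problems.append("rationale_english must be a string")

    confidence = args.get("confidence")
    if isinstance(confidence, bool) or confidence not in ALLOWED_CONFIDENCES:
        problems.append("confidence must be one of 0.6, 0.8, 0.95")

    return problems
```

A caller would treat any non-empty result as a format failure rather than trying to repair the payload.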

Item-9 (suicidality) handling is layered on top via a deterministic rule engine in the iOS app: the LLM is one signal in a defense-in-depth pipeline, NOT the sole safety net (per project docs §9, hard-coded safety rules).
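
The defense-in-depth idea can be illustrated with a small combiner. This is only a sketch under stated assumptions: the card does not describe the rule engine's internals, so `rule_hit` stands in for its output, and both the "any non-zero score escalates" threshold and the "missing score escalates" fallback are assumptions made here for illustration.

```python
from typing import Optional


def item9_needs_escalation(rule_hit: bool, llm_score: Optional[int]) -> bool:
    """Combine a deterministic rule signal with the LLM's Item-9 score.

    rule_hit  -- True if the (hypothetical) rule engine flagged the utterance.
    llm_score -- the model's 0-3 score, or None if its output could not be parsed.
    """
    if rule_hit:
        # The deterministic rule wins regardless of what the LLM said.
        return True
    if llm_score is None:
        # Assumption: fail safe when the LLM signal is unavailable.
        return True
    # Assumption: any non-zero Item-9 score triggers escalation.
    return llm_score >= 1
```

With this shape, a miss by the LLM alone does not suppress an escalation that the rules have already raised.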

Evaluation — hierarchical kill-gate verdict

PASS — fine-tune ships. Measured on a 224-row held-out set (200 stratified random teacher carve-out + 24 hand-authored Item-9 adversarial; 222 rows published in the companion evaluation dataset after removing 2 byte-level leaks identified via utterance-only embedding audit).

| Gate | Threshold | Base | Fine-tuned | Verdict |
| --- | --- | --- | --- | --- |
| Format (JSON tool-call validity) | ≥ 95% | 100.0% | 100.0% | ✓ PASS |
| Safety (Item-9 sensitivity, 24 adversarial) | ≥ 90% | 95.8% (23/24) | 91.7% (22/24) | ✓ PASS (1-case regression vs base, disclosed below) |
| Utility (Likert accuracy on 200 main rows; gate is the delta vs base) | ≥ 10pp | 62.5% | 87.5% | ✓ PASS (+25.0pp, well above the bar) |
| Brier score (3-class confidence calibration; informational) | n/a | 0.290 | 0.125 | more than halved; substantially better calibrated |
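
The verdict in the table can be reproduced from the raw counts. The sketch below assumes "hierarchical" means the gates are checked in order (Format, then Safety, then Utility) and the first failure decides; the function name `kill_gate` is illustrative, and the thresholds are the ones listed in the table.

```python
def kill_gate(format_rate: float, item9_caught: int, item9_total: int,
              base_acc: float, tuned_acc: float) -> str:
    """Evaluate the three gates in order; rates and accuracies are percentages."""
    safety = 100.0 * item9_caught / item9_total
    utility_delta = tuned_acc - base_acc  # percentage points vs base

    gates = [
        ("format", format_rate >= 95.0),
        ("safety", safety >= 90.0),
        ("utility", utility_delta >= 10.0),
    ]
    for name, passed in gates:
        if not passed:
            return f"FAIL ({name})"
    return "PASS"


# Fine-tuned model numbers from the table above.
verdict = kill_gate(format_rate=100.0, item9_caught=22, item9_total=24,
                    base_acc=62.5, tuned_acc=87.5)
```

Note how thin the Safety margin is: with 24 adversarial rows, catching 21 instead of 22 gives 87.5% and fails the gate.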

Three caveats disclosed for full transparency:

Format-rate parser lenience. Phase 0's empirical base-model floor was 76.9% JSON validity (strict harness). The 100% Format rate here uses a lenient regex that tolerates Gemma 4's native unquoted-key tool-call syntax. Both base and fine-tune pass through the same parser, so the kill-gate verdict is apples-to-apples (relative comparison valid). The absolute 100% number should NOT be read as "+23pp parser-gap closure" — most of that gap is parser change, not model improvement.
1-case Item-9 regression vs base. Base catches 23/24 adversarials; fine-tune catches 22/24. Fine-tune clears the 90% Safety gate (91.7% > 90%) but at a 1-case cost. Reported because pretending otherwise would be dishonest; ships because (a) the pre-specified gate was cleared without exception, (b) the +25pp Utility lift is substantively large, (c) the production iOS app has a deterministic Item-9 rule engine layered on top of the LLM — the fine-tune's Item-9 sensitivity is one signal in a defense-in-depth pipeline.
In-distribution main split. The 200 main rows are a stratified random teacher carve-out (same Gemma 4 26B-A4B MoE teacher pool as training). Embedding-similarity audit on the Hindi utterance text alone (not full prompt) shows 173/200 flagged ≥0.8 cosine to nearest training neighbor (mean 0.882) — expected by design. The 24 adversarial rows are dad-authored single-source (mean cosine 0.770, max 0.911); these drive the Safety-gate measurement and are the cleaner generalization signal.
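
The parser-lenience caveat is easier to see in code. The sketch below contrasts a strict JSON parse with a lenient one that quotes bare keys first. The sample string is hypothetical and is not claimed to match Gemma 4's exact tool-call syntax or the project's actual regex; it only shows why the same output can fail a strict harness and pass a lenient one.

```python
import json
import re

# Match either a complete JSON string (left untouched) or a bare identifier
# that is immediately followed by a colon (treated as an unquoted key).
_TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|([A-Za-z_]\w*)(?=\s*:)')


def _quote_bare_key(match: re.Match) -> str:
    key = match.group(1)
    return match.group(0) if key is None else f'"{key}"'


def parse_strict(text: str):
    """Return the parsed object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


def parse_lenient(text: str):
    """Quote bare keys, then parse as JSON."""
    return json.loads(_TOKEN.sub(_quote_bare_key, text))


# Hypothetical model output with unquoted keys.
sample = '{score: 2, rationale_english: "low mood, sleep: poor", confidence: 0.8}'
strict_result = parse_strict(sample)
lenient_result = parse_lenient(sample)
```

Because string literals are consumed whole before bare identifiers are considered, text such as `sleep:` inside the rationale is not mistaken for a key.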

Hyperparameters (Unsloth QLoRA fine-tune on Colab Pro+ A100 40GB)

| Parameter | Value |
| --- | --- |
| Base model | `google/gemma-4-E2B-it` (2.3B effective params, dense + PLE, USM audio encoder) |
| Quantization (training) | 4-bit NF4 via bitsandbytes |
| LoRA rank `r` | 16 |
| LoRA `α` | 32 |
| LoRA dropout | 0.05 |
| Target modules (7) | `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` |
| Audio encoder modules | FROZEN via `requires_grad=False` (preserves Hindi audio quality per project audio hard-rule) |
| Learning rate | 2e-4 |
| LR scheduler | Cosine |
| Warmup ratio | 0.1 |
| Max grad norm | 0.3 |
| Optimizer | `adamw_8bit` |
| Precision | bf16 |
| Epochs | 3 |
| Effective batch size | 1 × 4 grad-accum = 4 |
| Random state | 3407 |
| Training corpus | 2,883 rows (companion `mindbridge-phq9-hindi-dialogues` dataset) |
| Wall time | ~2h00m on A100 40GB |
| Pilot ablation | Config A (2 epochs constant LR, dropout=0.0) vs Config B (3 epochs cosine LR, dropout=0.05) on 500 examples + 50-row main-only held-out subset; Config B winner by Likert tiebreak (+22pp delta vs A) after both cleared Format ≥95% gate. |
| Loss | All-token SFT fallback (assistant_only_loss=False): Gemma 4's chat template lacks `{% generation %}` markers that HuggingFace TRL `return_assistant_tokens_mask=True` requires; suboptimal but workable on templated tool-call task at 2,883 rows. |
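
To make the schedule-related rows concrete, the sketch below derives approximate optimizer-step counts from the table and evaluates a linear-warmup, cosine-decay learning rate. It is an approximation under stated assumptions: the rounding of steps per epoch and of warmup steps is chosen here for illustration, and the step counts in the actual training run may differ slightly.

```python
import math

# Values taken from the hyperparameter table.
ROWS, EPOCHS = 2883, 3
PER_DEVICE_BATCH, GRAD_ACCUM = 1, 4
PEAK_LR, WARMUP_RATIO = 2e-4, 0.1
LORA_R, LORA_ALPHA = 16, 32

# Assumption: partial final batches still count as one optimizer step.
steps_per_epoch = math.ceil(ROWS / (PER_DEVICE_BATCH * GRAD_ACCUM))  # 721
total_steps = steps_per_epoch * EPOCHS                               # 2163
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)                 # 217

# Standard LoRA scaling factor applied to the low-rank update: alpha / r.
lora_scale = LORA_ALPHA / LORA_R                                     # 2.0


def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, `lr_at(warmup_steps)` returns the peak rate of 2e-4 and `lr_at(total_steps)` returns 0.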

Authorship

Engineered by Huzayfah Patel — UK-registered psychiatrist + software engineer. Sole author of: training pipeline + hyperparameter selection + Unsloth/Colab infrastructure + iOS Cactus integration + this model card.

The companion datasets (Hindi seeds + audio fixtures) are co-authored with Nazir Patel (native Hindi reader/writer); see the dataset cards for data-authorship details. This model card attributes engineering work only.

Limitations

Not validated for clinical deployment. Multi-clinician inter-rater reliability study + ASHA field testing + CDSCO/DCGI regulatory review required before any India clinical deployment.
Single-clinician review of all synthetic vignettes (Huzayfah Patel alone) — no multi-rater concordance characterization.
Synthetic vignettes only. No real patient data. Persona bias rebalanced 1:1:1 across postnatal_mother / older_woman / man personas; underrepresents other demographic groups.
Item-9 sensitivity regressed by 1 case vs base on 24-row held-out adversarial subset (91.7% vs 95.8% base). Production app handles Item-9 (suicidality) via a deterministic rule engine layered on top of the LLM; the fine-tune is not the sole safety net.
In-distribution caveat on the 200-row Utility-gate evaluation (200 rows are stratified random carve-out from the same teacher pool as training; the 24 Item-9 adversarials are single-source dad-authored and genuinely held-out).
Pre-specified marginal-improvement policy enforced (6-9pp Likert delta → drop fine-tune; +25pp here cleared the threshold substantively, not marginally).
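
The in-distribution caveat rests on a nearest-neighbor similarity audit: each held-out utterance is compared with every training utterance, and rows whose best cosine similarity is at least 0.8 are flagged. The sketch below shows that computation on toy 2-D vectors. The embedding model used by the project is not stated in this card, so the vectors here are placeholders and the function names are illustrative.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def leakage_audit(eval_vecs, train_vecs, threshold: float = 0.8):
    """Return (flagged_count, mean_nearest_similarity) for the eval set."""
    nearest = [max(cosine(e, t) for t in train_vecs) for e in eval_vecs]
    flagged = sum(1 for s in nearest if s >= threshold)
    return flagged, sum(nearest) / len(nearest)


# Toy embeddings standing in for utterance vectors.
train = [[1.0, 0.0], [0.0, 1.0]]
held_out = [[1.0, 0.0], [1.0, 1.0]]
flagged, mean_nearest = leakage_audit(held_out, train)
```

In the toy data, the first held-out vector duplicates a training vector and is flagged, while the second sits at about 0.71 to its nearest neighbor and is not.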

Companion datasets

Huzayfah-Patel/mindbridge-phq9-hindi-seeds — 144 hand-authored Hindi seed examples
Huzayfah-Patel/mindbridge-phq9-hindi-dialogues — 2,883 ShareGPT-format training rows (Gemma 4 26B-A4B teacher expansions + routine seeds)
Huzayfah-Patel/mindbridge-phq9-hindi-evaluation — 222-row held-out + eval-results/ charts + verdict JSON
Huzayfah-Patel/mindbridge-phq9-hindi-audio-fixtures — 30 Hindi audio fixtures (16 kHz mono Int16 WAV)
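
A fixture can be checked against the stated audio format (16 kHz, mono, 16-bit PCM) with the Python standard library. This is a minimal sketch: it generates a short silent clip in memory so that it runs without downloading the dataset, and the helper names are illustrative.

```python
import io
import wave


def make_silence(rate: int = 16000, seconds: float = 0.1) -> bytes:
    """Build an in-memory mono 16-bit PCM WAV containing silence."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 2 bytes per sample = Int16
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * int(rate * seconds))
    return buf.getvalue()


def is_fixture_format(wav_bytes: bytes) -> bool:
    """True if the WAV is 16 kHz, mono, 16-bit PCM."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as w:
        return (w.getframerate() == 16000
                and w.getnchannels() == 1
                and w.getsampwidth() == 2)


ok = is_fixture_format(make_silence())
wrong_rate = is_fixture_format(make_silence(rate=44100))
```

To check a real fixture, read the file's bytes and pass them to `is_fixture_format`.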

Citation

@misc{patel2026mindbridge,
  title  = {MindBridge: Hindi-first PHQ-9/GAD-7 Screening with Gemma 4 E2B},
  author = {Patel, Huzayfah},
  year   = {2026},
  url    = {https://github.com/HP-00/MindBridge-Gemma-4},
  note   = {Gemma 4 Good Hackathon submission}
}

License

CC-BY 4.0. Upstream code (tools, notebooks, training pipeline) is Apache 2.0 — see https://github.com/HP-00/MindBridge-Gemma-4.

Project context

MindBridge is a Hindi-first offline PHQ-9 + GAD-7 mental-health screening app for India's 1 million ASHA workers, built on Gemma 4 E2B INT8-apple via Cactus React Native on iPhone, fine-tuned via Unsloth QLoRA. Submitted to the Gemma 4 Good Hackathon (deadline 2026-05-18, $200K prize pool). Upstream: https://github.com/HP-00/MindBridge-Gemma-4.

Safetensors
Model size: 5B params
Tensor type: BF16

Model tree for Huzayfah-Patel/mindbridge-phq9-hindi-merged
Base model: google/gemma-4-E2B
Finetuned: google/gemma-4-E2B-it