---
title: UncGPT — WebGPU caregiving chat
emoji: 🦉
colorFrom: indigo
colorTo: green
sdk: static
pinned: false
---

# UncGPT — WebGPU caregiving chat

In-browser WebGPU chat on the UncGPT caregiving small-model family. **Runs on your own GPU via WebGPU: no server inference, no ZeroGPU.** Tokens stream as they're generated; per-step status (download → load → prefill → decode tok/s) shown inline.

Companion to the [NeurIPS 2026 Competition Track proposal](https://github.com/Reza2kn/uncgpt-2026-neurips).

## Models in the picker

- **LFM2-small full-pass (step 12,399)**: ready. 69.8M params, 10 gated short-conv blocks + 6 GQA blocks, BitNet 1.58b. Trained on the live quality-clean 3,846-conversation 11-language corpus. Final logged LM loss 2.26. Decode bench: ~430–500 tok/s on commodity hardware.
- **UncGPT v1 langtag full-pass**: re-training in progress. Wires into the same picker once weights + WebGPU BitNet NPZ land on the v1 fullpass HF repo.

## What's in the bundle

- WebGPU runtime (Mamba-2 / GQA / BitNet 1.58b / MoE kernels) from the `uncgpt_browser_webgpu` source
- `sentencepiece-js` (raw `.model` byte-for-byte, matches Python SP segmentation exactly)
- JS sampling loop: temperature, top-p, repetition penalty, no-repeat n-gram, stop-on-``/``
- Chat UI with conversation history (last 3 turns fed back into the prompt)
- Per-stage progress bar (manifest → weights → tokenizer → GPU upload)

## Browser support

Latest **Chrome / Edge (desktop)**, Safari Tech Preview, or Firefox Nightly with WebGPU enabled. The header shows `WebGPU available ✓` when ready.

## License

Code: Apache-2.0. Weights and tokenizers: see the linked HF model repos.
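
## Sampling sketch

The sampling loop listed under "What's in the bundle" combines temperature, top-p, and a repetition penalty. The block below is a minimal sketch of one decode step under stated assumptions, not the code shipped in this Space: `SamplingOptions`, `sampleNextToken`, and the CTRL-style penalty rule are hypothetical choices made for illustration, and the no-repeat n-gram filter and stop-token check are left out.

```typescript
// Hypothetical sketch: names, defaults, and the penalty rule are assumptions,
// not taken from the uncgpt_browser_webgpu source.
interface SamplingOptions {
  temperature: number;       // > 0; lower values are more deterministic
  topP: number;              // nucleus mass in (0, 1]
  repetitionPenalty: number; // >= 1; 1 disables the penalty
}

function sampleNextToken(
  logits: Float32Array,
  previousTokens: number[],
  opts: SamplingOptions,
  rng: () => number = Math.random,
): number {
  // 1. Repetition penalty: make already-generated tokens less likely
  //    (divide positive logits, multiply negative ones).
  const adjusted = new Float32Array(logits);
  const seen: boolean[] = [];
  for (let i = 0; i < previousTokens.length; i++) {
    const t = previousTokens[i];
    if (t < 0 || t >= adjusted.length || seen[t]) continue;
    seen[t] = true;
    adjusted[t] = adjusted[t] > 0
      ? adjusted[t] / opts.repetitionPenalty
      : adjusted[t] * opts.repetitionPenalty;
  }

  // 2. Temperature-scaled softmax, stabilised by subtracting the max logit.
  let max = -Infinity;
  for (let i = 0; i < adjusted.length; i++) if (adjusted[i] > max) max = adjusted[i];
  const probs = new Float64Array(adjusted.length);
  let sum = 0;
  for (let i = 0; i < adjusted.length; i++) {
    probs[i] = Math.exp((adjusted[i] - max) / opts.temperature);
    sum += probs[i];
  }
  for (let i = 0; i < probs.length; i++) probs[i] /= sum;

  // 3. Top-p: keep the smallest set of most-probable tokens whose
  //    cumulative probability reaches topP.
  const order: number[] = [];
  for (let i = 0; i < probs.length; i++) order.push(i);
  order.sort((a, b) => probs[b] - probs[a]);
  let cumulative = 0;
  let cutoff = order.length;
  for (let k = 0; k < order.length; k++) {
    cumulative += probs[order[k]];
    if (cumulative >= opts.topP) { cutoff = k + 1; break; }
  }

  // 4. Draw from the nucleus, scaling the random draw by the kept mass.
  let r = rng() * cumulative;
  for (let k = 0; k < cutoff; k++) {
    r -= probs[order[k]];
    if (r <= 0) return order[k];
  }
  return order[cutoff - 1];
}
```

In a decode loop, a function like this would be called once per generated token with the logits read back from the GPU and the token IDs produced so far. Passing a seeded `rng` makes runs reproducible for debugging.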