Holo-3.1-4B-MLX-4bit

4-bit MLX quantization of Hcompany/Holo-3.1-4B, H Company's 4B vision-language computer-use agent (UI grounding; mobile, desktop, and web automation), built on Qwen3.5-VL. Converted with mlx-vlm for Apple Silicon and validated on vision inputs: in testing, it correctly grounded UI buttons and read on-screen text.

Other quantizations: 4-bit (this) · 8-bit

📚 Part of the Holo-3.1 MLX (computer-use) collection.

Precision: MLX 4-bit (language model quantized; vision tower kept in higher precision)
Type: Vision-language (image-text-to-text)
License: Apache 2.0

Requirements

Requires an mlx-vlm build with Qwen3.5-VL support. That support has landed on mlx-vlm main, but the qwen3_5_vision model type still needs a one-line allow-list addition (fix pending upstream):

pip install -U "git+https://github.com/Blaizzy/mlx-vlm"
# in mlx_vlm/models/qwen3_vl/vision.py, add "qwen3_5_vision" and "qwen3_5_moe_vision"
# to the allowed model_type list (until the official fix lands).
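Before editing the installed package, it can help to check whether the allow-list already contains both model types. The following is a minimal sketch, not part of mlx-vlm: the helper names `missing_model_types` and `installed_vision_file` are hypothetical, and because the exact shape of the allow-list is not shown here, the check only looks for the quoted strings anywhere in the file (so a match inside a comment would also count).

```python
import importlib.util
from pathlib import Path
from typing import List, Optional

# Model types named in the note above; both should appear in the allow-list.
REQUIRED_TYPES = ("qwen3_5_vision", "qwen3_5_moe_vision")


def missing_model_types(source: str, required=REQUIRED_TYPES) -> List[str]:
    """Return the required model types that never appear as a quoted string in `source`."""
    return [
        name
        for name in required
        if f'"{name}"' not in source and f"'{name}'" not in source
    ]


def installed_vision_file() -> Optional[Path]:
    """Locate mlx_vlm/models/qwen3_vl/vision.py in the active environment, if present."""
    spec = importlib.util.find_spec("mlx_vlm")  # does not import the package
    if spec is None or spec.origin is None:
        return None
    path = Path(spec.origin).parent / "models" / "qwen3_vl" / "vision.py"
    return path if path.is_file() else None
```

To use it, call `installed_vision_file()`, read the file with `path.read_text()`, and pass the text to `missing_model_types`. An empty list suggests the patch is already in place or the upstream fix has landed.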

Usage

python -m mlx_vlm.generate --model pipenetwork/Holo-3.1-4B-MLX-4bit \
  --image screenshot.png --prompt "What buttons are on screen?" --max-tokens 200
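For an agent loop that processes several screenshots, the same command can be driven from Python. The sketch below wraps only the CLI flags shown above (`--model`, `--image`, `--prompt`, `--max-tokens`); the function names `build_generate_command` and `run_generate` are hypothetical helpers, not part of mlx-vlm.

```python
import subprocess
import sys
from typing import List


def build_generate_command(
    image: str,
    prompt: str,
    model: str = "pipenetwork/Holo-3.1-4B-MLX-4bit",
    max_tokens: int = 200,
) -> List[str]:
    """Build the argv list for `python -m mlx_vlm.generate` with the flags shown above."""
    return [
        sys.executable, "-m", "mlx_vlm.generate",
        "--model", model,
        "--image", image,
        "--prompt", prompt,
        "--max-tokens", str(max_tokens),
    ]


def run_generate(image: str, prompt: str, **kwargs) -> str:
    """Run the CLI once and return its stdout; raises CalledProcessError on failure."""
    cmd = build_generate_command(image, prompt, **kwargs)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

For example, `run_generate("screenshot.png", "What buttons are on screen?")` reproduces the command above. Each call starts a new process and reloads the model, so this suits occasional calls rather than a tight loop.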

Conversion

python -m mlx_vlm.convert --hf-path Hcompany/Holo-3.1-4B --mlx-path <out> -q --q-bits 4 --q-group-size 64
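The `--q-bits 4 --q-group-size 64` settings imply slightly more than 4 bits per quantized weight, because each group also stores its own quantization parameters. The sketch below works through that arithmetic, assuming one 16-bit scale and one 16-bit bias per group (my understanding of MLX's affine quantization, not something stated in this card). The 4e9 parameter count is an illustrative round figure taken from the "4B" in the model name; the real file size differs because the vision tower is kept in higher precision.

```python
def effective_bits_per_weight(q_bits: int, group_size: int, meta_bits: int = 32) -> float:
    """Bits per weight including per-group overhead.

    meta_bits is the assumed per-group overhead: a 16-bit scale plus a 16-bit bias.
    """
    return q_bits + meta_bits / group_size


def estimated_size_gb(num_params: float, q_bits: int, group_size: int) -> float:
    """Rough size in GB (10^9 bytes) if every parameter were quantized this way."""
    return num_params * effective_bits_per_weight(q_bits, group_size) / 8 / 1e9


bpw = effective_bits_per_weight(q_bits=4, group_size=64)
print(bpw)  # 4.5

# Hypothetical parameter count, for illustration only.
print(round(estimated_size_gb(4e9, q_bits=4, group_size=64), 2))  # 2.25
```

A smaller group size tracks the weight distribution more closely but raises the overhead: under the same assumption, a group size of 32 would cost 5 bits per weight.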

Converted by pipenetwork with mlx-vlm. The original model is released by H Company under Apache-2.0; pipenetwork is not affiliated with H Company.
