Holo-3.1-4B-MLX-4bit

4-bit MLX quantization of Hcompany/Holo-3.1-4B, H Company's 4B vision-language computer-use agent (UI grounding; mobile, desktop, and web automation), built on Qwen3.5-VL. Converted with mlx-vlm for Apple Silicon. In testing, the quantized model correctly grounded UI buttons and read on-screen text.

Other quantizations: 4-bit (this) · 8-bit

📚 Part of the Holo-3.1 MLX (computer-use) collection.


Precision	MLX 4-bit (LM quantized; vision tower kept in higher precision)
Type	Vision-language (image-text-to-text)
License	Apache 2.0

Requirements

Requires a build of mlx-vlm with Qwen3.5-VL support. That support has landed on mlx-vlm main, but the qwen3_5_vision model type still needs a one-line allow-list addition until the upstream fix lands:

pip install -U "git+https://github.com/Blaizzy/mlx-vlm"
# in mlx_vlm/models/qwen3_vl/vision.py, add "qwen3_5_vision" and "qwen3_5_moe_vision"
# to the allowed model_type list (until the official fix lands).

Usage

python -m mlx_vlm.generate --model pipenetwork/Holo-3.1-4B-MLX-4bit \
  --image screenshot.png --prompt "What buttons are on screen?" --max-tokens 200
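
To run the same command over several screenshots, one option is a small Python wrapper around the CLI. This is a minimal sketch that uses only the flags shown above; the function names (build_generate_command, describe_screenshot) are hypothetical, and it assumes mlx-vlm is installed in the same interpreter that runs the script.

```python
import subprocess
import sys

MODEL = "pipenetwork/Holo-3.1-4B-MLX-4bit"


def build_generate_command(image, prompt, model=MODEL, max_tokens=200):
    """Build the argument list for the mlx_vlm.generate command shown above."""
    return [
        sys.executable, "-m", "mlx_vlm.generate",
        "--model", model,
        "--image", str(image),
        "--prompt", prompt,
        "--max-tokens", str(max_tokens),
    ]


def describe_screenshot(image, prompt):
    """Run the CLI on one image and return whatever it printed to stdout."""
    result = subprocess.run(
        build_generate_command(image, prompt),
        check=True,           # raise CalledProcessError if the CLI exits non-zero
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Each command-line argument is treated as a screenshot path.
    for path in sys.argv[1:]:
        print(f"== {path} ==")
        print(describe_screenshot(path, "What buttons are on screen?"))
```

Each call starts a new process and reloads the model, which is simple but slow for large batches.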