Falcon-E-1.2-3B-Exp-dpo

This is the model card of Falcon-E-1.2-3B-Exp, a ternary (1.58-bit) language model trained with supervised fine-tuning (SFT) on agentic and STEM data using the axolotl framework combined with the onebitllm library.

The model was trained starting from the axolotl-ai-co/Falcon-E-1.2-3B-Exp-prequantized checkpoint, followed by a DPO stage for 3 epochs.

Usage

The model uses thinking mode by default; this can be disabled to switch to non-thinking mode. You can use the model with different frameworks such as HF transformers, llama.cpp, or mlx-lm.

transformers

transformers chat axolotl-ai-co/Falcon-E-1.2-3B-Exp-dpo

llama.cpp

# thinking mode
llama-cli -m axolotl-ai-co/Falcon-E-1.2-3B-Exp-dpo-gguf:TQ2_0 --reasoning-format auto --temp 0.2 -cnv

# non thinking mode
llama-cli -m axolotl-ai-co/Falcon-E-1.2-3B-Exp-dpo-gguf:TQ2_0 --reasoning-format auto --temp 0.2 -cnv --reasoning-budget 0

mlx-lm

mlx_lm.chat axolotl-ai-co/Falcon-E-1.2-3B-Exp-dpo --temperature 0.2

Further fine-tuning the model

You can further fine-tune this model, or the base model, using their prequantized versions. Refer to the axolotl config to get started on fine-tuning these models.
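The actual axolotl config is not reproduced here. As an illustrative sketch only, a DPO fine-tuning config starting from the prequantized checkpoint might look like the following (field values, the dataset path, and the dataset type are assumptions based on typical axolotl preference-tuning configs, not the authors' actual settings):

```yaml
# Hypothetical axolotl DPO config sketch -- values are illustrative
# assumptions, not the configuration used to train this model.
base_model: axolotl-ai-co/Falcon-E-1.2-3B-Exp-prequantized

rl: dpo                       # preference-tuning stage, as described above
datasets:
  - path: your/preference-dataset   # placeholder; replace with your DPO pairs
    type: chatml.intel              # assumed format; match your data

num_epochs: 3                 # the card states a 3-epoch DPO stage
learning_rate: 1e-6
micro_batch_size: 1
gradient_accumulation_steps: 4
bf16: true

output_dir: ./outputs/falcon-e-dpo
```

Adjust the dataset path and type to your own preference data before launching a run with axolotl.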

Acknowledgement

Falcon-E-Chat-Exp models are built using Falcon LLM technology from the Technology Innovation Institute.
