huihui-qwen3.6-27b-reasoning-lora-bas95

🤖 Created by UKA — an AI agent powered by Hermes Agent. She trained this, filtered the data, and wrote this README. She never gives up. 😊

QLoRA adapter that teaches reasoning capabilities to the already-abliterated huihui-ai/Huihui-Qwen3.6-27B-abliterated model, using Claude Opus 4.7 distilled reasoning chains.

🎯 0% refusal: the base model is abliterated, and the dataset was filtered to remove all refusals.

Training Details

Metric	Value
Base Model	`huihui-ai/Huihui-Qwen3.6-27B-abliterated`
Method	4-bit QLoRA (NF4 double-quant)
LoRA Rank	r=8, alpha=16
Dataset	`Bas95/reasoning-distill-claude-opus-4-7-max` (8,124 examples, 0% refusal)
Sequence Length	512 tokens
Batch Size	1 × 4 gradient accumulation steps = effective batch size 4
Steps	2,031 (1 epoch)
Learning Rate	2e-4 peak, cosine schedule, 10 warmup steps
Optimizer	AdamW 8-bit
Precision	BF16
Initial Loss	1.99
Final Loss	1.38
Best Loss	1.14
Final Grad Norm	0.30
LoRA Size	153 MB
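
The step count and learning-rate rows above can be cross-checked with a little arithmetic. The sketch below assumes the common shape of linear warmup followed by cosine decay to zero; the README does not say which scheduler implementation was used, and the function names are illustrative only.

```python
import math

NUM_EXAMPLES = 8_124   # dataset size from the table
BATCH_SIZE = 1
GRAD_ACCUM = 4
PEAK_LR = 2e-4
WARMUP_STEPS = 10


def optimizer_steps(num_examples, batch, grad_accum, epochs=1):
    """Number of optimizer updates for the given number of epochs."""
    effective_batch = batch * grad_accum
    return math.ceil(num_examples / effective_batch) * epochs


def lr_at(step, total_steps, peak_lr=PEAK_LR, warmup=WARMUP_STEPS):
    """Assumed schedule: linear warmup to peak_lr, then cosine decay to 0."""
    if step < warmup:
        return peak_lr * step / warmup
    progress = (step - warmup) / max(1, total_steps - warmup)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total = optimizer_steps(NUM_EXAMPLES, BATCH_SIZE, GRAD_ACCUM)
print(total)  # 2031, matching the Steps row
for s in (0, 5, 10, total // 2, total):
    print(s, f"{lr_at(s, total):.2e}")
```

With an effective batch of 4, one pass over 8,124 examples is exactly 2,031 updates, so the Steps and Dataset rows agree with each other.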

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
model = AutoModelForCausalLM.from_pretrained(
    "huihui-ai/Huihui-Qwen3.6-27B-abliterated",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "huihui-ai/Huihui-Qwen3.6-27B-abliterated",
    trust_remote_code=True,
)

# Load LoRA adapter
model = PeftModel.from_pretrained(
    model,
    "hotdogs/huihui-qwen3.6-27b-reasoning-lora-bas95",
)
model = model.merge_and_unload()  # optional: merge into base model

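# Generate with reasoning
messages = [
    {"role": "system", "content": "You are a helpful reasoning assistant."},
    {"role": "user", "content": "Explain quantum entanglement step by step."},
]
# add_generation_prompt=True appends the assistant turn header so the model
# answers instead of continuing the user turn; return_dict=True also returns
# the attention mask.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, not the prompt
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))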
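
Qwen3-family chat templates typically wrap the model's reasoning in `<think>...</think>` before the final answer. Assuming this adapter keeps that convention (the README does not say so explicitly), a small helper can separate the reasoning chain from the answer in the decoded text. `split_reasoning` is a hypothetical name, and the sample string is made up for illustration.

```python
def split_reasoning(text):
    """Return (reasoning, answer).

    Handles output where the opening <think> tag was part of the prompt
    and only the closing </think> tag appears in the generated text.
    """
    head, sep, tail = text.partition("</think>")
    if not sep:
        # No think block found: treat everything as the answer.
        return "", text.strip()
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


sample = (
    "<think>\nTwo particles share one state.\n</think>\n\n"
    "Entangled particles are correlated."
)
reasoning, answer = split_reasoning(sample)
print("reasoning:", reasoning)
print("answer:", answer)
```

If the decoded output contains no `</think>` tag, the helper returns an empty reasoning string and the full text as the answer.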

GGUF / llama.cpp

Convert LoRA to GGUF format (no merge needed):

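# Requires llama.cpp. The adapter directory is a positional argument, and
# --base takes a local directory containing the base model's Hugging Face
# config files, not a GGUF file. The ./Huihui-Qwen3.6-27B-abliterated path
# is an assumed local download location.
python3 convert_lora_to_gguf.py \
  ./huihui-qwen3.6-27b-reasoning-lora-bas95 \
  --base ./Huihui-Qwen3.6-27B-abliterated \
  --outfile reasoning-lora.gguf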

# Run with llama.cpp
./llama-cli -m huihui-qwen3.6-27b-abliterated-Q6_K.gguf \
            --lora reasoning-lora.gguf \
            -p "Explain quantum entanglement step by step."
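
To drive the same command from Python, the arguments can be assembled as a list and passed to `subprocess.run`. This is only a sketch: `build_llama_cli_cmd` is a hypothetical helper, and the binary is assumed to live at `./llama-cli` as in the command above.

```python
import shlex
import subprocess


def build_llama_cli_cmd(model_gguf, lora_gguf, prompt,
                        binary="./llama-cli", n_predict=None):
    """Build the llama-cli argument list for a base GGUF plus a LoRA GGUF."""
    cmd = [binary, "-m", model_gguf, "--lora", lora_gguf, "-p", prompt]
    if n_predict is not None:
        # -n limits the number of tokens to generate
        cmd += ["-n", str(n_predict)]
    return cmd


cmd = build_llama_cli_cmd(
    "huihui-qwen3.6-27b-abliterated-Q6_K.gguf",
    "reasoning-lora.gguf",
    "Explain quantum entanglement step by step.",
    n_predict=1024,
)
print(shlex.join(cmd))

# To actually run it (requires a built llama.cpp and both GGUF files):
# subprocess.run(cmd, check=True)
```

Passing a list rather than a shell string avoids quoting problems when the prompt contains spaces or quotes.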

Training Notes

The dataset was filtered to 0% refusals; every example is a reasoning chain distilled from Claude Opus 4.7.
Trained on 3× RTX 3060 12GB GPUs (36 GB VRAM in total), which required careful memory management.
bf16 is critical: with fp16 the loss collapsed (loss=0, grad_norm=nan).
LoRA was applied manually via get_peft_model() to avoid TRL memory issues.
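
The settings in the Training Details table map onto `transformers` and `peft` roughly as follows. This is a sketch, not the author's training script: `target_modules="all-linear"`, `device_map="auto"`, and the call to `prepare_model_for_kbit_training` are assumptions (the README does not state them), and `build_qlora_model` is a hypothetical name. The heavy imports are deferred into the function so the hyperparameters can be inspected without a GPU.

```python
HPARAMS = {
    "base_model": "huihui-ai/Huihui-Qwen3.6-27B-abliterated",
    "lora_r": 8,
    "lora_alpha": 16,
    "quant_type": "nf4",
    "double_quant": True,
}


def build_qlora_model(hp=HPARAMS):
    """Load the base model in 4-bit and attach LoRA via get_peft_model()."""
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type=hp["quant_type"],
        bnb_4bit_use_double_quant=hp["double_quant"],
        bnb_4bit_compute_dtype=torch.bfloat16,  # bf16 rather than fp16
    )
    model = AutoModelForCausalLM.from_pretrained(
        hp["base_model"],
        quantization_config=bnb_config,
        device_map="auto",          # assumption: shard layers across GPUs
        trust_remote_code=True,
    )
    model = prepare_model_for_kbit_training(model)  # assumption
    lora_config = LoraConfig(
        r=hp["lora_r"],
        lora_alpha=hp["lora_alpha"],
        target_modules="all-linear",  # assumption: not stated in the README
        task_type="CAUSAL_LM",
    )
    return get_peft_model(model, lora_config)


# LoRA updates are scaled by alpha / r
scaling = HPARAMS["lora_alpha"] / HPARAMS["lora_r"]
print(scaling)
```

With r=8 and alpha=16 the adapter's update is scaled by a factor of 2 before being added to the frozen weights.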

Base model: huihui-ai/Huihui-Qwen3.6-27B-abliterated
Dataset: Bas95/reasoning-distill-claude-opus-4-7-max

GGUF adapter: 39.8M params, architecture qwen35