Text Generation
Transformers
Safetensors
English
Chinese
function-calling
tool-use
crypto
blockchain
solana
ethereum
on-device
privacy
edge-ai
mobile
wallet
standard-protocol
Instructions to use DMindAI/DMind-3-nano with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use DMindAI/DMind-3-nano with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="DMindAI/DMind-3-nano")

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("DMindAI/DMind-3-nano", dtype="auto")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use DMindAI/DMind-3-nano with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "DMindAI/DMind-3-nano"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "DMindAI/DMind-3-nano",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker
docker model run hf.co/DMindAI/DMind-3-nano
- SGLang
How to use DMindAI/DMind-3-nano with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "DMindAI/DMind-3-nano" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "DMindAI/DMind-3-nano",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "DMindAI/DMind-3-nano" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "DMindAI/DMind-3-nano",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
- Docker Model Runner
How to use DMindAI/DMind-3-nano with Docker Model Runner:
docker model run hf.co/DMindAI/DMind-3-nano
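The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library; it assumes the server is already running locally and that its response follows the OpenAI completions schema (`choices[0].text`). The helper names `build_completion_request` and `complete` are hypothetical and not part of vLLM or SGLang.

```python
import json
from urllib import request


def build_completion_request(base_url, model, prompt, max_tokens=512, temperature=0.5):
    """Build a POST request for an OpenAI-compatible /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return request.Request(
        url=f"{base_url.rstrip('/')}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(base_url, model, prompt, **kwargs):
    """Send the request and return the generated text.

    Assumes the response body has the OpenAI completions shape.
    """
    req = build_completion_request(base_url, model, prompt, **kwargs)
    with request.urlopen(req, timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]


# Example call (requires a running server, so it is left commented out):
# print(complete("http://localhost:8000", "DMindAI/DMind-3-nano", "Once upon a time,"))
```

With the commands shown above, the vLLM server listens on port 8000 and the SGLang server on port 30000, so only `base_url` changes between the two.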
#!/usr/bin/env bash
# FunctionGemma SFT LoRA quickstart

# Stop on the first failing command so the success message at the end
# only prints after a successful training run.
set -euo pipefail

# Environment
export CUDA_VISIBLE_DEVICES=0  # e.g. "0,1,2,3" for multi-GPU
export TOKENIZERS_PARALLELISM=false

# Model path (update to your local model location)
MODEL_PATH="/path/to/your/functiongemma-270m-it"

# Dataset path
DATASET_PATH="./data/training_data.json"

# Output directory
OUTPUT_DIR="./runs"

# Run name
RUN_NAME="functiongemma-lora-$(date +%Y%m%d_%H%M%S)"

echo "========================================"
echo "FunctionGemma SFT LoRA training"
echo "========================================"
echo "Model: $MODEL_PATH"
echo "Dataset: $DATASET_PATH"
echo "Output: $OUTPUT_DIR/$RUN_NAME"
echo "========================================"

# Option 1: Standard LoRA (recommended for most GPUs)
python -m src.train \
  --model_path "$MODEL_PATH" \
  --dataset_path "$DATASET_PATH" \
  --output_dir "$OUTPUT_DIR" \
  --run_name "$RUN_NAME" \
  --lora_r 16 \
  --lora_alpha 32 \
  --lora_dropout 0.05 \
  --num_train_epochs 3 \
  --per_device_train_batch_size 4 \
  --gradient_accumulation_steps 4 \
  --learning_rate 5e-5 \
  --warmup_ratio 0.1 \
  --max_seq_length 2048 \
  --bf16 \
  --logging_steps 10 \
  --save_steps 100 \
  --eval_steps 100 \
  --gradient_checkpointing

# Option 2: QLoRA (for smaller GPUs, uncomment to use)
# python -m src.train \
#   --model_path "$MODEL_PATH" \
#   --dataset_path "$DATASET_PATH" \
#   --output_dir "$OUTPUT_DIR" \
#   --run_name "$RUN_NAME-qlora" \
#   --lora_r 16 \
#   --lora_alpha 32 \
#   --lora_dropout 0.05 \
#   --num_train_epochs 3 \
#   --per_device_train_batch_size 8 \
#   --gradient_accumulation_steps 2 \
#   --learning_rate 2e-4 \
#   --warmup_ratio 0.1 \
#   --max_seq_length 2048 \
#   --use_4bit \
#   --logging_steps 10 \
#   --save_steps 100 \
#   --eval_steps 100 \
#   --gradient_checkpointing

echo "========================================"
echo "Training finished!"
echo "Model saved to: $OUTPUT_DIR/$RUN_NAME"
echo "========================================"
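To see how the batch and warmup flags in the two options interact, the arithmetic can be sketched in Python. This is a rough estimate under stated assumptions: the dataset size of 10,000 examples is hypothetical, the function names are illustrative, and the actual `src.train` script may round step counts differently (for example, by dropping a partial final batch).

```python
import math


def effective_batch_size(per_device_batch, grad_accum_steps, num_gpus=1):
    """Examples consumed per optimizer step across all devices."""
    return per_device_batch * grad_accum_steps * num_gpus


def estimate_schedule(num_examples, per_device_batch, grad_accum_steps,
                      epochs, warmup_ratio, num_gpus=1):
    """Approximate optimizer steps and warmup steps for a training run."""
    batch = effective_batch_size(per_device_batch, grad_accum_steps, num_gpus)
    steps_per_epoch = math.ceil(num_examples / batch)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    return {
        "effective_batch_size": batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


# Hypothetical dataset of 10,000 examples on a single GPU.
standard = estimate_schedule(10_000, per_device_batch=4, grad_accum_steps=4,
                             epochs=3, warmup_ratio=0.1)
qlora = estimate_schedule(10_000, per_device_batch=8, grad_accum_steps=2,
                          epochs=3, warmup_ratio=0.1)
print("Standard LoRA:", standard)
print("QLoRA:", qlora)
```

Both options give the same effective batch size per GPU (4 × 4 and 8 × 2 are both 16), so under this estimate they run the same number of optimizer steps. In the script's flags, the two options differ in learning rate (5e-5 vs. 2e-4) and in precision settings (`--bf16` vs. `--use_4bit`).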