Libraries

How to use nextai-team/Moe-3x7b-QA-Code-Inst with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="nextai-team/Moe-3x7b-QA-Code-Inst")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
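The pipeline call above returns a nested structure rather than a bare string. The helper below is a minimal sketch for pulling out the reply, assuming the result has the shape recent Transformers versions produce for chat-style input: a list holding one dict whose `generated_text` is the conversation with the assistant turn appended. The function name `last_assistant_message` and the sample data are illustrative and not part of the library.

```python
def last_assistant_message(pipeline_output):
    """Return the text of the final assistant turn from a chat-style pipeline result."""
    # Assumed shape: [{"generated_text": [{"role": ..., "content": ...}, ...]}]
    conversation = pipeline_output[0]["generated_text"]
    if isinstance(conversation, str):
        # Plain-string prompts yield a plain string instead of a message list.
        return conversation
    for message in reversed(conversation):
        if message.get("role") == "assistant":
            return message["content"]
    raise ValueError("no assistant message found in pipeline output")


# Hand-written sample that mirrors the assumed shape (not real model output).
example_output = [{
    "generated_text": [
        {"role": "user", "content": "Who are you?"},
        {"role": "assistant", "content": "I am a coding assistant."},
    ]
}]
print(last_assistant_message(example_output))
```

With a loaded pipeline, the same helper would be called as `last_assistant_message(pipe(messages))`.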

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("nextai-team/Moe-3x7b-QA-Code-Inst")
model = AutoModelForCausalLM.from_pretrained("nextai-team/Moe-3x7b-QA-Code-Inst")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use nextai-team/Moe-3x7b-QA-Code-Inst with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "nextai-team/Moe-3x7b-QA-Code-Inst"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "nextai-team/Moe-3x7b-QA-Code-Inst",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
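The curl call above can also be issued from Python. The following is a sketch using only the standard library. It assumes the server is reachable at `http://localhost:8000` as in the command above and that the response follows the OpenAI chat-completion layout (`choices[0].message.content`). The helper names `build_chat_request`, `extract_reply`, and `ask` are illustrative.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8000",
                       model="nextai-team/Moe-3x7b-QA-Code-Inst"):
    """Build a POST request equivalent to the curl call."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body):
    """Pull the assistant text out of an OpenAI-style chat completion body."""
    return json.loads(response_body)["choices"][0]["message"]["content"]


def ask(prompt, **kwargs):
    """Send the request and return the reply; requires a running server."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=120) as response:
        return extract_reply(response.read())
```

Once the server is up, `ask("What is the capital of France?")` sends the same request as the curl example. Passing `base_url="http://localhost:30000"` targets a server listening on that port instead.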

Use Docker

docker model run hf.co/nextai-team/Moe-3x7b-QA-Code-Inst

SGLang

How to use nextai-team/Moe-3x7b-QA-Code-Inst with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "nextai-team/Moe-3x7b-QA-Code-Inst" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "nextai-team/Moe-3x7b-QA-Code-Inst",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
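For long answers it can help to read the reply incrementally. OpenAI-compatible servers generally stream when the request body includes `"stream": true`, sending server-sent events of the form `data: {...}` and finishing with `data: [DONE]`. The parser below is a sketch under that assumption; the function name `iter_stream_text` and the sample events are illustrative, not captured server output.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style server-sent event lines."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a data event
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample events showing the assumed wire format.
sample = [
    b'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}\n',
    b'\n',
    b'data: {"choices": [{"index": 0, "delta": {"content": "Par"}}]}\n',
    b'data: {"choices": [{"index": 0, "delta": {"content": "is"}}]}\n',
    b'data: [DONE]\n',
]
print("".join(iter_stream_text(sample)))  # prints "Paris"
```

Against a live server, the response object returned by `urllib.request.urlopen` can be passed directly as `lines`, since it iterates over the body line by line.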

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "nextai-team/Moe-3x7b-QA-Code-Inst" \
        --host 0.0.0.0 \
        --port 30000

Docker Model Runner
How to use nextai-team/Moe-3x7b-QA-Code-Inst with Docker Model Runner:
```
docker model run hf.co/nextai-team/Moe-3x7b-QA-Code-Inst
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

Model Details

Model Name: Moe-3x7b-QA-Code-Inst
Publisher: nextai-team
Model Type: Question Answering & Code Generation
Architecture: Mixture of Experts (MoE)
Model Size: 3x7 billion parameters

Overview

Moe-3x7b-QA-Code-Inst is a Mixture of Experts model from the nextai-team, built for question answering and code generation. It builds on its predecessor, Moe-2x7b-QA-Code, with refined mechanisms and expanded training datasets intended to deliver more precise and contextually relevant responses.

Intended Use

This model is intended for developers, data scientists, and researchers seeking to integrate sophisticated natural language understanding and code generation functionalities into their applications. Ideal use cases include but are not limited to:

Automated coding assistance
Technical support bots
Educational tools for learning programming
Enhancing code review processes

Model Architecture

Moe-3x7b-QA-Code-Inst employs a Mixture of Experts (MoE) architecture. In an MoE model, a router sends each token to a subset of expert networks, so only part of the total parameter count is active for any given token. This lets the model dedicate capacity to specialized tasks, which is intended to help it discern nuances in programming languages and natural language queries and to improve code generation and question answering accuracy.
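To make the idea of expert routing concrete, here is a toy sketch of top-k gating in plain Python. It is a generic illustration, not this model's implementation: the card does not state how many experts are active per token or how the router is built, so `top_k=2`, the scalar input, and the three stand-in expert functions are all assumptions.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(x, router_logits, experts, top_k=2):
    """Combine the outputs of the top_k experts, weighted by renormalized router scores."""
    # Rank experts by router score, highest first (ties keep their original order).
    ranked = sorted(range(len(experts)), key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:top_k]
    # Renormalize over the chosen experts only; the others are never evaluated.
    weights = softmax([router_logits[i] for i in chosen])
    output = sum(w * experts[i](x) for w, i in zip(weights, chosen))
    return output, chosen


# Three toy "experts" standing in for full feed-forward blocks (hypothetical).
experts = [lambda x: 2.0 * x, lambda x: x + 1.0, lambda x: -x]
output, chosen = moe_forward(3.0, router_logits=[1.0, 1.0, -5.0], experts=experts)
print(chosen, output)  # prints "[0, 1] 5.0"
```

The third expert is never called for this input, which is the property that keeps per-token compute below what the total parameter count would suggest.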

Training Data

The model has been trained on a diverse and extensive corpus comprising technical documentation, open-source code repositories, Stack Overflow questions and answers, and other programming-related texts. Special attention has been given to ensure a wide range of programming languages and frameworks are represented in the training data to enhance the model's versatility.

Performance

The nextai-team reports that Moe-3x7b-QA-Code-Inst improves on its predecessor in accuracy and relevance, particularly in complex coding scenarios and detailed technical queries. This card does not include benchmark figures; benchmarks and performance metrics can be provided upon request.

Limitations and Biases

While Moe-3x7b-QA-Code-Inst represents a leap forward in AI-assisted coding and technical Q&A, it is not without limitations. The model may exhibit biases present in its training data, and its performance can vary based on the specificity and context of the input queries. Users are encouraged to critically assess the model's output and consider it as one of several tools in the decision-making process.

Ethical Considerations

We are committed to ethical AI development and urge users to employ Moe-3x7b-QA-Code-Inst responsibly. This includes but is not limited to avoiding the generation of harmful or unsafe code, respecting copyright and intellectual property rights, and being mindful of privacy concerns when inputting sensitive information into the model.

Usage Instructions

For detailed instructions on how to integrate and utilize Moe-3x7b-QA-Code-Inst in your projects, please refer to our GitHub repository and Hugging Face documentation.

Citation

If you use Moe-3x7b-QA-Code-Inst in your research or application, please cite it as follows:

@misc{nextai2024moe3x7b,
  title={Moe-3x7b-QA-Code-Inst: Enhancing Question Answering and Code Generation with Mixture of Experts},
  author={NextAI Team},
  year={2024},
  publisher={Hugging Face}
}

Safetensors

Model size: 19B params
Tensor type: F16
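
The 19B parameter count and F16 tensor type listed above give a rough lower bound on the memory needed just to hold the weights. The sketch below is back-of-the-envelope arithmetic only: "19B" is a rounded figure, and the estimate ignores activations, the KV cache, and framework overhead. The byte sizes for the other tensor types are included for comparison and do not describe published checkpoints.

```python
# Bytes used to store one parameter in each tensor type.
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2, "INT8": 1}


def weight_memory_gib(num_params, tensor_type="F16"):
    """Approximate size of the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[tensor_type] / 1024**3


for tensor_type in ("F32", "F16", "INT8"):
    print(tensor_type, round(weight_memory_gib(19e9, tensor_type), 1), "GiB")
```

At F16 this comes to roughly 35 GiB (about 38 GB) for the weights, which is why loading in the checkpoint's native precision rather than 32-bit floats matters on a single GPU.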