---
base_model:
- unsloth/orpheus-3b-0.1-ft-unsloth-bnb-4bit
- canopylabs/orpheus-3b-0.1-ft
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- gguf
- tts
license: apache-2.0
language:
- en
datasets:
- Jinsaryko/Ceylia
---
# Introduction
VT-Orpheus-3B-TTS-lora-adapter is a LoRA adapter fine-tuned from [Orpheus-TTS](https://github.com/canopyai/Orpheus-TTS).
It was trained on the [Jinsaryko/Ceylia](https://huggingface.co/datasets/Jinsaryko/Ceylia) dataset.
# Sample Audio
See the [setup guide](https://huggingface.co/vinhnx90/VT-Orpheus-3B-TTS-Ceylia-Q4KM-GGUFF#running-locally) for how to run the Orpheus model locally with this LoRA adapter.
```bash
python gguf_orpheus.py --text "Seriously? <giggle> That's the cutest thing I've ever heard!" --voice ceylia
```
<audio controls><source src="https://huggingface.co/vinhnx90/VT-Orpheus-3B-TTS-Ceylia-Q4KM-GGUFF/resolve/main/output.wav" type="audio/wav"></audio>
```bash
python gguf_orpheus.py --text "Hi! I'm Ceylia. <laugh> This is so exciting! <giggle>" --voice ceylia
```
<audio controls><source src="https://huggingface.co/vinhnx90/VT-Orpheus-3B-TTS-Ceylia-Q4KM-GGUFF/resolve/main/ceylia_20250409_010117.wav" type="audio/wav"></audio>
```bash
python gguf_orpheus.py --text "Morning! <giggle> I finally finished that project last night. It took forever, but the results look amazing. <yawn> Sorry, still a bit tired from staying up so late." --voice ceylia
```
<audio controls><source src="https://huggingface.co/vinhnx90/VT-Orpheus-3B-TTS-Ceylia-Q4KM-GGUFF/resolve/main/ceylia_20250409_013043.wav" type="audio/wav"></audio>
# Running Locally
This section walks through running the `VT-Orpheus-3B-TTS-Ceylia.Q4_K_M.gguf` model on your own machine. There are two main methods:
## Method 1: Using LM Studio (Recommended for beginners)
### Prerequisites
1. [LM Studio](https://lmstudio.ai/) installed on your computer
2. Python 3.8+ installed
3. The `VT-Orpheus-3B-TTS-Ceylia.Q4_K_M.gguf` model file
### Setup Steps
1. **Install LM Studio**
- Download and install LM Studio from [lmstudio.ai](https://lmstudio.ai/)
- Launch LM Studio
2. **Load the GGUF model**
- In LM Studio, click "Add Model"
- Select the `VT-Orpheus-3B-TTS-Ceylia.Q4_K_M.gguf` file from your computer
- Once added, click on the model to load it
3. **Start the local server**
- Go to the "Local Server" tab in LM Studio
- Click "Start Server" to launch the local API server (default address is `http://127.0.0.1:1234`)
4. **Clone orpheus-tts-local repository**
```bash
git clone https://github.com/isaiahbjork/orpheus-tts-local.git
cd orpheus-tts-local
```
5. **Install dependencies**
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```
5.1 **Edit `gguf_orpheus.py` to add the `ceylia` voice**
Open `gguf_orpheus.py` in the `orpheus-tts-local` directory, find the `AVAILABLE_VOICES` and `DEFAULT_VOICE` lines, and add `ceylia` to the voice list. The default voice is `tara`; change it to `ceylia` as shown.
```python
# Available voices based on the Orpheus-TTS repository
AVAILABLE_VOICES = ["tara", "leah", "jess", "leo", "dan", "mia", "zac", "zoe", "ceylia"]
DEFAULT_VOICE = "ceylia"
```
Save the file `gguf_orpheus.py`.
6. **Run the model**
```bash
python gguf_orpheus.py --text "Hi! I'm Ceylia. <laugh> This is so exciting! <giggle>" --voice ceylia --output output.wav
```
### Available Parameters
- `--text`: The text to convert to speech (required)
- `--voice`: The voice to use (default: "tara" in the unmodified script; use "ceylia" for this model)
- `--output`: Output WAV file path (default: auto-generated filename)
- `--temperature`: Temperature for generation (default: 0.6)
- `--top_p`: Top-p sampling parameter (default: 0.9)
- `--repetition_penalty`: Repetition penalty (default: 1.1)
- `--backend`: Specify the backend (default: "lmstudio", also supports "ollama")
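If you call the script from other Python code, the flags above can be assembled programmatically. The following is a minimal sketch, assuming only the flags and defaults listed above; `build_command` is a hypothetical helper and not part of `orpheus-tts-local`.

```python
import shlex
import sys


def build_command(text, voice="ceylia", output=None, temperature=0.6,
                  top_p=0.9, repetition_penalty=1.1, script="gguf_orpheus.py"):
    """Return the argument list for one gguf_orpheus.py invocation.

    Hypothetical helper: flag names and default values are copied from the
    parameter list in this README, not read from the script itself.
    """
    cmd = [
        sys.executable, script,
        "--text", text,
        "--voice", voice,
        "--temperature", str(temperature),
        "--top_p", str(top_p),
        "--repetition_penalty", str(repetition_penalty),
    ]
    # Leave --output off so the script can pick its auto-generated filename.
    if output is not None:
        cmd += ["--output", output]
    return cmd


if __name__ == "__main__":
    cmd = build_command("Hi! I'm Ceylia. <laugh> This is so exciting!",
                        output="output.wav")
    # Print a shell-quoted version for inspection.
    print(shlex.join(cmd))
    # To actually run it from the orpheus-tts-local directory:
    #   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell-quoting problems with text that contains apostrophes or angle-bracket tags.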
## Method 2: Using llama.cpp directly
### Prerequisites
1. [llama.cpp](https://github.com/ggerganov/llama.cpp) installed and built on your system
2. The [VT-Orpheus-3B-TTS-Ceylia.Q4_K_M.gguf](https://huggingface.co/vinhnx90/VT-Orpheus-3B-TTS-Ceylia-Q4KM-GGUFF/blob/main/VT-Orpheus-3B-TTS-Ceylia.Q4_K_M.gguf) model file
### Setup Steps
1. **Clone and build llama.cpp**
```bash
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release
```
2. **Start the server**
```bash
./build/bin/llama-server -m /path/to/VT-Orpheus-3B-TTS-Ceylia.Q4_K_M.gguf --port 8080
```
3. **Clone orpheus-tts-local repository**
```bash
git clone https://github.com/isaiahbjork/orpheus-tts-local.git
cd orpheus-tts-local
```
4. **Install dependencies**
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -r requirements.txt
```
5. **Run the model with custom API URL**
```bash
python gguf_orpheus.py --text "Hi! I'm Ceylia. <laugh> Let's play! <sniffle> This is so exciting! <giggle>" --voice ceylia --output output.wav --api_url http://localhost:8080/v1
```
## Emotion Tags
You can add emotion to the speech by including the following tags in your text:
- `<giggle>`
- `<laugh>`
- `<chuckle>`
- `<sigh>`
- `<cough>`
- `<sniffle>`
- `<groan>`
- `<yawn>`
- `<gasp>`
Example:
```bash
python gguf_orpheus.py --text "Hi! I'm Ceylia. <laugh> This is so exciting! <giggle>" --voice ceylia
```
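A misspelled tag such as `<laughs>` is easy to miss in a long prompt. The sketch below checks input text against the tag list above before you send it to the model. `unknown_tags` is a hypothetical helper, and treating tags as case-sensitive is an assumption made here, not documented behavior of the model.

```python
import re

# Tag names copied from the list in this README.
SUPPORTED_TAGS = {
    "giggle", "laugh", "chuckle", "sigh", "cough",
    "sniffle", "groan", "yawn", "gasp",
}

# Matches simple <word> tags; anything else in the text is ignored.
TAG_PATTERN = re.compile(r"<([A-Za-z_]+)>")


def unknown_tags(text):
    """Return tags found in `text` that are not in SUPPORTED_TAGS.

    Tags are compared exactly (case-sensitive, an assumption) and
    returned in order of appearance.
    """
    return [tag for tag in TAG_PATTERN.findall(text)
            if tag not in SUPPORTED_TAGS]


if __name__ == "__main__":
    sample = "Hi! I'm Ceylia. <laughs> This is so exciting! <giggle>"
    bad = unknown_tags(sample)
    if bad:
        print("Unrecognized tags:", ", ".join(f"<{t}>" for t in bad))
```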
## Troubleshooting
1. **Error connecting to server**: Make sure the LM Studio server or the llama.cpp server is running and listening on the expected port (`1234` by default for LM Studio, `8080` in the llama.cpp example above)
2. **Low-quality audio**: Try adjusting `--temperature` (higher = more variance) or `--repetition_penalty` (1.1 or higher recommended)
3. **Slow generation**: Use a lower-precision quantization of the model, or run on a more powerful GPU if available
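For the connection error in the first item, a quick way to confirm that something is listening is to try opening a TCP connection to the port. This is a minimal sketch using only the Python standard library; `server_is_listening` is a hypothetical helper, and the ports are the ones used in this guide. It only checks that the port accepts connections, not that the model is loaded.

```python
import socket


def server_is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Ports taken from this guide: LM Studio default and the llama.cpp example.
    for name, port in [("LM Studio", 1234), ("llama.cpp", 8080)]:
        state = "up" if server_is_listening("127.0.0.1", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```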
# Uploaded model
- **Developed by:** vinhnx90
- **License:** apache-2.0
- **Finetuned from model:** unsloth/orpheus-3b-0.1-ft-unsloth-bnb-4bit
This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)