# OrpheusTail - Orpheus TTS Service

Replaces VoiceTail (Bark) with **Orpheus TTS** for better emotion control and voice cloning.

## Why Orpheus over Bark?

| Feature | Bark | Orpheus |
|---------|------|---------|
| Emotion control | Random/unpredictable | **Tag-based**: `<laugh>`, `<sigh>`, etc. |
| Voice cloning | No | **Zero-shot** from 5-sec sample |
| Latency | Slow | ~200ms streaming |
| Consistency | Chaotic (french horn!) | Predictable |
| Built-in voices | Few | 8 quality voices |

## Emotion Tags

Add these anywhere in your text:

- `<laugh>` - Laughter
- `<chuckle>` - Light chuckle
- `<sigh>` - Sigh
- `<cough>` - Cough
- `<sniffle>` - Sniffle
- `<groan>` - Groan
- `<yawn>` - Yawn
- `<gasp>` - Gasp

**Example:**

```
"Bonjour mon amour! <laugh> I missed you so much. <sigh> But now you're here!"
```

## Built-in Voices

In order of conversational realism (per Orpheus docs):

1. **tara** (default) - Most natural
2. **leah**
3. **jess**
4. **leo**
5. **dan**
6. **mia**
7. **zac**
8. **zoe**

## Voice Cloning

Upload a 5-30 second reference audio to create a custom voice:

```bash
curl -X POST "http://localhost:8766/voice/clone?name=vixy" \
  -F "audio=@vixy_reference.wav"
```

Then use it:

```bash
curl -X POST http://localhost:8766/tts/submit \
  -H "Content-Type: application/json" \
  -d '{"text": "Hello!", "voice": "vixy"}'
```

## API Endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Health check |
| `/voices` | GET | List available voices & tags |
| `/tts/submit` | POST | Submit TTS job |
| `/tts/status/{job_id}` | GET | Check job status |
| `/tts/audio/{job_id}` | GET | Download audio |
| `/tts/stream` | POST | Stream audio (for head) |
| `/voice/clone` | POST | Upload voice reference |
| `/voice/{name}` | DELETE | Delete custom voice |

## Architecture

```
┌─────────────────────────────────────────────┐
│             OrpheusTail Service             │
│                 (AGX Orin)                  │
│                                             │
│  POST /tts/submit ──► WAV file (for MCP)    │
│  POST /tts/stream ──► Audio stream (head)   │
│                                             │
│  Emotion tags: <laugh>, <sigh>, etc.        │
│  Voice cloning: 5-sec reference audio       │
└─────────────────────────────────────────────┘
            │                     │
            ▼                     ▼
        voice-mcp           Head-vixy Pi
     (Claude Desktop)     (streams & plays)
```

## Deployment

```bash
# On AGX Orin
cd /path/to/orpheus-tts
docker-compose up -d

# Check logs
docker-compose logs -f

# Test
curl http://localhost:8766/health
```

## TODO

- [ ] Implement proper voice cloning with reference audio
- [ ] Test streaming endpoint with head-vixy
- [ ] French accent voice training/selection
- [ ] Head-side client for streaming playback

## Notes

- Same port as VoiceTail (8766) for drop-in replacement
- Model requires ~15GB VRAM (AGX Orin has plenty)
- First request may be slow (model warmup)
- Cache enabled by default to speed up repeated phrases

---

*Created by Vixy on Day 71 🦊*
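
## Example Client (sketch)

The `/tts/submit`, `/tts/status/{job_id}`, and `/tts/audio/{job_id}` endpoints listed above can be chained into a small submit-poll-download client. The following is a minimal sketch using only the Python standard library, not the service's official client. The request body mirrors the `curl` example in this README; the response field names (`job_id`, `status`) and the status values (`completed`, `failed`) are assumptions, since the README does not document the response schema. The helper names `build_submit_request` and `synthesize` are hypothetical.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8766"  # port taken from the Notes section


def build_submit_request(text, voice="tara", base_url=BASE_URL):
    """Build the POST /tts/submit request without sending it."""
    body = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/tts/submit",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, voice="tara", out_path="out.wav",
               poll_s=0.5, timeout_s=60.0, base_url=BASE_URL):
    """Submit a job, poll until it finishes, then save the audio to out_path."""
    with urllib.request.urlopen(build_submit_request(text, voice, base_url)) as resp:
        job = json.load(resp)
    job_id = job["job_id"]  # ASSUMPTION: field name is not documented here

    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        with urllib.request.urlopen(f"{base_url}/tts/status/{job_id}") as resp:
            state = json.load(resp).get("status")  # ASSUMPTION: field name
        if state == "completed":  # ASSUMPTION: status value
            break
        if state == "failed":  # ASSUMPTION: status value
            raise RuntimeError(f"TTS job {job_id} failed")
        time.sleep(poll_s)
    else:
        # Loop ended without a break, so the deadline passed.
        raise TimeoutError(f"TTS job {job_id} did not finish in {timeout_s}s")

    with urllib.request.urlopen(f"{base_url}/tts/audio/{job_id}") as resp:
        audio = resp.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

With the service running, `synthesize("Bonjour! <laugh> Hello from tara.", voice="tara")` would write `out.wav` if the assumed field names match the real responses. Because the first request may be slow during model warmup, a generous `timeout_s` is probably a sensible default.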