orpheus-tts/main.py
Alex cfc9b1a5a0 Revert to sync LLM + sentence-level streaming
AsyncLLMEngine hangs on Jetson during model loading. Reverted to sync
LLM but added fine-grained text chunking (chunk_text_fine, ~200 chars)
for the stream endpoint. Each sentence/clause generates independently,
so first audio plays after ~2-4s instead of waiting for the full text.

Not true token-level streaming, but a significant latency reduction
for multi-sentence utterances without the AsyncLLMEngine dependency.
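The chunking described above could be sketched as follows. This is a minimal, hypothetical version: `chunk_text_fine` is the name given in the message, but the exact splitting rules (sentence first, then clause punctuation, ~200-char target) are assumptions, not the actual implementation in orpheus-tts/main.py.

```python
import re


def chunk_text_fine(text: str, max_chars: int = 200) -> list[str]:
    """Split text into sentence/clause-sized chunks near max_chars.

    Hypothetical sketch: prefers sentence boundaries (. ! ?), and falls
    back to clause punctuation (, ; :) for overlong sentences. A single
    clause longer than max_chars is emitted as-is rather than hard-wrapped.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    for sentence in sentences:
        if len(sentence) <= max_chars:
            if sentence:
                chunks.append(sentence)
            continue
        # Overlong sentence: accumulate clauses up to the size budget.
        clauses = re.split(r"(?<=[,;:])\s+", sentence)
        current = ""
        for clause in clauses:
            if current and len(current) + 1 + len(clause) > max_chars:
                chunks.append(current)
                current = clause
            else:
                current = f"{current} {clause}" if current else clause
        if current:
            chunks.append(current)
    return chunks
```

Each chunk would then be sent through the sync LLM independently, so playback can begin as soon as the first chunk's audio is ready instead of after the full utterance.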

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 23:45:11 -05:00
