🎨 DreamTail
SDXL Image Generation Service for NVIDIA Jetson AGX Orin
DreamTail is a standalone FastAPI service that provides high-quality image generation using Stable Diffusion XL (SDXL), optimized for NVIDIA Jetson AGX Orin. It's designed to be used by multiple clients (Lyra, Vixy, etc.) through a simple REST API with job queue management.
Features
- ✨ SDXL (Stable Diffusion XL) for high-quality 1024x1024 image generation
- 🚀 Jetson-optimized with FP16, attention slicing, and VAE slicing
- 📋 Job queue system with async processing
- 🔄 Multi-client support (Lyra, Vixy, and more)
- 💾 Automatic cleanup (images deleted after 10 days)
- 🔍 Progress tracking via REST API
- 🏥 Health monitoring and statistics
Architecture
┌─────────────┐
│   Clients   │  (Lyra, Vixy, etc.)
└──────┬──────┘
       │ HTTP/REST
       ▼
┌──────────────────────┐
│   FastAPI Server     │
│     (Port 8765)      │
└──────┬───────────────┘
       │
┌──────▼─────┬─────────┬──────────┐
│ Job Queue  │  SDXL   │ Storage  │
│  Manager   │ Worker  │ Manager  │
└────────────┴─────────┴──────────┘
      │           │          │
      │      ┌────▼────┐     │
      │      │   GPU   │     │
      │      │ (Orin)  │     │
      │      └─────────┘     │
      ▼                      ▼
 /app/storage          /data/models
Requirements
Hardware
- NVIDIA Jetson AGX Orin (32GB or 64GB recommended)
- ~8-12GB VRAM for SDXL
- ~50GB storage for models and generated images
Software
- Docker with NVIDIA Container Runtime
- JetPack 6.0+ (L4T R36.2.0+)
Installation
1. Download SDXL Models (First Time Only)
# Download models to shared cache (takes ~30 minutes, 13GB download)
export DREAMTAIL_MODELS=/data/models
./scripts/download-models.sh
2. Build Docker Image
# Build on bigorin (AGX Orin)
./scripts/build.sh
3. Run DreamTail
# Start the service
./scripts/run.sh
The service will be available at http://bigorin:8765
API Documentation
POST /generate
Submit an image generation job.
Request:
{
  "prompt": "a serene landscape with mountains at sunset",
  "client_id": "lyra",
  "negative_prompt": "blurry, low quality, distorted",
  "params": {
    "width": 1024,
    "height": 1024,
    "num_inference_steps": 30,
    "guidance_scale": 7.5,
    "seed": 42
  }
}
Response (202 Accepted):
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "queued",
  "created_at": "2025-11-06T12:00:00Z",
  "message": "Job queued. Queue position: 0"
}
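Client code can assemble this request body programmatically. A minimal sketch (the `build_generate_request` helper and its default values are illustrative conveniences, not part of the API):

```python
# Hypothetical helper for building a /generate request body.
# Field names mirror the request schema above; the defaults are
# illustrative assumptions, not values mandated by the service.

def build_generate_request(prompt, client_id, negative_prompt=None,
                           width=1024, height=1024, steps=30,
                           guidance_scale=7.5, seed=None):
    """Assemble a JSON-serializable payload for POST /generate."""
    params = {
        "width": width,
        "height": height,
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
    }
    if seed is not None:
        params["seed"] = seed
    body = {"prompt": prompt, "client_id": client_id, "params": params}
    if negative_prompt is not None:
        body["negative_prompt"] = negative_prompt
    return body
```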
GET /status/{job_id}
Check job status and progress.
Response:
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "processing",
  "progress": 67,
  "created_at": "2025-11-06T12:00:00Z",
  "started_at": "2025-11-06T12:00:05Z",
  "completed_at": null,
  "error": null,
  "client_id": "lyra",
  "prompt": "a serene landscape..."
}
Status values: queued, processing, completed, failed
GET /result/{job_id}
Download generated image (only when status is completed).
Response: PNG image file
GET /health
Service health check.
Response:
{
  "status": "healthy",
  "version": "1.0.0",
  "model_loaded": true,
  "queue_size": 2,
  "active_jobs": 1,
  "uptime_seconds": 3600.5
}
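On a cold start the SDXL model takes a while to load, so clients may want to wait for `model_loaded` before submitting jobs. A sketch of a readiness wait; the fetch callable is injected so it works with any HTTP library (all names here are illustrative):

```python
import time

def wait_until_ready(fetch_health, timeout=300.0, interval=5.0):
    """Poll /health until model_loaded is true or the timeout expires.

    fetch_health: a zero-argument callable returning the /health JSON
    as a dict (e.g. lambda: requests.get(url).json()).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            health = fetch_health()
            if health.get("model_loaded"):
                return health
        except Exception:
            pass  # service may not be up yet; keep polling
        time.sleep(interval)
    raise TimeoutError("DreamTail did not become ready in time")
```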
GET /models
Model configuration information.
Response:
{
  "base_model": "stabilityai/stable-diffusion-xl-base-1.0",
  "refiner_model": null,
  "refiner_enabled": false,
  "device": "cuda",
  "fp16_enabled": true
}
Usage Examples
Python Client
import requests
import time

# 1. Submit generation job
response = requests.post("http://bigorin:8765/generate", json={
    "prompt": "a futuristic city at night with neon lights",
    "client_id": "lyra",
    "params": {
        "width": 1024,
        "height": 1024,
        "num_inference_steps": 30
    }
})
job = response.json()
job_id = job["job_id"]
print(f"Job submitted: {job_id}")

# 2. Poll for completion
while True:
    status = requests.get(f"http://bigorin:8765/status/{job_id}").json()
    print(f"Status: {status['status']} - Progress: {status['progress']}%")
    if status["status"] == "completed":
        break
    elif status["status"] == "failed":
        print(f"Error: {status['error']}")
        break
    time.sleep(2)

# 3. Download result
image = requests.get(f"http://bigorin:8765/result/{job_id}")
with open(f"{job_id}.png", "wb") as f:
    f.write(image.content)
print(f"Image saved: {job_id}.png")
cURL Examples
# Generate image
curl -X POST http://bigorin:8765/generate \
-H "Content-Type: application/json" \
-d '{
"prompt": "a cat wearing a wizard hat",
"client_id": "test"
}'
# Check status
curl http://bigorin:8765/status/YOUR_JOB_ID
# Download result
curl http://bigorin:8765/result/YOUR_JOB_ID -o image.png
# Health check
curl http://bigorin:8765/health
Configuration
Environment Variables
- DREAMTAIL_STORAGE - Storage directory (default: /app/storage)
- DREAMTAIL_MODELS - Models cache directory (default: /app/models)
- LOG_LEVEL - Logging level (default: INFO)
config.py Settings
Key configuration parameters in config.py:
- DEFAULT_STEPS: 30 (20-50 recommended for SDXL)
- MAX_CONCURRENT_JOBS: 1 (Orin handles 1 SDXL job at a time)
- IMAGE_RETENTION_DAYS: 10 (auto-cleanup after 10 days)
- USE_FP16: True (reduces VRAM to ~8GB)
- ENABLE_ATTENTION_SLICING: True (memory optimization)
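The real config.py in this repo is authoritative; for orientation, the settings above might look roughly like this (values taken from the list above, layout assumed):

```python
# Illustrative sketch of config.py -- consult the real file for the
# authoritative names and values.

DEFAULT_STEPS = 30                # 20-50 recommended for SDXL
MAX_CONCURRENT_JOBS = 1           # Orin handles one SDXL job at a time
IMAGE_RETENTION_DAYS = 10         # auto-cleanup after 10 days
USE_FP16 = True                   # reduces VRAM to ~8GB
ENABLE_ATTENTION_SLICING = True   # memory optimization
ENABLE_CPU_OFFLOAD = False        # set True if you hit out-of-memory errors
```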
Performance
Typical generation time on AGX Orin:
- 1024x1024, 30 steps: ~45-60 seconds
- 1024x1024, 20 steps: ~30-40 seconds (faster, slightly lower quality)
Memory usage:
- SDXL with FP16: ~8GB VRAM
- Peak with attention slicing: ~10GB VRAM
Maintenance
View Logs
docker logs -f dreamtail
Check Storage
curl http://bigorin:8765/storage
Response:
{
  "total_images": 42,
  "total_size_mb": 156.3,
  "storage_path": "/app/storage/images",
  "retention_days": 10
}
Manual Cleanup
Images are automatically deleted after 10 days. To manually clean up:
docker exec dreamtail rm -rf /app/storage/images/*
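For reference, the periodic cleanup (cleanup_task.py in the project layout) presumably amounts to deleting images older than the retention window. A minimal sketch of that logic; the function name and the assumption that images are stored as flat PNG files are illustrative:

```python
import time
from pathlib import Path

def cleanup_old_images(image_dir, retention_days=10):
    """Delete PNG files older than retention_days; return the paths removed."""
    cutoff = time.time() - retention_days * 86400
    removed = []
    for path in Path(image_dir).glob("*.png"):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(str(path))
    return removed
```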
Restart Service
docker restart dreamtail
Stop Service
docker stop dreamtail
Troubleshooting
Model not loading
Symptom: "model_loaded": false in /health
Solutions:
- Check VRAM: nvidia-smi (need ~10GB free)
- Check logs: docker logs dreamtail
- Re-download models: ./scripts/download-models.sh
Out of memory errors
Solutions:
- Reduce concurrent jobs to 1 (default)
- Enable CPU offload: set ENABLE_CPU_OFFLOAD = True in config.py
- Reduce image size: use 768x768 or 512x512
Slow generation
Expected: 45-60 seconds for 1024x1024 @ 30 steps
To speed up:
- Reduce steps to 20-25 (minor quality loss)
- Use smaller resolution (768x768)
- Ensure GPU isn't thermal throttling
Integration with Lyra
DreamTail is designed to be used by Lyra but runs independently (no NATS integration). Lyra can call DreamTail via HTTP:
# In Lyra's code
async def generate_image_for_user(prompt: str):
    response = await http_client.post(
        "http://bigorin:8765/generate",
        json={"prompt": prompt, "client_id": "lyra"}
    )
    job_id = response.json()["job_id"]
    # Poll until complete...
    # Return image to user
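The polling step elided above can be factored into a small helper. A sketch with an injected get_status callable so it is independent of the HTTP client Lyra uses (all names here are illustrative):

```python
import time

def poll_job(get_status, job_id, interval=2.0, timeout=600.0):
    """Poll until the job completes or fails; return the final status dict.

    get_status: callable taking a job_id and returning the /status JSON
    as a dict (e.g. lambda jid: requests.get(f"{base}/status/{jid}").json()).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status(job_id)
        if status["status"] in ("completed", "failed"):
            return status
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```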
Project Structure
dreamtail/
├── Dockerfile # Jetson-optimized container
├── requirements.txt # Python dependencies
├── config.py # Configuration
├── main.py # FastAPI app + worker
├── api/
│ ├── models.py # Pydantic schemas
│ └── routes.py # API endpoints
├── worker/
│ ├── generator.py # SDXL pipeline
│ └── queue_manager.py # Job queue
├── storage/
│ ├── file_manager.py # Image storage
│ └── cleanup_task.py # Periodic cleanup
└── scripts/
├── build.sh # Build Docker image
├── run.sh # Run container
└── download-models.sh # Download SDXL
License
This project is part of the Lyra ecosystem. For internal use.
Support
For issues or questions:
- Check logs: docker logs -f dreamtail
- Check health: curl http://bigorin:8765/health
- Review configuration in config.py
Built with ❤️ for the Lyra project