# 🎨 DreamTail

**SDXL Image Generation Service for NVIDIA Jetson AGX Orin**

DreamTail is a standalone FastAPI service that provides high-quality image generation using Stable Diffusion XL (SDXL), optimized for the NVIDIA Jetson AGX Orin. It is designed to be used by multiple clients (Lyra, Vixy, etc.) through a simple REST API with job queue management.

## Features

- ✨ **SDXL (Stable Diffusion XL)** for high-quality 1024x1024 image generation
- 🚀 **Jetson-optimized** with FP16, attention slicing, and VAE slicing
- 📋 **Job queue system** with async processing
- 🔄 **Multi-client support** (Lyra, Vixy, and more)
- 💾 **Automatic cleanup** (images deleted after 10 days)
- 🔍 **Progress tracking** via REST API
- 🏥 **Health monitoring** and statistics

## Architecture

```
┌─────────────┐
│   Clients   │  (Lyra, Vixy, etc.)
└──────┬──────┘
       │ HTTP/REST
       ▼
┌──────────────────────┐
│   FastAPI Server     │
│     (Port 8765)      │
└──────┬───────────────┘
       │
┌──────▼─────┬─────────┬──────────┐
│ Job Queue  │  SDXL   │ Storage  │
│  Manager   │ Worker  │ Manager  │
└────────────┴─────────┴──────────┘
      │           │          │
      │      ┌────▼────┐     │
      │      │   GPU   │     │
      │      │ (Orin)  │     │
      │      └─────────┘     │
      ▼                      ▼
 /app/storage           /data/models
```

## Requirements

### Hardware

- **NVIDIA Jetson AGX Orin** (32GB or 64GB recommended)
- ~8-12GB VRAM for SDXL
- ~50GB storage for models and generated images

### Software

- Docker with NVIDIA Container Runtime
- JetPack 6.0+ (L4T R36.2.0+)

## Installation

### 1. Download SDXL Models (First Time Only)

```bash
# Download models to the shared cache (takes ~30 minutes, 13GB download)
export DREAMTAIL_MODELS=/data/models
./scripts/download-models.sh
```

### 2. Build Docker Image

```bash
# Build on bigorin (AGX Orin)
./scripts/build.sh
```

### 3. Run DreamTail

```bash
# Start the service
./scripts/run.sh
```

The service will be available at `http://bigorin:8765`.

## API Documentation

### POST /generate

Submit an image generation job.
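Internally, the request body for this endpoint can be validated with Pydantic (the project structure lists Pydantic schemas in `api/models.py`). A minimal sketch; the class names and validation bounds here are assumptions, while the field names and defaults follow the parameters documented in this README:

```python
from typing import Optional
from pydantic import BaseModel, Field

class GenerationParams(BaseModel):
    """Optional generation parameters; defaults mirror the documented values."""
    width: int = 1024
    height: int = 1024
    num_inference_steps: int = Field(30, ge=1, le=100)  # bounds are an assumption
    guidance_scale: float = 7.5
    seed: Optional[int] = None  # None -> random seed

class GenerateRequest(BaseModel):
    """Body of POST /generate."""
    prompt: str
    client_id: str
    negative_prompt: str = ""
    params: GenerationParams = GenerationParams()
```

With a model like this, omitted fields fall back to defaults, e.g. `GenerateRequest(prompt="a cat", client_id="test")` yields 1024x1024 at 30 steps.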
**Request:**

```json
{
  "prompt": "a serene landscape with mountains at sunset",
  "client_id": "lyra",
  "negative_prompt": "blurry, low quality, distorted",
  "params": {
    "width": 1024,
    "height": 1024,
    "num_inference_steps": 30,
    "guidance_scale": 7.5,
    "seed": 42
  }
}
```

**Response (202 Accepted):**

```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "queued",
  "created_at": "2025-11-06T12:00:00Z",
  "message": "Job queued. Queue position: 0"
}
```

### GET /status/{job_id}

Check job status and progress.

**Response:**

```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "processing",
  "progress": 67,
  "created_at": "2025-11-06T12:00:00Z",
  "started_at": "2025-11-06T12:00:05Z",
  "completed_at": null,
  "error": null,
  "client_id": "lyra",
  "prompt": "a serene landscape..."
}
```

**Status values:** `queued`, `processing`, `completed`, `failed`

### GET /result/{job_id}

Download the generated image (only when status is `completed`).

**Response:** PNG image file

### GET /health

Service health check.

**Response:**

```json
{
  "status": "healthy",
  "version": "1.0.0",
  "model_loaded": true,
  "queue_size": 2,
  "active_jobs": 1,
  "uptime_seconds": 3600.5
}
```

### GET /models

Model configuration information.

**Response:**

```json
{
  "base_model": "stabilityai/stable-diffusion-xl-base-1.0",
  "refiner_model": null,
  "refiner_enabled": false,
  "device": "cuda",
  "fp16_enabled": true
}
```

## Usage Examples

### Python Client

```python
import requests
import time

# 1. Submit generation job
response = requests.post("http://bigorin:8765/generate", json={
    "prompt": "a futuristic city at night with neon lights",
    "client_id": "lyra",
    "params": {
        "width": 1024,
        "height": 1024,
        "num_inference_steps": 30
    }
})
job = response.json()
job_id = job["job_id"]
print(f"Job submitted: {job_id}")

# 2. Poll for completion
while True:
    status = requests.get(f"http://bigorin:8765/status/{job_id}").json()
    print(f"Status: {status['status']} - Progress: {status['progress']}%")
    if status["status"] == "completed":
        break
    elif status["status"] == "failed":
        print(f"Error: {status['error']}")
        break
    time.sleep(2)

# 3. Download result
image = requests.get(f"http://bigorin:8765/result/{job_id}")
with open(f"{job_id}.png", "wb") as f:
    f.write(image.content)
print(f"Image saved: {job_id}.png")
```

### cURL Examples

```bash
# Generate image
curl -X POST http://bigorin:8765/generate \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a cat wearing a wizard hat",
    "client_id": "test"
  }'

# Check status
curl http://bigorin:8765/status/YOUR_JOB_ID

# Download result
curl http://bigorin:8765/result/YOUR_JOB_ID -o image.png

# Health check
curl http://bigorin:8765/health
```

## Configuration

### Environment Variables

- `DREAMTAIL_STORAGE` - Storage directory (default: `/app/storage`)
- `DREAMTAIL_MODELS` - Models cache directory (default: `/app/models`)
- `LOG_LEVEL` - Logging level (default: `INFO`)

### config.py Settings

Key configuration parameters in `config.py`:

- `DEFAULT_STEPS`: 30 (20-50 recommended for SDXL)
- `MAX_CONCURRENT_JOBS`: 1 (Orin handles one SDXL job at a time)
- `IMAGE_RETENTION_DAYS`: 10 (auto-cleanup after 10 days)
- `USE_FP16`: True (reduces VRAM to ~8GB)
- `ENABLE_ATTENTION_SLICING`: True (memory optimization)

## Performance

**Typical generation time on AGX Orin:**

- 1024x1024, 30 steps: **~45-60 seconds**
- 1024x1024, 20 steps: **~30-40 seconds** (faster, slightly lower quality)

**Memory usage:**

- SDXL with FP16: ~8GB VRAM
- Peak with attention slicing: ~10GB VRAM

## Maintenance

### View Logs

```bash
docker logs -f dreamtail
```

### Check Storage

```bash
curl http://bigorin:8765/storage
```

**Response:**

```json
{
  "total_images": 42,
  "total_size_mb": 156.3,
  "storage_path": "/app/storage/images",
  "retention_days": 10
}
```

### Manual Cleanup

Images are automatically deleted after 10 days. To manually clean up:

```bash
docker exec dreamtail rm -rf /app/storage/images/*
```

### Restart Service

```bash
docker restart dreamtail
```

### Stop Service

```bash
docker stop dreamtail
```

## Troubleshooting

### Model not loading

**Symptom:** `"model_loaded": false` in `/health`

**Solutions:**

1. Check VRAM: `nvidia-smi` (need ~10GB free)
2. Check logs: `docker logs dreamtail`
3. Re-download models: `./scripts/download-models.sh`

### Out of memory errors

**Solutions:**

1. Reduce concurrent jobs to 1 (the default)
2. Enable CPU offload: set `ENABLE_CPU_OFFLOAD=True` in `config.py`
3. Reduce image size: use 768x768 or 512x512

### Slow generation

**Expected:** 45-60 seconds for 1024x1024 @ 30 steps

**To speed up:**

- Reduce steps to 20-25 (minor quality loss)
- Use a smaller resolution (768x768)
- Ensure the GPU isn't thermal throttling

## Integration with Lyra

DreamTail is designed to be used by Lyra but runs independently (no NATS integration). Lyra can call DreamTail via HTTP:

```python
# In Lyra's code
async def generate_image_for_user(prompt: str):
    response = await http_client.post(
        "http://bigorin:8765/generate",
        json={"prompt": prompt, "client_id": "lyra"}
    )
    job_id = response.json()["job_id"]
    # Poll until complete...
    # Return image to user
```

## Project Structure

```
dreamtail/
├── Dockerfile              # Jetson-optimized container
├── requirements.txt        # Python dependencies
├── config.py               # Configuration
├── main.py                 # FastAPI app + worker
├── api/
│   ├── models.py           # Pydantic schemas
│   └── routes.py           # API endpoints
├── worker/
│   ├── generator.py        # SDXL pipeline
│   └── queue_manager.py    # Job queue
├── storage/
│   ├── file_manager.py     # Image storage
│   └── cleanup_task.py     # Periodic cleanup
└── scripts/
    ├── build.sh            # Build Docker image
    ├── run.sh              # Run container
    └── download-models.sh  # Download SDXL
```

## License

This project is part of the Lyra ecosystem. For internal use.
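The `# Poll until complete...` step left open in the Lyra integration snippet can be factored into a small, reusable helper. A minimal sketch using only the standard library; the function name, timeout, and interval defaults are assumptions (not part of DreamTail), while the status dict shape follows the documented `GET /status/{job_id}` response:

```python
import time
from typing import Callable, Dict

def wait_for_job(get_status: Callable[[], Dict],
                 poll_interval: float = 2.0,
                 timeout: float = 300.0) -> Dict:
    """Poll a job-status callable until the job completes or fails.

    `get_status` should return a dict shaped like DreamTail's
    GET /status/{job_id} response. Raises RuntimeError on a failed
    job and TimeoutError if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status["status"] == "completed":
            return status
        if status["status"] == "failed":
            raise RuntimeError(f"Generation failed: {status.get('error')}")
        time.sleep(poll_interval)
    raise TimeoutError("DreamTail job did not finish in time")
```

A caller would pass something like `lambda: requests.get(f"http://bigorin:8765/status/{job_id}").json()`; taking a callable instead of a URL keeps the loop independent of the HTTP client (requests, httpx, etc.).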
## Support

For issues or questions:

- Check logs: `docker logs -f dreamtail`
- Check health: `curl http://bigorin:8765/health`
- Review the configuration in `config.py`

---

**Built with ❤️ for the Lyra project**