DreamTail v1.0.0 with IP-Adapter FaceID support
- SDXL image generation using RealVisXL_V4.0
- IP-Adapter FaceID integration for consistent face generation
- Simplified API (removed client_id requirement)
- New params: face_image, face_strength
- 'vixy' shortcut for face-locked generation
- Queue-based async job processing
- FastAPI with proper error handling

Co-authored-by: Alex <alex@k4zka.online>
.gitignore (vendored, 34 lines)
@@ -0,0 +1,34 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
venv/
env/
.venv/

# IDE
.idea/
.vscode/
*.swp
*.swo

# OS
.DS_Store
Thumbs.db

# DreamTail specific
dreamtail_storage/images/
*.png
*.jpg
*.jpeg

# Models (too large for git)
models/
*.bin
*.safetensors
*.ckpt

# Logs
*.log
Dockerfile (executable, 48 lines)
@@ -0,0 +1,48 @@
# DreamTail - SDXL Image Generation Service for NVIDIA Jetson AGX Orin
# Based on NVIDIA L4T PyTorch container optimized for Jetson

# Try the jetson-containers format (alternative: nvcr.io/nvidia/l4t-pytorch:r35.2.1-pth2.0-py3)
FROM dustynv/pytorch:2.1-r36.2.0

# Set working directory
WORKDIR /app

# Install system dependencies (curl is needed by the HEALTHCHECK below)
RUN apt-get update && apt-get install -y \
    git \
    wget \
    curl \
    libgl1-mesa-glx \
    libglib2.0-0 \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first for better caching
COPY requirements.txt /app/

# Install Python dependencies
# Note: torch and torchvision are already in the base image
RUN pip3 install --no-cache-dir -r requirements.txt

# Copy application code
COPY config.py /app/
COPY main.py /app/
COPY api/ /app/api/
COPY worker/ /app/worker/
COPY dreamtail_storage/ /app/dreamtail_storage/

# Create storage directories
RUN mkdir -p /app/storage/images /app/models

# Expose API port
EXPOSE 8765

# Set environment variables
ENV PYTHONUNBUFFERED=1
ENV DREAMTAIL_STORAGE=/app/storage
ENV DREAMTAIL_MODELS=/app/models

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
    CMD curl -f http://localhost:8765/health || exit 1

# Run the FastAPI application
CMD ["python3", "main.py"]
README.md (executable, 386 lines)
@@ -0,0 +1,386 @@
# 🎨 DreamTail

**SDXL Image Generation Service for NVIDIA Jetson AGX Orin**

DreamTail is a standalone FastAPI service that provides high-quality image generation using Stable Diffusion XL (SDXL), optimized for the NVIDIA Jetson AGX Orin. It is designed to serve multiple clients (Lyra, Vixy, etc.) through a simple REST API with job queue management.

## Features

- ✨ **SDXL (Stable Diffusion XL)** for high-quality 1024x1024 image generation
- 🚀 **Jetson-optimized** with FP16, attention slicing, and VAE slicing
- 📋 **Job queue system** with async processing
- 🔄 **Multi-client support** (Lyra, Vixy, and more)
- 💾 **Automatic cleanup** (images deleted after 10 days)
- 🔍 **Progress tracking** via REST API
- 🏥 **Health monitoring** and statistics
## Architecture

```
        ┌─────────────┐
        │   Clients   │  (Lyra, Vixy, etc.)
        └──────┬──────┘
               │ HTTP/REST
               ▼
   ┌──────────────────────┐
   │   FastAPI Server     │
   │   (Port 8765)        │
   └──────┬───────────────┘
          │
   ┌──────▼─────┬─────────┬──────────┐
   │ Job Queue  │  SDXL   │ Storage  │
   │  Manager   │ Worker  │ Manager  │
   └─────┬──────┴────┬────┴────┬─────┘
         │           │         │
         │      ┌────▼────┐    │
         │      │   GPU   │    │
         │      │ (Orin)  │    │
         │      └─────────┘    │
         ▼                     ▼
   /app/storage          /data/models
```
## Requirements

### Hardware
- **NVIDIA Jetson AGX Orin** (32GB or 64GB recommended)
- ~8-12GB VRAM for SDXL
- ~50GB storage for models and generated images

### Software
- Docker with NVIDIA Container Runtime
- JetPack 6.0+ (L4T R36.2.0+)
## Installation

### 1. Download SDXL Models (First Time Only)

```bash
# Download models to the shared cache (takes ~30 minutes, 13GB download)
export DREAMTAIL_MODELS=/data/models
./scripts/download-models.sh
```

### 2. Build Docker Image

```bash
# Build on bigorin (AGX Orin)
./scripts/build.sh
```

### 3. Run DreamTail

```bash
# Start the service
./scripts/run.sh
```

The service will be available at `http://bigorin:8765`.
## API Documentation

### POST /generate

Submit an image generation job.

**Request:**
```json
{
  "prompt": "a serene landscape with mountains at sunset",
  "negative_prompt": "blurry, low quality, distorted",
  "params": {
    "width": 1024,
    "height": 1024,
    "num_inference_steps": 30,
    "guidance_scale": 7.5,
    "seed": 42
  }
}
```

**Response (202 Accepted):**
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "queued",
  "created_at": "2025-11-06T12:00:00Z",
  "message": "Job queued. Queue position: 0"
}
```

### GET /status/{job_id}

Check job status and progress.

**Response:**
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "processing",
  "progress": 67,
  "created_at": "2025-11-06T12:00:00Z",
  "started_at": "2025-11-06T12:00:05Z",
  "completed_at": null,
  "error": null,
  "prompt": "a serene landscape..."
}
```

**Status values:** `queued`, `processing`, `completed`, `failed`

### GET /result/{job_id}

Download the generated image (only when status is `completed`).

**Response:** PNG image file

### GET /health

Service health check.

**Response:**
```json
{
  "status": "healthy",
  "version": "1.0.0",
  "model_loaded": true,
  "queue_size": 2,
  "active_jobs": 1,
  "uptime_seconds": 3600.5
}
```

### GET /models

Model configuration information.

**Response:**
```json
{
  "base_model": "SG161222/RealVisXL_V4.0",
  "refiner_model": null,
  "refiner_enabled": false,
  "device": "cuda",
  "fp16_enabled": true
}
```
## Usage Examples

### Python Client

```python
import requests
import time

# 1. Submit generation job
response = requests.post("http://bigorin:8765/generate", json={
    "prompt": "a futuristic city at night with neon lights",
    "params": {
        "width": 1024,
        "height": 1024,
        "num_inference_steps": 30
    }
})

job = response.json()
job_id = job["job_id"]
print(f"Job submitted: {job_id}")

# 2. Poll for completion
while True:
    status = requests.get(f"http://bigorin:8765/status/{job_id}").json()
    print(f"Status: {status['status']} - Progress: {status['progress']}%")

    if status["status"] == "completed":
        break
    elif status["status"] == "failed":
        print(f"Error: {status['error']}")
        break

    time.sleep(2)

# 3. Download result
image = requests.get(f"http://bigorin:8765/result/{job_id}")
with open(f"{job_id}.png", "wb") as f:
    f.write(image.content)
print(f"Image saved: {job_id}.png")
```

### cURL Examples

```bash
# Generate image
curl -X POST http://bigorin:8765/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "a cat wearing a wizard hat"}'

# Check status
curl http://bigorin:8765/status/YOUR_JOB_ID

# Download result
curl http://bigorin:8765/result/YOUR_JOB_ID -o image.png

# Health check
curl http://bigorin:8765/health
```
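The polling loop in the Python client can be wrapped in a small helper with a timeout. A minimal sketch; the `wait_for_job` name and the injectable `get_status`/`sleep` callables are illustrative, not part of the service API:

```python
import time

def wait_for_job(get_status, timeout=300.0, poll=2.0, sleep=time.sleep):
    """Poll get_status() until the job is completed or failed.

    get_status should return a dict shaped like GET /status/{job_id}.
    Raises TimeoutError if the job does not finish within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status["status"] in ("completed", "failed"):
            return status
        sleep(poll)
    raise TimeoutError("job did not finish in time")
```

With `requests`, `get_status` could be `lambda: requests.get(f"http://bigorin:8765/status/{job_id}").json()`.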
## Configuration

### Environment Variables

- `DREAMTAIL_STORAGE` - Storage directory (default: `/app/storage`)
- `DREAMTAIL_MODELS` - Models cache directory (default: `/app/models`)
- `LOG_LEVEL` - Logging level (default: `INFO`)

### config.py Settings

Key configuration parameters in `config.py`:

- `DEFAULT_STEPS`: 30 (20-50 recommended for SDXL)
- `MAX_CONCURRENT_JOBS`: 1 (Orin handles 1 SDXL job at a time)
- `IMAGE_RETENTION_DAYS`: 10 (auto-cleanup after 10 days)
- `USE_FP16`: True (reduces VRAM to ~8GB)
- `ENABLE_ATTENTION_SLICING`: True (memory optimization)
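The environment-variable lookup mirrors what `config.py` does at import time: the variable wins if set, otherwise the container default applies. A dependency-free sketch of that resolution rule (the `resolve_dir` helper is illustrative):

```python
import os
from pathlib import Path

def resolve_dir(var, default, env=None):
    """Return the directory from env[var] if set, else the default path."""
    env = os.environ if env is None else env
    return Path(env.get(var, default))

# Unset variable falls back to the container default
storage = resolve_dir("DREAMTAIL_STORAGE", "/app/storage", env={})
# A set variable overrides it, e.g. pointing at the shared model cache
models = resolve_dir("DREAMTAIL_MODELS", "/app/models",
                     env={"DREAMTAIL_MODELS": "/data/models"})
```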
## Performance

**Typical generation time on AGX Orin:**
- 1024x1024, 30 steps: **~45-60 seconds**
- 1024x1024, 20 steps: **~30-40 seconds** (faster, slightly lower quality)

**Memory usage:**
- SDXL with FP16: ~8GB VRAM
- Peak with attention slicing: ~10GB VRAM
## Maintenance

### View Logs

```bash
docker logs -f dreamtail
```

### Check Storage

```bash
curl http://bigorin:8765/storage
```

**Response:**
```json
{
  "total_images": 42,
  "total_size_mb": 156.3,
  "storage_path": "/app/storage/images",
  "retention_days": 10
}
```

### Manual Cleanup

Images are automatically deleted after 10 days. To manually clean up:

```bash
docker exec dreamtail rm -rf /app/storage/images/*
```

### Restart Service

```bash
docker restart dreamtail
```

### Stop Service

```bash
docker stop dreamtail
```
## Troubleshooting

### Model not loading

**Symptom:** `"model_loaded": false` in `/health`

**Solutions:**
1. Check VRAM: `nvidia-smi` (or `tegrastats` on Jetson); you need ~10GB free
2. Check logs: `docker logs dreamtail`
3. Re-download models: `./scripts/download-models.sh`

### Out of memory errors

**Solutions:**
1. Reduce concurrent jobs to 1 (default)
2. Enable CPU offload: set `ENABLE_CPU_OFFLOAD = True` in `config.py`
3. Reduce image size: use 768x768 or 512x512

### Slow generation

**Expected:** 45-60 seconds for 1024x1024 @ 30 steps

**To speed up:**
- Reduce steps to 20-25 (minor quality loss)
- Use a smaller resolution (768x768)
- Ensure the GPU isn't thermal throttling
## Integration with Lyra

DreamTail is designed to be used by Lyra but runs independently (no NATS integration). Lyra can call DreamTail over HTTP:

```python
# In Lyra's code (sketch using httpx.AsyncClient as the HTTP client)
import httpx

async def generate_image_for_user(prompt: str) -> str:
    async with httpx.AsyncClient() as http_client:
        response = await http_client.post(
            "http://bigorin:8765/generate",
            json={"prompt": prompt},
        )
        job_id = response.json()["job_id"]

    # Poll /status/{job_id} until complete...
    # Return image to user
    return job_id
```
## Project Structure

```
dreamtail/
├── Dockerfile                  # Jetson-optimized container
├── requirements.txt            # Python dependencies
├── config.py                   # Configuration
├── main.py                     # FastAPI app + worker
├── api/
│   ├── models.py               # Pydantic schemas
│   └── routes.py               # API endpoints
├── worker/
│   ├── generator.py            # SDXL pipeline
│   └── queue_manager.py        # Job queue
├── dreamtail_storage/
│   ├── file_manager.py         # Image storage
│   └── cleanup_task.py         # Periodic cleanup
└── scripts/
    ├── build.sh                # Build Docker image
    ├── run.sh                  # Run container
    └── download-models.sh      # Download SDXL
```
## License

This project is part of the Lyra ecosystem. For internal use.

## Support

For issues or questions:
- Check logs: `docker logs -f dreamtail`
- Check health: `curl http://bigorin:8765/health`
- Review configuration in `config.py`

---

**Built with ❤️ for the Lyra project**
api/__init__.py (executable, 1 line)
@@ -0,0 +1 @@
"""API modules for DreamTail."""
api/models.py (executable, 73 lines)
@@ -0,0 +1,73 @@
"""
Pydantic models for API requests and responses.
"""

from typing import Optional, Dict, Any, Literal
from pydantic import BaseModel, Field, validator
from datetime import datetime
import config


class GenerationParams(BaseModel):
    """Optional generation parameters."""
    width: int = Field(default=config.DEFAULT_WIDTH, ge=512, le=2048)
    height: int = Field(default=config.DEFAULT_HEIGHT, ge=512, le=2048)
    num_inference_steps: int = Field(default=config.DEFAULT_STEPS, ge=config.MIN_STEPS, le=config.MAX_STEPS)
    guidance_scale: float = Field(default=config.DEFAULT_GUIDANCE_SCALE, ge=config.MIN_GUIDANCE, le=config.MAX_GUIDANCE)
    seed: Optional[int] = Field(default=None, description="Random seed for reproducibility")
    face_image: Optional[str] = Field(default=None, description="Face reference image name (from faces directory) or 'vixy' for default")
    face_strength: float = Field(default=config.DEFAULT_FACE_STRENGTH, ge=0.0, le=1.0, description="Face conditioning strength (0.0-1.0)")

    @validator('width', 'height')
    def must_be_multiple_of_8(cls, v):
        if v % 8 != 0:
            raise ValueError('Width and height must be multiples of 8')
        return v


class GenerateRequest(BaseModel):
    """Request to generate an image."""
    prompt: str = Field(..., min_length=1, max_length=2000, description="Text prompt for image generation")
    negative_prompt: Optional[str] = Field(default=None, max_length=2000, description="Negative prompt to avoid certain features")
    params: Optional[GenerationParams] = Field(default_factory=GenerationParams)


class JobResponse(BaseModel):
    """Response when submitting a generation job."""
    job_id: str = Field(..., description="Unique job identifier")
    status: Literal["queued", "processing", "completed", "failed"] = Field(..., description="Current job status")
    created_at: datetime = Field(..., description="Job creation timestamp")
    message: Optional[str] = Field(default=None, description="Optional message")


class JobStatus(BaseModel):
    """Detailed job status information."""
    job_id: str
    status: Literal["queued", "processing", "completed", "failed"]
    progress: int = Field(..., ge=0, le=100, description="Progress percentage (0-100)")
    created_at: datetime
    started_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None
    error: Optional[str] = None
    prompt: str


class HealthResponse(BaseModel):
    """Health check response."""
    model_config = {"protected_namespaces": ()}  # Allow "model_" prefix

    status: Literal["healthy", "unhealthy"]
    version: str
    model_loaded: bool
    queue_size: int
    active_jobs: int
    uptime_seconds: float


class ModelsResponse(BaseModel):
    """Available models information."""
    base_model: str
    refiner_model: Optional[str] = None
    refiner_enabled: bool
    device: str
    fp16_enabled: bool
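The `must_be_multiple_of_8` validator combines with the `Field` bounds (`ge=512, le=2048`) to gate image dimensions. The same rule as a plain function, with no pydantic dependency; the `check_dimension` name is illustrative:

```python
def check_dimension(v, lo=512, hi=2048):
    """Validate one image dimension the way GenerationParams does."""
    if not lo <= v <= hi:
        # Mirrors the Field(ge=..., le=...) bounds
        raise ValueError(f"dimension {v} outside [{lo}, {hi}]")
    if v % 8 != 0:
        # Mirrors the must_be_multiple_of_8 validator
        raise ValueError("Width and height must be multiples of 8")
    return v
```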
api/routes.py (executable, 166 lines)
@@ -0,0 +1,166 @@
"""
API Routes

FastAPI routes for image generation service.
"""

import logging
from fastapi import APIRouter, HTTPException, Response
from fastapi.responses import FileResponse
import asyncio

from api.models import (
    GenerateRequest, JobResponse, JobStatus,
    HealthResponse, ModelsResponse
)
from worker.queue_manager import queue_manager
from worker.generator import generator
from dreamtail_storage.file_manager import file_manager
import config

logger = logging.getLogger(__name__)

router = APIRouter()


@router.post("/generate", response_model=JobResponse, status_code=202)
async def generate_image(request: GenerateRequest):
    """
    Submit an image generation job.

    Returns immediately with a job_id. Use /status/{job_id} to check progress.
    """
    try:
        # Submit job to queue
        job_id = await queue_manager.submit_job(
            prompt=request.prompt,
            negative_prompt=request.negative_prompt,
            params=request.params.dict() if request.params else {}
        )

        job = queue_manager.get_job(job_id)

        return JobResponse(
            job_id=job_id,
            status=job.status,
            created_at=job.created_at,
            message=f"Job queued. Queue position: {queue_manager.get_queue_size()}"
        )

    except asyncio.QueueFull:
        raise HTTPException(
            status_code=503,
            detail=f"Queue is full (max: {config.MAX_QUEUE_SIZE}). Please try again later."
        )
    except Exception as e:
        logger.error(f"Error submitting job: {e}")
        raise HTTPException(status_code=500, detail=str(e))


@router.get("/status/{job_id}", response_model=JobStatus)
async def get_job_status(job_id: str):
    """
    Get the status of a generation job.

    Returns job progress, status, and timestamps.
    """
    job = queue_manager.get_job(job_id)

    if not job:
        raise HTTPException(status_code=404, detail="Job not found")

    return JobStatus(
        job_id=job.job_id,
        status=job.status,
        progress=job.progress,
        created_at=job.created_at,
        started_at=job.started_at,
        completed_at=job.completed_at,
        error=job.error,
        prompt=job.prompt
    )


@router.get("/result/{job_id}")
async def get_result(job_id: str):
    """
    Download the generated image for a completed job.

    Returns the image file as PNG.
    """
    job = queue_manager.get_job(job_id)

    if not job:
        raise HTTPException(status_code=404, detail="Job not found")

    if job.status != "completed":
        raise HTTPException(
            status_code=400,
            detail=f"Job is {job.status}, not completed. Check /status/{job_id}"
        )

    # Check if image file exists
    image_path = file_manager.get_image_path(job_id)

    if not image_path:
        raise HTTPException(
            status_code=404,
            detail="Image file not found (may have been cleaned up)"
        )

    # Return image file
    return FileResponse(
        path=image_path,
        media_type="image/png",
        filename=f"{job_id}.png"
    )


@router.get("/health", response_model=HealthResponse)
async def health_check():
    """
    Health check endpoint.

    Returns service status and basic statistics.
    """
    import time
    from main import start_time

    return HealthResponse(
        status="healthy" if generator.model_loaded else "unhealthy",
        version=config.APP_VERSION,
        model_loaded=generator.model_loaded,
        queue_size=queue_manager.get_queue_size(),
        active_jobs=queue_manager.get_active_jobs(),
        uptime_seconds=time.time() - start_time
    )


@router.get("/models", response_model=ModelsResponse)
async def get_models_info():
    """
    Get information about loaded models and configuration.
    """
    model_info = generator.get_model_info()

    return ModelsResponse(
        base_model=config.SDXL_MODEL_ID,
        refiner_model=config.SDXL_REFINER_ID if config.USE_REFINER else None,
        refiner_enabled=config.USE_REFINER,
        device=model_info.get("device", "unknown"),
        fp16_enabled=config.USE_FP16
    )


@router.get("/storage")
async def get_storage_info():
    """
    Get storage statistics (admin endpoint).
    """
    stats = file_manager.get_storage_stats()
    return {
        "total_images": stats["total_images"],
        "total_size_mb": round(stats["total_size_mb"], 2),
        "storage_path": stats["storage_path"],
        "retention_days": config.IMAGE_RETENTION_DAYS
    }
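When the queue is full, `/generate` answers 503; clients can resubmit with a growing backoff. A minimal sketch, where the `submit_with_retry` name and the injectable `post` callable returning `(status_code, body)` are assumptions for illustration:

```python
import time

def submit_with_retry(post, payload, retries=3, backoff=5.0, sleep=time.sleep):
    """Resubmit a /generate payload while the queue is full (HTTP 503)."""
    for attempt in range(retries):
        code, body = post(payload)
        if code == 202:
            # Accepted: the service returns the job id to poll
            return body["job_id"]
        if code != 503:
            raise RuntimeError(f"/generate failed with HTTP {code}")
        sleep(backoff * (attempt + 1))  # back off a little longer each time
    raise RuntimeError("queue stayed full; giving up")
```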
config.py (executable, 72 lines)
@@ -0,0 +1,72 @@
"""
DreamTail Configuration

Configuration for SDXL image generation service running on AGX Orin.
"""

import os
from pathlib import Path

# Application settings
APP_NAME = "DreamTail"
APP_VERSION = "1.0.0"
API_HOST = "0.0.0.0"
API_PORT = 8765

# Paths
BASE_DIR = Path(__file__).parent
STORAGE_DIR = Path(os.getenv("DREAMTAIL_STORAGE", "/app/storage"))
MODELS_DIR = Path(os.getenv("DREAMTAIL_MODELS", "/app/models"))
IMAGES_DIR = STORAGE_DIR / "images"

# Ensure directories exist
STORAGE_DIR.mkdir(parents=True, exist_ok=True)
IMAGES_DIR.mkdir(parents=True, exist_ok=True)

# Model settings
SDXL_MODEL_ID = "SG161222/RealVisXL_V4.0"
SDXL_REFINER_ID = "stabilityai/stable-diffusion-xl-refiner-1.0"
USE_REFINER = False  # Set to True to enable refiner (requires more VRAM)

# IP-Adapter FaceID settings
IP_ADAPTER_DIR = Path(os.getenv("DREAMTAIL_IP_ADAPTER", MODELS_DIR / "ip-adapter"))
IP_ADAPTER_PATH = IP_ADAPTER_DIR / "ip-adapter-faceid_sdxl.bin"
FACE_REFERENCE_DIR = STORAGE_DIR / "faces"  # Directory for face reference images
DEFAULT_FACE_STRENGTH = 0.6  # How strongly to apply face conditioning (0.0-1.0)

# Ensure IP-Adapter directories exist
IP_ADAPTER_DIR.mkdir(parents=True, exist_ok=True)
FACE_REFERENCE_DIR.mkdir(parents=True, exist_ok=True)

# Generation defaults
DEFAULT_WIDTH = 1024
DEFAULT_HEIGHT = 1024
DEFAULT_STEPS = 30
DEFAULT_GUIDANCE_SCALE = 7.5
MIN_STEPS = 10
MAX_STEPS = 100
MIN_GUIDANCE = 1.0
MAX_GUIDANCE = 20.0

# Performance settings
MAX_CONCURRENT_JOBS = 1  # AGX Orin can handle 1 SDXL generation at a time
ENABLE_ATTENTION_SLICING = True
ENABLE_VAE_SLICING = True
ENABLE_CPU_OFFLOAD = False  # Only if VRAM is insufficient
USE_FP16 = True  # Half precision for reduced VRAM usage

# Queue settings
MAX_QUEUE_SIZE = 50  # Maximum queued jobs
JOB_TIMEOUT_SECONDS = 300  # 5 minutes max per job

# Storage settings
IMAGE_RETENTION_DAYS = 10
CLEANUP_INTERVAL_HOURS = 24
IMAGE_FORMAT = "PNG"
IMAGE_QUALITY = 95  # For JPEG (not used for PNG)

# Logging
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
LOG_FORMAT = "[%(asctime)s] %(levelname)s [%(name)s] %(message)s"
LOG_DATE_FORMAT = "%H:%M:%S"
dreamtail_storage/__init__.py (executable, 1 line)
@@ -0,0 +1 @@
"""Storage management for DreamTail."""
dreamtail_storage/cleanup_task.py (executable, 111 lines)
@@ -0,0 +1,111 @@
"""
Cleanup Task

Periodically deletes images older than the retention period.
"""

import asyncio
import logging
from datetime import datetime, timedelta
from pathlib import Path

import config

logger = logging.getLogger(__name__)


class CleanupTask:
    """Background task to clean up old images."""

    def __init__(self):
        self.images_dir = config.IMAGES_DIR
        self.retention_days = config.IMAGE_RETENTION_DAYS
        self.interval_hours = config.CLEANUP_INTERVAL_HOURS
        self.running = False
        self._task = None

    async def start(self):
        """Start the cleanup task."""
        if self.running:
            logger.warning("Cleanup task already running")
            return

        self.running = True
        self._task = asyncio.create_task(self._run())
        logger.info(f"Cleanup task started (retention: {self.retention_days} days, interval: {self.interval_hours}h)")

    async def stop(self):
        """Stop the cleanup task."""
        if not self.running:
            return

        self.running = False
        if self._task:
            self._task.cancel()
            try:
                await self._task
            except asyncio.CancelledError:
                pass

        logger.info("Cleanup task stopped")

    async def _run(self):
        """Main cleanup loop."""
        while self.running:
            try:
                await self._cleanup_old_images()
                # Sleep for the configured interval
                await asyncio.sleep(self.interval_hours * 3600)
            except asyncio.CancelledError:
                break
            except Exception as e:
                logger.error(f"Error in cleanup task: {e}")
                await asyncio.sleep(300)  # Wait 5 minutes before retry

    async def _cleanup_old_images(self):
        """Delete images older than retention period."""
        try:
            cutoff_time = datetime.now() - timedelta(days=self.retention_days)
            cutoff_timestamp = cutoff_time.timestamp()

            deleted_count = 0
            deleted_size = 0

            # Find all image files
            image_files = list(self.images_dir.glob(f"*.{config.IMAGE_FORMAT.lower()}"))

            for file_path in image_files:
                try:
                    # Check file modification time
                    file_mtime = file_path.stat().st_mtime

                    if file_mtime < cutoff_timestamp:
                        file_size = file_path.stat().st_size
                        file_path.unlink()
                        deleted_count += 1
                        deleted_size += file_size

                        logger.debug(f"Deleted old image: {file_path.name}")

                except Exception as e:
                    logger.error(f"Error deleting {file_path.name}: {e}")

            if deleted_count > 0:
                logger.info(
                    f"Cleanup completed: deleted {deleted_count} images "
                    f"({deleted_size / 1024 / 1024:.1f} MB) older than {self.retention_days} days"
                )
            else:
                logger.debug(f"Cleanup completed: no images older than {self.retention_days} days")

        except Exception as e:
            logger.error(f"Error during cleanup: {e}")

    async def cleanup_now(self):
        """Trigger immediate cleanup (for testing or manual trigger)."""
        logger.info("Manual cleanup triggered")
        await self._cleanup_old_images()


# Global cleanup task instance
cleanup_task = CleanupTask()
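The cutoff comparison in `_cleanup_old_images` boils down to a single predicate on a file's mtime. A standalone sketch of that rule (the `is_expired` name and `now` override are illustrative):

```python
from datetime import datetime, timedelta

def is_expired(mtime_ts, retention_days, now=None):
    """True if a file's mtime timestamp is older than the retention window."""
    now = now or datetime.now()
    cutoff = (now - timedelta(days=retention_days)).timestamp()
    return mtime_ts < cutoff
```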
dreamtail_storage/file_manager.py (executable, 132 lines)
@@ -0,0 +1,132 @@
"""
File Storage Manager

Handles saving and retrieving generated images.
"""

import logging
from pathlib import Path
from typing import Optional
from PIL import Image
import aiofiles
import os

import config

logger = logging.getLogger(__name__)


class FileManager:
    """Manages image file storage and retrieval."""

    def __init__(self):
        self.images_dir = config.IMAGES_DIR
        self.image_format = config.IMAGE_FORMAT

        # Ensure storage directory exists
        self.images_dir.mkdir(parents=True, exist_ok=True)
        logger.info(f"Image storage directory: {self.images_dir}")

    async def save_image(self, job_id: str, image: Image.Image) -> str:
        """
        Save generated image to disk.

        Args:
            job_id: Job identifier (used as filename)
            image: PIL Image to save

        Returns:
            file_path: Absolute path to saved image

        Raises:
            IOError: If save fails
        """
        filename = f"{job_id}.{self.image_format.lower()}"
        file_path = self.images_dir / filename

        try:
            # Save in thread pool to avoid blocking
            import asyncio
            loop = asyncio.get_event_loop()
            await loop.run_in_executor(
                None,
                lambda: image.save(
                    file_path,
                    format=self.image_format,
                    quality=config.IMAGE_QUALITY if self.image_format == "JPEG" else None
                )
            )

            logger.info(f"Image saved: {file_path} ({os.path.getsize(file_path) / 1024:.1f} KB)")
            return str(file_path)

        except Exception as e:
            logger.error(f"Failed to save image {job_id}: {e}")
            raise IOError(f"Failed to save image: {e}")
|
def get_image_path(self, job_id: str) -> Optional[Path]:
|
||||||
|
"""
|
||||||
|
Get path to image file if it exists.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
job_id: Job identifier
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
Path to image file or None if not found
|
||||||
|
"""
|
||||||
|
filename = f"{job_id}.{self.image_format.lower()}"
|
||||||
|
file_path = self.images_dir / filename
|
||||||
|
|
||||||
|
if file_path.exists():
|
||||||
|
return file_path
|
||||||
|
return None
|
||||||
|
|
||||||
|
def image_exists(self, job_id: str) -> bool:
|
||||||
|
"""Check if image file exists."""
|
||||||
|
return self.get_image_path(job_id) is not None
|
||||||
|
|
||||||
|
async def delete_image(self, job_id: str) -> bool:
|
||||||
|
"""
|
||||||
|
Delete an image file.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
job_id: Job identifier
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
True if deleted, False if not found
|
||||||
|
"""
|
||||||
|
file_path = self.get_image_path(job_id)
|
||||||
|
|
||||||
|
if file_path:
|
||||||
|
try:
|
||||||
|
file_path.unlink()
|
||||||
|
logger.info(f"Deleted image: {file_path}")
|
||||||
|
return True
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Failed to delete image {job_id}: {e}")
|
||||||
|
return False
|
||||||
|
|
||||||
|
return False
|
||||||
|
|
||||||
|
def get_storage_stats(self) -> dict:
|
||||||
|
"""Get storage statistics."""
|
||||||
|
try:
|
||||||
|
files = list(self.images_dir.glob(f"*.{self.image_format.lower()}"))
|
||||||
|
total_size = sum(f.stat().st_size for f in files)
|
||||||
|
|
||||||
|
return {
|
||||||
|
"total_images": len(files),
|
||||||
|
"total_size_mb": total_size / (1024 * 1024),
|
||||||
|
"storage_path": str(self.images_dir)
|
||||||
|
}
|
||||||
|
except Exception as e:
|
||||||
|
logger.error(f"Failed to get storage stats: {e}")
|
||||||
|
return {
|
||||||
|
"total_images": 0,
|
||||||
|
"total_size_mb": 0,
|
||||||
|
"storage_path": str(self.images_dir)
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
# Global file manager instance
|
||||||
|
file_manager = FileManager()
|
||||||
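The storage layout in `FileManager` is flat: one file per job, named `<job_id>.<format>`. A small sketch of that mapping and the existence/delete round trip, using a temporary directory in place of `config.IMAGES_DIR`:

```python
import tempfile
import uuid
from pathlib import Path

image_format = "png"  # stands in for config.IMAGE_FORMAT

with tempfile.TemporaryDirectory() as d:
    images_dir = Path(d)
    job_id = str(uuid.uuid4())

    # save_image writes to images_dir / "<job_id>.<format>"
    file_path = images_dir / f"{job_id}.{image_format.lower()}"
    file_path.write_bytes(b"\x89PNG")  # placeholder bytes, not a real image

    # get_image_path / image_exists reduce to this lookup
    exists_before = (images_dir / f"{job_id}.{image_format.lower()}").exists()
    file_path.unlink()  # what delete_image does on a hit
    exists_after = (images_dir / f"{job_id}.{image_format.lower()}").exists()

print(exists_before, exists_after)  # True False
```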
216
main.py
Executable file
@@ -0,0 +1,216 @@
"""
DreamTail - SDXL Image Generation Service

Main application entry point.
"""

import sys
import os
from pathlib import Path

# Add app directory to Python path
sys.path.insert(0, str(Path(__file__).parent))

import asyncio
import logging
import time
from contextlib import asynccontextmanager

from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
import uvicorn

from api.routes import router
from worker.queue_manager import queue_manager
from worker.generator import generator
from dreamtail_storage.file_manager import file_manager
from dreamtail_storage.cleanup_task import cleanup_task
import config

# Configure logging
logging.basicConfig(
    level=getattr(logging, config.LOG_LEVEL),
    format=config.LOG_FORMAT,
    datefmt=config.LOG_DATE_FORMAT
)

logger = logging.getLogger(__name__)

# Track application start time
start_time = time.time()


async def process_jobs():
    """
    Background worker that processes jobs from the queue.
    """
    logger.info("Job processor started")

    while True:
        try:
            # Get next job from queue (blocks until available)
            job_id = await queue_manager.get_next_job()

            if not job_id:
                continue

            job = queue_manager.get_job(job_id)
            if not job:
                logger.warning(f"Job {job_id} not found in queue")
                continue

            # Mark job as started
            await queue_manager.start_job(job_id)

            logger.info(f"Processing job {job_id}")

            try:
                # Progress callback
                async def update_progress(progress: int):
                    await queue_manager.update_progress(job_id, progress)

                # Resolve face image path if specified
                face_image = job.params.get("face_image")
                if face_image:
                    # Handle "vixy" shortcut for default face
                    if face_image.lower() == "vixy":
                        # Use all faces in the vixy directory
                        vixy_faces = list(config.FACE_REFERENCE_DIR.glob("*.jpg")) + \
                                     list(config.FACE_REFERENCE_DIR.glob("*.png"))
                        if vixy_faces:
                            face_image = [str(f) for f in vixy_faces]
                            logger.info(f"Using {len(face_image)} Vixy reference faces")
                        else:
                            logger.warning("No Vixy faces found, generating without face lock")
                            face_image = None
                    else:
                        # Look for specific face file
                        face_path = config.FACE_REFERENCE_DIR / face_image
                        if face_path.exists():
                            face_image = str(face_path)
                        else:
                            logger.warning(f"Face image {face_image} not found, generating without face lock")
                            face_image = None

                # Generate image
                image = await generator.generate_image(
                    prompt=job.prompt,
                    negative_prompt=job.negative_prompt,
                    width=job.params.get("width", config.DEFAULT_WIDTH),
                    height=job.params.get("height", config.DEFAULT_HEIGHT),
                    num_inference_steps=job.params.get("num_inference_steps", config.DEFAULT_STEPS),
                    guidance_scale=job.params.get("guidance_scale", config.DEFAULT_GUIDANCE_SCALE),
                    seed=job.params.get("seed"),
                    progress_callback=update_progress,
                    face_image=face_image,
                    face_strength=job.params.get("face_strength", config.DEFAULT_FACE_STRENGTH),
                )

                # Save image to disk
                result_path = await file_manager.save_image(job_id, image)

                # Mark job as completed
                await queue_manager.complete_job(job_id, result_path)

                logger.info(f"Job {job_id} completed successfully")

            except Exception as e:
                logger.error(f"Job {job_id} failed: {e}")
                await queue_manager.fail_job(job_id, str(e))

        except asyncio.CancelledError:
            logger.info("Job processor cancelled")
            break
        except Exception as e:
            logger.error(f"Error in job processor: {e}")
            await asyncio.sleep(5)  # Wait before retrying


@asynccontextmanager
async def lifespan(app: FastAPI):
    """
    Lifespan context manager for startup and shutdown.
    """
    # Startup
    logger.info(f"🎨 Starting {config.APP_NAME} v{config.APP_VERSION}")
    logger.info(f"Storage: {config.STORAGE_DIR}")
    logger.info(f"Models: {config.MODELS_DIR}")

    # Initialize SDXL model
    try:
        await generator.initialize()
    except Exception as e:
        logger.error(f"Failed to initialize SDXL model: {e}")
        logger.warning("Service will start but generation will fail until model is loaded")

    # Start background tasks
    worker_task = asyncio.create_task(process_jobs())
    await cleanup_task.start()

    logger.info(f"✅ {config.APP_NAME} ready on http://{config.API_HOST}:{config.API_PORT}")

    yield

    # Shutdown
    logger.info(f"🛑 Shutting down {config.APP_NAME}...")

    # Cancel worker task
    worker_task.cancel()
    try:
        await worker_task
    except asyncio.CancelledError:
        pass

    # Stop cleanup task
    await cleanup_task.stop()

    logger.info("✅ Shutdown complete")


# Create FastAPI application
app = FastAPI(
    title=config.APP_NAME,
    version=config.APP_VERSION,
    description="SDXL Image Generation Service for Jetson AGX Orin",
    lifespan=lifespan
)

# Add CORS middleware (allow all origins for multi-client support)
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Include API routes
app.include_router(router)


@app.get("/")
async def root():
    """Root endpoint with service information."""
    return {
        "service": config.APP_NAME,
        "version": config.APP_VERSION,
        "status": "running",
        "model": config.SDXL_MODEL_ID,
        "endpoints": {
            "generate": "POST /generate",
            "status": "GET /status/{job_id}",
            "result": "GET /result/{job_id}",
            "health": "GET /health",
            "models": "GET /models"
        }
    }


if __name__ == "__main__":
    # Run with uvicorn
    uvicorn.run(
        "main:app",
        host=config.API_HOST,
        port=config.API_PORT,
        log_level=config.LOG_LEVEL.lower()
    )
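The `vixy` shortcut in `process_jobs` resolves to every `.jpg`/`.png` under the reference directory. A self-contained sketch of that resolution step (the directory and filenames here are invented for illustration; the real directory is `config.FACE_REFERENCE_DIR`):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as d:
    face_dir = Path(d)  # stands in for config.FACE_REFERENCE_DIR
    (face_dir / "ref1.jpg").write_bytes(b"")
    (face_dir / "ref2.png").write_bytes(b"")
    (face_dir / "notes.txt").write_bytes(b"")  # ignored: not a face image extension

    face_image = "vixy"
    if face_image.lower() == "vixy":
        # Same glob the worker uses: all jpg + png references
        vixy_faces = list(face_dir.glob("*.jpg")) + list(face_dir.glob("*.png"))
        face_image = [str(f) for f in vixy_faces] if vixy_faces else None

print(len(face_image))  # two reference faces found, the .txt is skipped
```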
35
requirements.txt
Executable file
@@ -0,0 +1,35 @@
# DreamTail Dependencies

# Web framework
fastapi==0.109.0
uvicorn[standard]==0.27.0
python-multipart==0.0.6

# SDXL / Stable Diffusion (upgraded for compatibility)
diffusers==0.27.0
transformers==4.38.0
accelerate==0.27.0
safetensors==0.4.2
huggingface_hub==0.21.0
omegaconf==2.3.0

# PyTorch (installed in Jetson container, but listed for reference)
# torch==2.1.0+cu121 (from Jetson L4T)
# torchvision==0.16.0+cu121

# Image processing
Pillow==10.2.0
opencv-python==4.9.0.80

# IP-Adapter FaceID (for consistent face generation)
insightface==0.7.3
onnxruntime-gpu==1.17.0

# Utilities
pydantic==2.6.0
pydantic-settings==2.1.0
python-dotenv==1.0.1
aiofiles==23.2.1

# Monitoring (optional)
prometheus-client==0.19.0
19
scripts/build.sh
Executable file
@@ -0,0 +1,19 @@
#!/bin/bash
# Build DreamTail Docker image for Jetson AGX Orin
set -e

echo "🎨 Building DreamTail Docker image..."

cd "$(dirname "$0")/.."

# Build for ARM64 (Jetson architecture)
docker build \
    --platform linux/arm64 \
    -t dreamtail:latest \
    -f Dockerfile \
    .

echo "✅ Build complete!"
echo ""
echo "To run DreamTail:"
echo "  ./scripts/run.sh"
53
scripts/download-models.sh
Executable file
@@ -0,0 +1,53 @@
#!/bin/bash
# Download SDXL models for DreamTail using Docker
set -e

echo "📥 Downloading SDXL models..."
echo "This will download ~13GB of model weights"
echo ""

# Model cache directory
MODELS_DIR="${DREAMTAIL_MODELS:-/mnt/nvme/models}"

# Create directory if it doesn't exist
mkdir -p "$MODELS_DIR"

echo "Models will be cached in: $MODELS_DIR"
echo ""
echo "Using Docker container to download models..."
echo ""

# Use L4T PyTorch container to download models
docker run --rm -it \
    -v "${MODELS_DIR}:/models" \
    dustynv/l4t-pytorch:r36.2.0-pth2.1-py3 \
    bash -c "
pip3 install -q diffusers transformers accelerate safetensors &&
python3 << 'PYEOF'
from diffusers import StableDiffusionXLPipeline

model_id = 'stabilityai/stable-diffusion-xl-base-1.0'
cache_dir = '/models'

print(f'Downloading {model_id}...')
print(f'Cache directory: {cache_dir}')
print('')

try:
    pipeline = StableDiffusionXLPipeline.from_pretrained(
        model_id,
        use_safetensors=True,
        cache_dir=cache_dir
    )
    print('✅ SDXL model downloaded successfully!')
except Exception as e:
    print(f'❌ Error downloading model: {e}')
    exit(1)
PYEOF
"

echo ""
echo "✅ Model download complete!"
echo ""
echo "Models are cached in: $MODELS_DIR"
echo "You can now build and run DreamTail"
45
scripts/run.sh
Executable file
@@ -0,0 +1,45 @@
#!/bin/bash
# Run DreamTail on Jetson AGX Orin
set -e

echo "🎨 Starting DreamTail..."

# Configuration
CONTAINER_NAME="dreamtail"
PORT=8765
MODELS_DIR="/mnt/nvme/models"       # Models on NVMe
STORAGE_DIR="/mnt/nvme/dreamtail"   # DreamTail storage on NVMe

# Create storage directory if it doesn't exist
mkdir -p "$STORAGE_DIR"

# Stop existing container if running
if docker ps -a --format '{{.Names}}' | grep -q "^${CONTAINER_NAME}$"; then
    echo "Stopping existing DreamTail container..."
    docker stop "$CONTAINER_NAME" 2>/dev/null || true
    docker rm "$CONTAINER_NAME" 2>/dev/null || true
fi

# Run container
echo "Starting DreamTail container..."
docker run -d \
    --name "$CONTAINER_NAME" \
    --runtime=nvidia \
    --restart unless-stopped \
    -p ${PORT}:8765 \
    -v "${MODELS_DIR}:/app/models" \
    -v "${STORAGE_DIR}:/app/storage" \
    -e DREAMTAIL_STORAGE=/app/storage \
    -e DREAMTAIL_MODELS=/app/models \
    -e LOG_LEVEL=INFO \
    dreamtail:latest

echo "✅ DreamTail started!"
echo ""
echo "API available at: http://bigorin:${PORT}"
echo ""
echo "To check logs:"
echo "  docker logs -f ${CONTAINER_NAME}"
echo ""
echo "To stop:"
echo "  docker stop ${CONTAINER_NAME}"
1
worker/__init__.py
Executable file
@@ -0,0 +1 @@
"""Worker modules for DreamTail."""
339
worker/generator.py
Executable file
@@ -0,0 +1,339 @@
"""
SDXL Image Generator

Handles image generation using Stable Diffusion XL with Jetson optimizations.
Supports IP-Adapter FaceID for consistent face generation.
"""

import torch
import logging
import cv2
import numpy as np
from typing import Optional, Dict, Any, List, Union
from pathlib import Path
from diffusers import StableDiffusionXLPipeline, DDIMScheduler
from PIL import Image
import asyncio

import config

logger = logging.getLogger(__name__)


class SDXLGenerator:
    """SDXL image generator with optimizations for Jetson AGX Orin."""

    def __init__(self):
        self.pipeline = None
        self.device = None
        self.model_loaded = False
        self._load_lock = asyncio.Lock()

        # IP-Adapter FaceID components
        self.ip_model = None
        self.face_app = None
        self.face_embeds_cache = {}  # Cache for precomputed face embeddings
        self.ip_adapter_loaded = False

    async def initialize(self):
        """Load SDXL model with Jetson optimizations."""
        async with self._load_lock:
            if self.model_loaded:
                logger.info("Model already loaded, skipping initialization")
                return

            logger.info("Initializing SDXL model...")
            logger.info(f"Model: {config.SDXL_MODEL_ID}")
            logger.info(f"FP16: {config.USE_FP16}")
            logger.info(f"Attention slicing: {config.ENABLE_ATTENTION_SLICING}")

            # Determine device
            if torch.cuda.is_available():
                self.device = "cuda"
                logger.info(f"Using GPU: {torch.cuda.get_device_name(0)}")
                logger.info(f"VRAM available: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.2f} GB")
            else:
                self.device = "cpu"
                logger.warning("CUDA not available, using CPU (will be very slow)")

            # Load pipeline
            try:
                dtype = torch.float16 if config.USE_FP16 else torch.float32

                # Use DDIM scheduler for IP-Adapter compatibility
                noise_scheduler = DDIMScheduler(
                    num_train_timesteps=1000,
                    beta_start=0.00085,
                    beta_end=0.012,
                    beta_schedule="scaled_linear",
                    clip_sample=False,
                    set_alpha_to_one=False,
                    steps_offset=1,
                )

                self.pipeline = StableDiffusionXLPipeline.from_pretrained(
                    config.SDXL_MODEL_ID,
                    torch_dtype=dtype,
                    scheduler=noise_scheduler,
                    use_safetensors=True,
                    cache_dir=str(config.MODELS_DIR),
                    add_watermarker=False,
                )

                # Move to device
                self.pipeline = self.pipeline.to(self.device)

                # Apply optimizations
                if config.ENABLE_ATTENTION_SLICING:
                    self.pipeline.enable_attention_slicing()
                    logger.info("Attention slicing enabled")

                if config.ENABLE_VAE_SLICING:
                    self.pipeline.enable_vae_slicing()
                    logger.info("VAE slicing enabled")

                if config.ENABLE_CPU_OFFLOAD and self.device == "cuda":
                    self.pipeline.enable_model_cpu_offload()
                    logger.info("CPU offload enabled")

                self.model_loaded = True
                logger.info("SDXL model loaded successfully!")

            except Exception as e:
                logger.error(f"Failed to load SDXL model: {e}")
                raise

    async def initialize_ip_adapter(self):
        """Load IP-Adapter FaceID components (lazy loading)."""
        if self.ip_adapter_loaded:
            return

        logger.info("Initializing IP-Adapter FaceID...")

        try:
            # Import IP-Adapter components
            from ip_adapter.ip_adapter_faceid import IPAdapterFaceIDXL
            from insightface.app import FaceAnalysis

            # Initialize InsightFace for face detection/embedding
            self.face_app = FaceAnalysis(
                name="buffalo_l",
                providers=['CUDAExecutionProvider', 'CPUExecutionProvider']
            )
            self.face_app.prepare(ctx_id=0, det_size=(640, 640))
            logger.info("InsightFace initialized")

            # Load IP-Adapter FaceID model
            ip_ckpt = str(config.IP_ADAPTER_PATH)
            self.ip_model = IPAdapterFaceIDXL(
                self.pipeline,
                ip_ckpt,
                self.device
            )

            self.ip_adapter_loaded = True
            logger.info("IP-Adapter FaceID loaded successfully!")

        except ImportError as e:
            logger.warning(f"IP-Adapter dependencies not available: {e}")
            logger.warning("Face-locked generation will not be available")
        except Exception as e:
            logger.error(f"Failed to load IP-Adapter FaceID: {e}")
            raise

    def extract_face_embedding(self, image: Union[str, Path, Image.Image, np.ndarray]) -> torch.Tensor:
        """
        Extract face embedding from an image.

        Args:
            image: Path to image, PIL Image, or numpy array

        Returns:
            Face embedding tensor
        """
        if self.face_app is None:
            raise RuntimeError("InsightFace not initialized. Call initialize_ip_adapter() first.")

        # Convert to numpy array if needed
        if isinstance(image, (str, Path)):
            img_cv = cv2.imread(str(image))
        elif isinstance(image, Image.Image):
            img_cv = cv2.cvtColor(np.array(image), cv2.COLOR_RGB2BGR)
        else:
            img_cv = image

        # Detect faces and extract embedding
        faces = self.face_app.get(img_cv)

        if len(faces) == 0:
            raise ValueError("No face detected in image")

        # Use first detected face
        face_embed = torch.from_numpy(faces[0].normed_embedding).unsqueeze(0)
        logger.info(f"Face embedding extracted: shape {face_embed.shape}")

        return face_embed

    def precompute_face_embeddings(self, face_images: List[Union[str, Path]]) -> torch.Tensor:
        """
        Precompute and average face embeddings from multiple reference images.

        Args:
            face_images: List of paths to face reference images

        Returns:
            Averaged face embedding tensor
        """
        embeddings = []

        for img_path in face_images:
            try:
                embed = self.extract_face_embedding(img_path)
                embeddings.append(embed)
                logger.info(f"Extracted embedding from {img_path}")
            except Exception as e:
                logger.warning(f"Failed to extract face from {img_path}: {e}")

        if len(embeddings) == 0:
            raise ValueError("No faces could be extracted from any reference images")

        # Average the embeddings for better consistency
        avg_embedding = torch.mean(torch.stack(embeddings), dim=0)
        logger.info(f"Averaged {len(embeddings)} face embeddings")

        return avg_embedding

    async def generate_image(
        self,
        prompt: str,
        negative_prompt: Optional[str] = None,
        width: int = config.DEFAULT_WIDTH,
        height: int = config.DEFAULT_HEIGHT,
        num_inference_steps: int = config.DEFAULT_STEPS,
        guidance_scale: float = config.DEFAULT_GUIDANCE_SCALE,
        seed: Optional[int] = None,
        progress_callback=None,
        face_image: Optional[Union[str, Path, List[str]]] = None,
        face_strength: float = 0.6,
    ) -> Image.Image:
        """
        Generate an image from a text prompt.

        Args:
            prompt: Text prompt
            negative_prompt: Negative prompt
            width: Image width
            height: Image height
            num_inference_steps: Number of diffusion steps
            guidance_scale: Guidance scale
            seed: Random seed for reproducibility
            progress_callback: Optional async callback(progress) for progress updates (0-100)
            face_image: Optional path(s) to face reference image(s) for face locking
            face_strength: Strength of face conditioning (0.0-1.0, default 0.6)

        Returns:
            PIL Image

        Raises:
            RuntimeError: If model not loaded
        """
        if not self.model_loaded:
            raise RuntimeError("Model not loaded. Call initialize() first.")

        logger.info(f"Generating image: '{prompt[:50]}...'")
        logger.info(f"Parameters: {width}x{height}, steps={num_inference_steps}, guidance={guidance_scale}")

        # Check if face-locked generation requested
        use_face_id = face_image is not None

        if use_face_id:
            # Initialize IP-Adapter if needed
            await self.initialize_ip_adapter()

            if not self.ip_adapter_loaded:
                logger.warning("IP-Adapter not available, generating without face lock")
                use_face_id = False
            else:
                logger.info(f"Face-locked generation enabled, strength={face_strength}")

        # Set random seed if provided
        generator = None
        if seed is not None:
            generator = torch.Generator(device=self.device).manual_seed(seed)
            logger.info(f"Using seed: {seed}")

        try:
            loop = asyncio.get_event_loop()

            if use_face_id:
                # Extract face embedding(s)
                if isinstance(face_image, list):
                    face_embed = self.precompute_face_embeddings(face_image)
                else:
                    face_embed = self.extract_face_embedding(face_image)

                # Generate with IP-Adapter FaceID
                image = await loop.run_in_executor(
                    None,
                    lambda: self.ip_model.generate(
                        prompt=prompt,
                        negative_prompt=negative_prompt,
                        faceid_embeds=face_embed,
                        width=width,
                        height=height,
                        num_inference_steps=num_inference_steps,
                        guidance_scale=guidance_scale,
                        num_samples=1,
                        seed=seed,
                        s_scale=face_strength,
                    )[0]
                )
            else:
                # Progress callback wrapper (only for standard pipeline)
                def callback_wrapper(step: int, timestep: int, latents: torch.FloatTensor):
                    if progress_callback:
                        progress = int((step / num_inference_steps) * 100)
                        try:
                            asyncio.create_task(progress_callback(progress))
                        except Exception:
                            pass

                # Standard generation without face lock
                image = await loop.run_in_executor(
                    None,
                    lambda: self.pipeline(
                        prompt=prompt,
                        negative_prompt=negative_prompt,
                        width=width,
                        height=height,
                        num_inference_steps=num_inference_steps,
                        guidance_scale=guidance_scale,
                        generator=generator,
                        callback=callback_wrapper,
                        callback_steps=1
                    ).images[0]
                )

            logger.info("Image generated successfully")
            return image

        except Exception as e:
            logger.error(f"Error generating image: {e}")
            raise

    def get_model_info(self) -> Dict[str, Any]:
        """Get information about the loaded model."""
        return {
            "model_id": config.SDXL_MODEL_ID,
            "device": self.device,
            "fp16": config.USE_FP16,
            "attention_slicing": config.ENABLE_ATTENTION_SLICING,
            "vae_slicing": config.ENABLE_VAE_SLICING,
            "cpu_offload": config.ENABLE_CPU_OFFLOAD,
            "loaded": self.model_loaded,
            "ip_adapter_loaded": self.ip_adapter_loaded,
        }


# Global generator instance
generator = SDXLGenerator()
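`precompute_face_embeddings` averages the per-image embeddings to stabilize identity across multiple references. The same operation sketched in NumPy (a stand-in for the torch version above, with random unit vectors in place of InsightFace's `normed_embedding` output):

```python
import numpy as np

rng = np.random.default_rng(0)

# Three fake 512-d face embeddings, L2-normalized like InsightFace's normed_embedding
embeds = [rng.standard_normal(512) for _ in range(3)]
embeds = [e / np.linalg.norm(e) for e in embeds]

# torch.mean(torch.stack(...), dim=0) is an element-wise average over the stack
avg = np.mean(np.stack(embeds), axis=0)

print(avg.shape)  # (512,)
```

Note the average of unit vectors is generally not itself unit-length; IP-Adapter FaceID consumes the raw averaged embedding as conditioning.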
163
worker/queue_manager.py
Executable file
163
worker/queue_manager.py
Executable file
@@ -0,0 +1,163 @@
"""
Job Queue Manager

In-memory job queue for managing image generation requests.
"""

import asyncio
import uuid
from datetime import datetime, timedelta
from typing import Dict, Optional, Literal
from dataclasses import dataclass, field
import logging

import config

logger = logging.getLogger(__name__)


@dataclass
class Job:
    """Represents a single generation job."""
    job_id: str
    prompt: str
    negative_prompt: Optional[str]
    params: Dict
    status: Literal["queued", "processing", "completed", "failed"]
    progress: int = 0
    created_at: datetime = field(default_factory=datetime.utcnow)
    started_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None
    error: Optional[str] = None
    result_path: Optional[str] = None


class QueueManager:
    """Manages the job queue and job lifecycle."""

    def __init__(self):
        self.jobs: Dict[str, Job] = {}
        self.queue: asyncio.Queue = asyncio.Queue(maxsize=config.MAX_QUEUE_SIZE)
        self.active_jobs: int = 0
        self._lock = asyncio.Lock()

    async def submit_job(
        self,
        prompt: str,
        negative_prompt: Optional[str],
        params: Dict
    ) -> str:
        """
        Submit a new generation job.

        Args:
            prompt: Text prompt
            negative_prompt: Negative prompt
            params: Generation parameters

        Returns:
            job_id: Unique job identifier

        Raises:
            asyncio.QueueFull: If queue is at capacity
        """
        job_id = str(uuid.uuid4())

        job = Job(
            job_id=job_id,
            prompt=prompt,
            negative_prompt=negative_prompt,
            params=params,
            status="queued"
        )

        async with self._lock:
            self.jobs[job_id] = job

        try:
            # Enqueue without blocking so a full queue raises QueueFull
            # (await self.queue.put() would silently wait instead of raising)
            self.queue.put_nowait(job_id)
        except asyncio.QueueFull:
            async with self._lock:
                del self.jobs[job_id]
            raise

        logger.info(f"Job {job_id} submitted: '{prompt[:50]}...'")
        return job_id

    async def get_next_job(self) -> Optional[str]:
        """
        Get the next job from the queue (waits until one is available).

        Returns:
            job_id, or None if the wait was cancelled
        """
        try:
            return await self.queue.get()
        except asyncio.CancelledError:
            return None

    async def start_job(self, job_id: str):
        """Mark a job as started."""
        async with self._lock:
            if job_id in self.jobs:
                self.jobs[job_id].status = "processing"
                self.jobs[job_id].started_at = datetime.utcnow()
                self.active_jobs += 1
                logger.info(f"Job {job_id} started processing")

    async def update_progress(self, job_id: str, progress: int):
        """Update job progress (clamped to 0-100)."""
        async with self._lock:
            if job_id in self.jobs:
                self.jobs[job_id].progress = min(100, max(0, progress))

    async def complete_job(self, job_id: str, result_path: str):
        """Mark a job as completed successfully."""
        async with self._lock:
            if job_id in self.jobs:
                self.jobs[job_id].status = "completed"
                self.jobs[job_id].completed_at = datetime.utcnow()
                self.jobs[job_id].progress = 100
                self.jobs[job_id].result_path = result_path
                self.active_jobs = max(0, self.active_jobs - 1)
                logger.info(f"Job {job_id} completed successfully")

    async def fail_job(self, job_id: str, error: str):
        """Mark a job as failed."""
        async with self._lock:
            if job_id in self.jobs:
                self.jobs[job_id].status = "failed"
                self.jobs[job_id].completed_at = datetime.utcnow()
                self.jobs[job_id].error = error
                self.active_jobs = max(0, self.active_jobs - 1)
                logger.error(f"Job {job_id} failed: {error}")

    def get_job(self, job_id: str) -> Optional[Job]:
        """Get job by ID."""
        return self.jobs.get(job_id)

    def get_queue_size(self) -> int:
        """Get current queue size."""
        return self.queue.qsize()

    def get_active_jobs(self) -> int:
        """Get number of currently processing jobs."""
        return self.active_jobs

    async def cleanup_old_jobs(self, max_age_hours: int = 24):
        """Remove old completed/failed jobs from memory."""
        cutoff = datetime.utcnow() - timedelta(hours=max_age_hours)

        async with self._lock:
            to_remove = []
            for job_id, job in self.jobs.items():
                if job.status in ["completed", "failed"] and job.completed_at:
                    if job.completed_at < cutoff:
                        to_remove.append(job_id)

            for job_id in to_remove:
                del self.jobs[job_id]

            if to_remove:
                logger.info(f"Cleaned up {len(to_remove)} old jobs from memory")


# Global queue manager instance
queue_manager = QueueManager()
||||||
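The consumer side of this queue (the worker loop that pulls job IDs, runs the pipeline, and reports back) is not part of this file. A minimal self-contained sketch of that producer/consumer pattern, using a plain `asyncio.Queue` and a stubbed `generate` function as a hypothetical stand-in for the real SDXL call:

```python
import asyncio

async def generate(prompt: str) -> str:
    """Hypothetical stand-in for the SDXL pipeline; returns a fake image path."""
    await asyncio.sleep(0)  # simulate GPU work yielding to the event loop
    return f"/images/{prompt[:8]}.png"

async def worker(queue: asyncio.Queue, results: dict) -> None:
    """Consume jobs forever, recording each outcome."""
    while True:
        job_id, prompt = await queue.get()   # waits until a job arrives
        try:
            results[job_id] = ("completed", await generate(prompt))
        except Exception as exc:             # a real worker would call fail_job()
            results[job_id] = ("failed", str(exc))
        finally:
            queue.task_done()                # lets queue.join() observe completion

async def main() -> dict:
    queue: asyncio.Queue = asyncio.Queue(maxsize=8)
    results: dict = {}
    task = asyncio.create_task(worker(queue, results))
    await queue.put(("job-1", "a corgi astronaut"))
    await queue.put(("job-2", "neon city at dusk"))
    await queue.join()                       # block until both jobs are processed
    task.cancel()
    return results

results = asyncio.run(main())
print(results)
```

The same shape maps onto `QueueManager`: `submit_job` is the producer, `get_next_job` is the `queue.get()`, and `complete_job`/`fail_job` play the role of the `results` bookkeeping here.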