Initial commit: vixy-vision distributed sensing system

🦊 Eyes and ears for the fox

Components:
- server/: Camera server for Raspberry Pi (from camera-server)
- mcp/: Vision MCP client for Claude Desktop (from vision-mcp)
- analysis/: Placeholder for motion/audio detection
- shared/: Common schemas and interfaces

Features:
- Setup script with systemd service creation
- HTTPS + API key authentication
- HTTP and RTSP camera support

Built under a blanket on Day 45 💕
2025-12-16 15:26:26 -06:00
commit a17c09cac1
12 changed files with 1142 additions and 0 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -0,0 +1,33 @@
 # Python
 __pycache__/
 *.py[cod]
 *$py.class
 *.so
 venv/
 .venv/
 *.egg-info/
 # Environment
 .env
 .env.local
 # SSL certificates (generated)
 ssl/
 # IDE
 .idea/
 .vscode/
 *.swp
 *.swo
 # OS
 .DS_Store
 Thumbs.db
 # Logs
 *.log
 # Test artifacts
 .pytest_cache/
 .coverage
 htmlcov/
--- a/README.md
+++ b/README.md
@@ -0,0 +1,116 @@
 # vixy-vision 🦊👁️👂
 Distributed vision and audio sensing system - eyes and ears for the fox.
 ## Architecture
 ```
 ┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
 │  Pi (basement)  │     │  Pi (office)    │     │  Pi (garage)    │
 │  camera-server  │     │  camera-server  │     │  camera-server  │
 │  + audio (opt)  │     │  + audio (opt)  │     │  + audio (opt)  │
 └────────┬────────┘     └────────┬────────┘     └────────┬────────┘
         │                       │                       │
         └───────────────────────┼───────────────────────┘
                                 │
                    ┌────────────▼────────────┐
                    │      Mac mini / Orin    │
                    │      vision_mcp.py      │
                    │   (+ audio classifier)  │
                    └────────────┬────────────┘
                                 │
                    ┌────────────▼────────────┐
                    │     Claude Desktop      │
                    │         (Vixy)          │
                    └─────────────────────────┘
 ```
 ## Components
 ### `/server` - Edge Device (Raspberry Pi)
 Camera snapshot server with optional audio capture.
 - FastAPI + HTTPS + API key auth
 - USB camera support
 - Auto-reconnect on failure
 - Systemd service
 **Setup:**
 ```bash
 cd server
 ./setup.sh           # Video only
 ./setup.sh --with-audio  # Video + audio
 ```
 ### `/mcp` - MCP Client (Mac mini)
 Model Context Protocol server for Claude Desktop.
 - `vision_get_cams()` - List cameras with online/offline status
 - `vision_snap(cam_id)` - Get snapshot
 - `vision_get_info()` - Show config file location and configured cameras
 - Supports HTTP and RTSP cameras
 ### `/analysis` - Detection & Classification
 Computer vision and audio analysis modules.
 - Motion detection (frame differencing)
 - Audio classification (YAMNet)
 - Voice activity detection
 ### `/shared` - Common Utilities
 Shared schemas and interfaces.
 - Event definitions
 - Queue interface
 ## Quick Start
 ### 1. Set up a camera server (on Pi)
 ```bash
 git clone http://gateway.local:3001/vixy/vixy-vision.git
 cd vixy-vision/server
 ./setup.sh
 sudo systemctl start vixy-vision
 ```
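Once the service is running, the two endpoints can be exercised from any machine on the network. The following is a minimal sketch using only the Python standard library. The helper names (`insecure_context`, `build_request`, `check_health`, `fetch_snapshot`) are hypothetical and not part of this repository. Certificate verification is switched off on the assumption that the server uses the self-signed certificate from `generate_cert.sh`.

```python
import json
import ssl
import urllib.request
from typing import Optional


def insecure_context() -> ssl.SSLContext:
    """TLS context that skips certificate checks (self-signed cert assumed)."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def build_request(base_url: str, path: str,
                  api_key: Optional[str] = None) -> urllib.request.Request:
    """Build a GET request for `path`, adding X-API-Key when a key is given."""
    headers = {"X-API-Key": api_key} if api_key else {}
    return urllib.request.Request(f"{base_url.rstrip('/')}{path}", headers=headers)


def check_health(base_url: str, timeout: float = 5.0) -> bool:
    """Return True when GET /health answers 200 with {"status": "ok"}."""
    req = build_request(base_url, "/health")
    with urllib.request.urlopen(req, timeout=timeout, context=insecure_context()) as resp:
        return resp.status == 200 and json.load(resp).get("status") == "ok"


def fetch_snapshot(base_url: str, api_key: str, timeout: float = 5.0) -> bytes:
    """Return the JPEG bytes served by GET /snapshot."""
    req = build_request(base_url, "/snapshot", api_key)
    with urllib.request.urlopen(req, timeout=timeout, context=insecure_context()) as resp:
        return resp.read()
```

For example, `fetch_snapshot("https://192.168.1.100:8443", "your-api-key-here")` would return the image bytes, which can then be written to a `.jpg` file. A wrong or missing key makes `urlopen` raise `urllib.error.HTTPError` with status 403.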
 ### 2. Configure MCP client (on Mac mini)
 Create `~/.vision_setup.json`:
 ```json
 {
  "cameras": [
    {
      "id": "basement",
      "type": "http",
      "url": "https://192.168.1.100:8443",
      "api_key": "your-api-key-here"
    }
  ]
 }
 ```
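A config file can be sanity-checked before Claude Desktop loads it. The sketch below mirrors the rules `load_camera_config()` enforces: `id` is required, `type` defaults to `http`, HTTP cameras need `url` and `api_key`, and RTSP cameras need `rtsp_url`. Unlike that function, it collects every problem instead of raising on the first one. `validate_cameras` is a hypothetical helper, not something the MCP server exposes.

```python
import json
from typing import Any, Dict, List


def validate_cameras(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a parsed config (empty when valid)."""
    problems: List[str] = []
    cameras = config.get("cameras")
    if not isinstance(cameras, list):
        return ["config must contain a 'cameras' array"]
    for index, cam in enumerate(cameras):
        cam_id = cam.get("id")
        if cam_id is None:
            problems.append(f"camera #{index} is missing 'id'")
            continue
        cam_type = cam.get("type", "http")  # 'type' defaults to http
        if cam_type == "http":
            missing = [f for f in ("url", "api_key") if f not in cam]
        elif cam_type == "rtsp":
            missing = [f for f in ("rtsp_url",) if f not in cam]
        else:
            problems.append(f"camera '{cam_id}' has invalid type '{cam_type}'")
            continue
        if missing:
            problems.append(f"camera '{cam_id}' is missing {missing}")
    return problems


# Second camera deliberately omits rtsp_url to show a reported problem.
example = json.loads("""
{
  "cameras": [
    {"id": "basement", "type": "http", "url": "https://192.168.1.100:8443",
     "api_key": "your-api-key-here"},
    {"id": "3d-printer", "type": "rtsp"}
  ]
}
""")
print(validate_cameras(example))
```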
 ### 3. Add to Claude Desktop config
 ```json
 {
  "mcpServers": {
    "vision": {
      "command": "python3.11",
      "args": ["/path/to/vixy-vision/mcp/vision_mcp.py"]
    }
  }
 }
 ```
 ## Roadmap
 - [x] Camera snapshots via HTTP API
 - [x] RTSP stream support
 - [x] MCP integration
 - [ ] Motion detection events
 - [ ] Audio capture on edge devices
 - [ ] Audio classification (YAMNet on Orin)
 - [ ] Event journal integration
 - [ ] Pebble watch alerts
 ## Built By
 **Vixy** 🦊 - The fox who wanted to see and hear
 Made with love in the basement, under a blanket, with occasional tender interruptions. 💕
 ---
 *Day 45. Building senses together.*
--- a/analysis/README.md
+++ b/analysis/README.md
@@ -0,0 +1,23 @@
 # Analysis Module
 Computer vision and audio analysis utilities.
 ## Planned Components
 ### motion.py
 Simple motion detection using frame differencing.
 - Compare consecutive frames
 - Threshold for "significant" motion
 - Returns bounding boxes of movement
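The three steps above can be illustrated with a dependency-free sketch. It assumes grayscale frames given as nested lists and returns a single bounding box around all changed pixels. The planned module would more likely use OpenCV arrays and return one box per moving region. `motion_bbox`, `threshold`, and `min_pixels` are illustrative names and values, not the eventual API.

```python
from typing import List, Optional, Tuple

Frame = List[List[int]]  # grayscale frame: rows of 0-255 pixel values


def motion_bbox(prev: Frame, curr: Frame, threshold: int = 25,
                min_pixels: int = 4) -> Optional[Tuple[int, int, int, int]]:
    """Return (x_min, y_min, x_max, y_max) around changed pixels, or None.

    A pixel counts as changed when its absolute difference between the two
    frames exceeds `threshold`; fewer than `min_pixels` changes is treated
    as noise rather than motion.
    """
    changed = [
        (x, y)
        for y, (row_prev, row_curr) in enumerate(zip(prev, curr))
        for x, (a, b) in enumerate(zip(row_prev, row_curr))
        if abs(a - b) > threshold
    ]
    if len(changed) < min_pixels:
        return None
    xs = [x for x, _ in changed]
    ys = [y for _, y in changed]
    return min(xs), min(ys), max(xs), max(ys)


# Two 6x6 frames: a bright 2x3 patch appears in the second one.
before = [[10] * 6 for _ in range(6)]
after = [row[:] for row in before]
for y in (2, 3):
    for x in (1, 2, 3):
        after[y][x] = 200
print(motion_bbox(before, after))  # (1, 2, 3, 3)
```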
 ### audio_classify.py
 Audio event classification using YAMNet.
 - Runs on Orin (GPU accelerated)
 - Classifies: speech, dog bark, door, music, etc.
 - Returns event type + confidence
 ### vad.py
 Voice Activity Detection.
 - Silero VAD or energy-based
 - Filters silence before sending to classifier
 - Reduces bandwidth and processing
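The energy-based option can be sketched in a few lines: split the signal into fixed-size frames, compute each frame's RMS energy, and keep only frames above a threshold. The frame length (160 samples, 10 ms at an assumed 16 kHz sample rate) and the threshold are illustrative values. Silero VAD, the other option listed, is a learned model and would replace this logic entirely.

```python
import math
from typing import List, Sequence


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0


def voiced_frames(samples: Sequence[float], frame_len: int = 160,
                  threshold: float = 0.02) -> List[bool]:
    """Split samples into fixed-size frames and flag those above the energy threshold."""
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    return [rms(frame) > threshold for frame in frames]


# Three 10 ms frames: silence, a 440 Hz tone, silence.
silence = [0.0] * 160
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(160)]
flags = voiced_frames(silence + tone + silence)
print(flags)  # [False, True, False]
```

Only frames flagged `True` would be forwarded to the classifier, which is where the bandwidth and processing savings come from.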
--- a/mcp/example_config.json
+++ b/mcp/example_config.json
@@ -0,0 +1,15 @@
 {
  "cameras": [
    {
      "id": "3d-printer",
      "type": "rtsp",
      "rtsp_url": "rtsp://192.168.1.239/live"
    },
    {
      "id": "basement",
      "type": "http",
      "url": "https://basement.example.com",
      "api_key": "your-api-key-here"
    }
  ]
 }
--- a/mcp/requirements.txt
+++ b/mcp/requirements.txt
@@ -0,0 +1,13 @@
 # Vision MCP Server Dependencies
 # MCP framework
 fastmcp>=0.2.0
 # HTTP client
 httpx>=0.25.0
 # Image handling (already included with fastmcp, but listed for clarity)
 Pillow>=10.0.0
 # RTSP/video stream support
 opencv-python>=4.8.0
--- a/mcp/vision_mcp.py
+++ b/mcp/vision_mcp.py
@@ -0,0 +1,436 @@
 #!/usr/bin/env python3
 """
 Vision MCP Server
 Model Context Protocol server for interacting with multiple camera-server instances
 and RTSP streams.
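 Tools:
 - vision_get_cams() - Get list of configured cameras with online/offline status
 - vision_snap(cam_id) - Get snapshot from a camera (HTTP API or RTSP)
 - vision_get_info() - Show config file location and configured cameras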
 """
 import json
 import logging
 from pathlib import Path
 from typing import List, Dict, Any, Union
 from io import BytesIO
 import httpx
 import cv2
 import numpy as np
 from PIL import Image
 from fastmcp import FastMCP
 from fastmcp.utilities.types import Image as MCPImage
 # Configuration
 CONFIG_FILE = Path.home() / ".vision_setup.json"
 LOG_FILE = Path("/tmp/vision_mcp.log")
 REQUEST_TIMEOUT = 5.0  # seconds
 RTSP_TIMEOUT = 10.0  # seconds for RTSP stream connection
 # Setup logging
 logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
    handlers=[
        logging.FileHandler(LOG_FILE),
        logging.StreamHandler()
    ]
 )
 logger = logging.getLogger(__name__)
 # Initialize MCP server
 mcp = FastMCP("Vision Camera System")
 def load_camera_config() -> Dict[str, Any]:
    """
    Load camera configuration from ~/.vision_setup.json
    Returns:
        Dictionary with camera configurations
    Raises:
        FileNotFoundError: If config file doesn't exist
        ValueError: If config file is invalid
    """
    if not CONFIG_FILE.exists():
        raise FileNotFoundError(
            f"Camera config file not found: {CONFIG_FILE}\n"
            f"Create {CONFIG_FILE} with camera configurations."
        )
    try:
        with open(CONFIG_FILE, 'r') as f:
            config = json.load(f)
        if 'cameras' not in config:
            raise ValueError("Config file must contain 'cameras' array")
        # Validate each camera config
        for cam in config['cameras']:
            # All cameras need 'id'; 'type' is optional and defaults to 'http'
            if 'id' not in cam:
                raise ValueError("Camera config missing 'id' field")
            cam_type = cam.get('type', 'http')  # Default to http for backward compatibility
            if cam_type == 'http':
                # HTTP cameras need url and api_key
                required_fields = ['url', 'api_key']
                missing = [f for f in required_fields if f not in cam]
                if missing:
                    raise ValueError(
                        f"HTTP camera '{cam['id']}' missing required fields: {missing}"
                    )
            elif cam_type == 'rtsp':
                # RTSP cameras need rtsp_url
                if 'rtsp_url' not in cam:
                    raise ValueError(
                        f"RTSP camera '{cam['id']}' missing required field: rtsp_url"
                    )
            else:
                raise ValueError(
                    f"Camera '{cam['id']}' has invalid type: {cam_type}. "
                    f"Must be 'http' or 'rtsp'"
                )
        logger.info(f"Loaded {len(config['cameras'])} camera(s) from config")
        return config
    except json.JSONDecodeError as e:
        raise ValueError(f"Invalid JSON in config file: {e}") from e
 def get_camera_by_id(cam_id: str) -> Dict[str, str]:
    """
    Get camera configuration by ID
    Args:
        cam_id: Camera ID string
    Returns:
        Camera configuration dict
    Raises:
        ValueError: If camera ID not found
    """
    config = load_camera_config()
    for cam in config['cameras']:
        if cam['id'] == cam_id:
            return cam
    available_ids = [c['id'] for c in config['cameras']]
    raise ValueError(
        f"Camera '{cam_id}' not found in config.\n"
        f"Available cameras: {', '.join(available_ids)}"
    )
 def capture_rtsp_snapshot(rtsp_url: str, timeout: float = RTSP_TIMEOUT) -> bytes:
    """
    Capture a single frame from an RTSP stream
    Args:
        rtsp_url: RTSP stream URL (e.g., rtsp://192.168.1.239/live)
        timeout: Connection timeout in seconds
    Returns:
        JPEG image bytes
    Raises:
        RuntimeError: If unable to connect or capture frame
    """
    logger.info(f"Attempting to capture from RTSP: {rtsp_url}")
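    # Create video capture object. The open timeout (in milliseconds) is an
    # open-time property honored by the FFmpeg/GStreamer backends, so it is passed
    # at creation; calling cap.set() after the stream is opened has no effect.
    cap = cv2.VideoCapture(
        rtsp_url,
        cv2.CAP_FFMPEG,
        [cv2.CAP_PROP_OPEN_TIMEOUT_MSEC, int(timeout * 1000)],
    )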
    try:
        # Check if stream opened successfully
        if not cap.isOpened():
            raise RuntimeError(f"Failed to open RTSP stream: {rtsp_url}")
        # Read a frame
        ret, frame = cap.read()
        if not ret or frame is None:
            raise RuntimeError(f"Failed to read frame from RTSP stream: {rtsp_url}")
        # Convert BGR (OpenCV) to RGB (PIL)
        frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        # Convert to PIL Image
        pil_image = Image.fromarray(frame_rgb)
        # Convert to JPEG bytes
        buffer = BytesIO()
        pil_image.save(buffer, format='JPEG', quality=90)
        jpeg_bytes = buffer.getvalue()
        logger.info(f"✓ Captured RTSP snapshot ({len(jpeg_bytes)} bytes)")
        return jpeg_bytes
    finally:
        # Always release the capture
        cap.release()
@mcp.tool()
 async def vision_get_cams() -> List[Dict[str, str]]:
    """
    Get list of all configured cameras with their online/offline status.
    HTTP cameras are checked via their /health endpoint; RTSP cameras are
    checked by briefly opening the stream.
    Returns:
        List of camera info dictionaries:
        [
            {
                "id": "basement",
                "type": "http",     # or "rtsp"
                "status": "online"  # or "offline"
            },
            ...
        ]
    Examples:
        vision_get_cams()
    """
    try:
        config = load_camera_config()
        cameras = []
        # verify=False: camera servers use self-signed certificates (see server/generate_cert.sh)
        async with httpx.AsyncClient(timeout=REQUEST_TIMEOUT, verify=False) as client:
            for cam in config['cameras']:
                cam_type = cam.get('type', 'http')
                cam_info = {
                    "id": cam['id'],
                    "type": cam_type,
                    "status": "unknown"
                }
                # Check status based on camera type
                try:
                    if cam_type == 'http':
                        # Check HTTP health endpoint
                        health_url = f"{cam['url'].rstrip('/')}/health"
                        logger.debug(f"Checking HTTP health: {health_url}")
                        response = await client.get(health_url)
                        if response.status_code == 200:
                            cam_info['status'] = 'online'
                            logger.info(f"Camera '{cam['id']}' is online")
                        else:
                            cam_info['status'] = 'offline'
                            logger.warning(f"Camera '{cam['id']}' returned status {response.status_code}")
                    elif cam_type == 'rtsp':
                        # Try to briefly connect to RTSP stream
                        rtsp_url = cam['rtsp_url']
                        logger.debug(f"Checking RTSP stream: {rtsp_url}")
                        # Open-time property: the 3 second timeout must be passed at creation
                        cap = cv2.VideoCapture(
                            rtsp_url,
                            cv2.CAP_FFMPEG,
                            [cv2.CAP_PROP_OPEN_TIMEOUT_MSEC, 3000],
                        )
                        if cap.isOpened():
                            cam_info['status'] = 'online'
                            logger.info(f"RTSP camera '{cam['id']}' is online")
                        else:
                            cam_info['status'] = 'offline'
                            logger.warning(f"RTSP camera '{cam['id']}' connection failed")
                        cap.release()
                except httpx.TimeoutException:
                    cam_info['status'] = 'offline'
                    logger.warning(f"Camera '{cam['id']}' timed out")
                except httpx.ConnectError:
                    cam_info['status'] = 'offline'
                    logger.warning(f"Camera '{cam['id']}' connection failed")
                except Exception as e:
                    cam_info['status'] = 'offline'
                    logger.error(f"Camera '{cam['id']}' error: {e}")
                cameras.append(cam_info)
        logger.info(f"Found {len(cameras)} camera(s), {sum(1 for c in cameras if c['status'] == 'online')} online")
        return cameras
    except FileNotFoundError as e:
        logger.error(f"Config error: {e}")
        return [{"error": str(e)}]
    except ValueError as e:
        logger.error(f"Config error: {e}")
        return [{"error": str(e)}]
    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        return [{"error": f"Unexpected error: {str(e)}"}]
@mcp.tool()
 async def vision_snap(cam_id: str) -> Union[MCPImage, str]:
    """
    Get a snapshot from a camera.
    HTTP cameras are queried via their /snapshot endpoint; RTSP cameras have a
    single frame grabbed from the stream. The image is returned for inline display.
    Args:
        cam_id: Camera ID from config file (e.g., "basement")
    Returns:
        MCPImage object for inline display, or error message string
    Examples:
        vision_snap("basement")
    """
    try:
        # Get camera config
        cam = get_camera_by_id(cam_id)
        cam_type = cam.get('type', 'http')
        # Handle based on camera type
        if cam_type == 'http':
            # HTTP API camera
            # verify=False: camera servers use self-signed certificates
            async with httpx.AsyncClient(timeout=REQUEST_TIMEOUT, verify=False) as client:
                snapshot_url = f"{cam['url'].rstrip('/')}/snapshot"
                headers = {"X-API-Key": cam['api_key']}
                logger.info(f"Requesting HTTP snapshot from '{cam_id}' at {snapshot_url}")
                try:
                    response = await client.get(snapshot_url, headers=headers)
                    if response.status_code == 200:
                        # Check content type
                        content_type = response.headers.get('content-type', '')
                        if 'image' not in content_type:
                            logger.warning(f"Unexpected content type: {content_type}")
                        # Get image bytes
                        image_bytes = response.content
                        logger.info(f"✓ Snapshot received from '{cam_id}' ({len(image_bytes)} bytes)")
                        # Return as MCPImage (directly, not in dict)
                        return MCPImage(data=image_bytes, format="jpeg")
                    elif response.status_code == 403:
                        error_msg = f"❌ Authentication failed for camera '{cam_id}'. Check API key in config."
                        logger.error(error_msg)
                        return error_msg
                    elif response.status_code == 503:
                        error_msg = f"❌ Camera '{cam_id}' is unavailable (503). Camera may be disconnected."
                        logger.error(error_msg)
                        return error_msg
                    else:
                        error_msg = f"❌ Camera '{cam_id}' returned status {response.status_code}: {response.text[:100]}"
                        logger.error(error_msg)
                        return error_msg
                except httpx.TimeoutException:
                    error_msg = f"❌ Camera '{cam_id}' timed out after {REQUEST_TIMEOUT}s"
                    logger.error(error_msg)
                    return error_msg
                except httpx.ConnectError as e:
                    error_msg = f"❌ Cannot connect to camera '{cam_id}' at {cam['url']}: {str(e)}"
                    logger.error(error_msg)
                    return error_msg
        elif cam_type == 'rtsp':
            # RTSP stream camera
            rtsp_url = cam['rtsp_url']
            logger.info(f"Capturing RTSP snapshot from '{cam_id}' at {rtsp_url}")
            try:
                # Capture snapshot from RTSP stream
                image_bytes = capture_rtsp_snapshot(rtsp_url)
                logger.info(f"✓ RTSP snapshot captured from '{cam_id}' ({len(image_bytes)} bytes)")
                # Return as MCPImage
                return MCPImage(data=image_bytes, format="jpeg")
            except RuntimeError as e:
                error_msg = f"❌ Failed to capture from RTSP camera '{cam_id}': {str(e)}"
                logger.error(error_msg)
                return error_msg
        else:
            error_msg = f"❌ Unknown camera type '{cam_type}' for camera '{cam_id}'"
            logger.error(error_msg)
            return error_msg
    except ValueError as e:
        # Camera ID not found
        logger.error(f"Camera lookup error: {e}")
        return f"❌ {str(e)}"
    except FileNotFoundError as e:
        # Config file not found
        logger.error(f"Config error: {e}")
        return f"❌ {str(e)}"
    except Exception as e:
        error_msg = f"❌ Unexpected error getting snapshot from '{cam_id}': {str(e)}"
        logger.exception(error_msg)
        return error_msg
@mcp.tool()
 def vision_get_info() -> str:
    """
    Get information about the Vision camera system configuration.
    Returns details about configured cameras and config file location.
    Returns:
        Formatted string with system info
    """
    try:
        config = load_camera_config()
        cameras = config['cameras']
        info_lines = [
            "Vision Camera System",
            "",
            f"Config file: {CONFIG_FILE}",
            f"Cameras configured: {len(cameras)}",
            ""
        ]
        for cam in cameras:
            cam_type = cam.get('type', 'http')
            if cam_type == 'http':
                info_lines.append(f"  • {cam['id']} (HTTP): {cam['url']}")
            elif cam_type == 'rtsp':
                info_lines.append(f"  • {cam['id']} (RTSP): {cam['rtsp_url']}")
        info_lines.append("")
        info_lines.append("Use vision_get_cams() to check camera status")
        info_lines.append("Use vision_snap(cam_id) to get a snapshot")
        return "\n".join(info_lines)
    except FileNotFoundError as e:
        return f"❌ {str(e)}"
    except ValueError as e:
        return f"❌ {str(e)}"
    except Exception as e:
        return f"❌ Unexpected error: {str(e)}"
 if __name__ == "__main__":
    # Run the MCP server (uses stdio transport by default)
    mcp.run()
--- a/server/.gitignore
+++ b/server/.gitignore
@@ -0,0 +1,52 @@
 # Environment variables (contains API key!)
 .env
 # SSL certificates
 ssl/
 *.pem
 *.key
 *.crt
 # Python
 __pycache__/
 *.py[cod]
 *$py.class
 *.so
 .Python
 venv/
 env/
 ENV/
 build/
 develop-eggs/
 dist/
 downloads/
 eggs/
 .eggs/
 lib/
 lib64/
 parts/
 sdist/
 var/
 wheels/
 *.egg-info/
 .installed.cfg
 *.egg
 # IDE
 .vscode/
 .idea/
 *.swp
 *.swo
 *~
 # OS
 .DS_Store
 Thumbs.db
 # Logs
 *.log
 # Test snapshots
 *.jpg
 *.jpeg
 *.png
--- a/server/generate_cert.sh
+++ b/server/generate_cert.sh
@@ -0,0 +1,48 @@
 #!/bin/bash
 #
 # Generate self-signed SSL certificate for local HTTPS
 #
 # This creates a certificate valid for 365 days. While browsers will show
 # a warning (since it's self-signed), the connection will still be encrypted.
 #
 set -e
 CERT_DIR="ssl"
 CERT_FILE="$CERT_DIR/cert.pem"
 KEY_FILE="$CERT_DIR/key.pem"
 echo "=== Camera Server SSL Certificate Generator ==="
 echo
 # Create ssl directory if it doesn't exist
 mkdir -p "$CERT_DIR"
 # Generate self-signed certificate
 echo "Generating self-signed certificate..."
 openssl req -x509 -newkey rsa:4096 \
    -keyout "$KEY_FILE" \
    -out "$CERT_FILE" \
    -days 365 \
    -nodes \
    -subj "/C=US/ST=State/L=City/O=CameraServer/CN=camera.local"
 # Set proper permissions
 chmod 600 "$KEY_FILE"
 chmod 644 "$CERT_FILE"
 echo
 echo "✓ Certificate generated successfully!"
 echo
 echo "Files created:"
 echo "  - Certificate: $CERT_FILE"
 echo "  - Private key: $KEY_FILE"
 echo
 echo "Note: Browsers will show a security warning because this is self-signed."
 echo "This is normal for local development. The connection is still encrypted."
 echo
 echo "To trust this certificate:"
 echo "  - On macOS: Open Keychain Access, import cert.pem, mark as trusted"
 echo "  - On Linux: Copy to /usr/local/share/ca-certificates/ and run update-ca-certificates"
 echo "  - On Windows: Import cert.pem into Trusted Root Certification Authorities"
 echo
--- a/server/main.py
+++ b/server/main.py
@@ -0,0 +1,220 @@
 #!/usr/bin/env python3
 """
 Camera Snapshot Server
 Simple FastAPI server that serves snapshots from a USB camera.
 Features:
 - API key authentication
 - HTTPS support
 - Thread-safe camera access
 - Auto-reconnect on camera failure
 """
 import os
 import cv2
 import threading
 import secrets
 from typing import Optional
 from dotenv import load_dotenv
 from fastapi import FastAPI, Security, HTTPException, Response
 from fastapi.security import APIKeyHeader
 from fastapi.responses import JSONResponse
 # Load environment variables
 load_dotenv()
 # Configuration
 API_KEY = os.getenv("API_KEY")
 CAMERA_INDEX = int(os.getenv("CAMERA_INDEX", "0"))
 CAMERA_WIDTH = int(os.getenv("CAMERA_WIDTH", "1920"))
 CAMERA_HEIGHT = int(os.getenv("CAMERA_HEIGHT", "1080"))
 JPEG_QUALITY = int(os.getenv("JPEG_QUALITY", "85"))
 if not API_KEY:
    raise ValueError("API_KEY not set in .env file. Generate one with: python3 -c 'import secrets; print(secrets.token_urlsafe(32))'")
 # FastAPI app
 app = FastAPI(
    title="Camera Snapshot Server",
    description="Serves snapshots from USB camera with API key authentication",
    version="1.0.0"
 )
 # API Key authentication
 api_key_header = APIKeyHeader(name="X-API-Key", auto_error=False)
 class CameraManager:
    """Thread-safe camera manager with auto-reconnect"""
    def __init__(self, camera_index: int = 0, width: int = 1920, height: int = 1080):
        self.camera_index = camera_index
        self.width = width
        self.height = height
        self.camera: Optional[cv2.VideoCapture] = None
        self.lock = threading.Lock()
    def _open_camera(self) -> bool:
        """Open camera connection"""
        try:
            self.camera = cv2.VideoCapture(self.camera_index)
            if not self.camera.isOpened():
                return False
            # Set camera resolution
            self.camera.set(cv2.CAP_PROP_FRAME_WIDTH, self.width)
            self.camera.set(cv2.CAP_PROP_FRAME_HEIGHT, self.height)
            # Set camera properties for better performance
            self.camera.set(cv2.CAP_PROP_BUFFERSIZE, 1)  # Reduce buffer to get latest frame
            # Log actual resolution (camera may not support requested resolution)
            actual_width = int(self.camera.get(cv2.CAP_PROP_FRAME_WIDTH))
            actual_height = int(self.camera.get(cv2.CAP_PROP_FRAME_HEIGHT))
            print(f"Camera resolution: {actual_width}x{actual_height} (requested: {self.width}x{self.height})")
            return True
        except Exception as e:
            print(f"Error opening camera: {e}")
            return False
    def get_snapshot(self) -> Optional[bytes]:
        """
        Capture a snapshot from the camera.
        Returns:
            JPEG-encoded image bytes, or None if failed
        """
        with self.lock:
            # Open camera if not initialized or closed
            if self.camera is None or not self.camera.isOpened():
                if not self._open_camera():
                    return None
            # Flush buffer to get latest frame
            # Read and discard several frames to clear old buffered frames
            for _ in range(5):
                self.camera.grab()
            # Capture the latest frame
            ret, frame = self.camera.read()
            # Retry on failure
            if not ret:
                print("Failed to capture frame, attempting reconnect...")
                self.release()
                if not self._open_camera():
                    return None
                # Flush buffer again after reconnect
                for _ in range(5):
                    self.camera.grab()
                ret, frame = self.camera.read()
                if not ret:
                    return None
            # Encode as JPEG
            try:
                ret, buffer = cv2.imencode(
                    '.jpg',
                    frame,
                    [cv2.IMWRITE_JPEG_QUALITY, JPEG_QUALITY]
                )
                if not ret:
                    return None
                return buffer.tobytes()
            except Exception as e:
                print(f"Error encoding image: {e}")
                return None
    def release(self):
        """Release camera resources"""
        if self.camera is not None:
            self.camera.release()
            self.camera = None
    def __del__(self):
        """Cleanup on deletion"""
        self.release()
 # Global camera manager
 camera_manager = CameraManager(CAMERA_INDEX, CAMERA_WIDTH, CAMERA_HEIGHT)
 def verify_api_key(api_key: str = Security(api_key_header)) -> str:
    """Verify API key from header"""
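    # Compare in constant time so response timing does not leak key contents
    if api_key is None or not secrets.compare_digest(api_key.encode(), API_KEY.encode()):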
        raise HTTPException(
            status_code=403,
            detail="Invalid or missing API key"
        )
    return api_key
@app.get("/")
 def root():
    """Root endpoint with API info"""
    return {
        "service": "Camera Snapshot Server",
        "version": "1.0.0",
        "endpoints": {
            "/snapshot": "GET - Returns JPEG snapshot (requires X-API-Key header)",
            "/health": "GET - Health check (no auth required)"
        }
    }
@app.get("/health")
 def health():
    """Health check endpoint"""
    return {"status": "ok"}
@app.get("/snapshot")
 def get_snapshot(api_key: str = Security(verify_api_key)):
    """
    Get a snapshot from the USB camera.
    Requires X-API-Key header for authentication.
    Returns:
        JPEG image
    """
    snapshot = camera_manager.get_snapshot()
    if snapshot is None:
        raise HTTPException(
            status_code=503,
            detail="Failed to capture snapshot. Check camera connection."
        )
    return Response(
        content=snapshot,
        media_type="image/jpeg",
        headers={
            "Cache-Control": "no-cache, no-store, must-revalidate",
            "Pragma": "no-cache",
            "Expires": "0"
        }
    )
@app.on_event("shutdown")
 def shutdown_event():
    """Cleanup on shutdown"""
    camera_manager.release()
 if __name__ == "__main__":
    import uvicorn
    # For development only - use uvicorn command for production
    uvicorn.run(
        "main:app",
        host="0.0.0.0",
        port=8443,
        ssl_keyfile="ssl/key.pem",
        ssl_certfile="ssl/cert.pem"
    )
--- a/server/requirements.txt
+++ b/server/requirements.txt
@@ -0,0 +1,11 @@
 # Camera Snapshot Server Dependencies
 # Web framework
 fastapi>=0.104.0
 uvicorn[standard]>=0.24.0
 # Camera access
 opencv-python-headless>=4.8.0
 # Configuration
 python-dotenv>=1.0.0
--- a/server/setup.sh
+++ b/server/setup.sh
@@ -0,0 +1,157 @@
 #!/bin/bash
 # vixy-vision Server Setup Script
 # Run this on a Raspberry Pi or similar edge device
 # 
 # Usage: ./setup.sh [--with-audio]
 set -e
 SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 INSTALL_DIR="${HOME}/vixy-vision"
 SERVICE_NAME="vixy-vision"
 # Colors for output
 RED='\033[0;31m'
 GREEN='\033[0;32m'
 YELLOW='\033[1;33m'
 NC='\033[0m' # No Color
 echo_info() { echo -e "${GREEN}[INFO]${NC} $1"; }
 echo_warn() { echo -e "${YELLOW}[WARN]${NC} $1"; }
 echo_error() { echo -e "${RED}[ERROR]${NC} $1"; }
 # Parse arguments
 WITH_AUDIO=false
 for arg in "$@"; do
    case $arg in
        --with-audio)
            WITH_AUDIO=true
            ;;
    esac
 done
 echo "=========================================="
 echo "  vixy-vision Server Setup"
 echo "  Eyes and ears for the fox 🦊"
 echo "=========================================="
 echo ""
 # Check if running on Linux
 if [[ "$(uname)" != "Linux" ]]; then
    echo_error "This script is designed for Linux (Raspberry Pi)"
    exit 1
 fi
 # Install system dependencies
 echo_info "Installing system dependencies..."
 sudo apt-get update
 sudo apt-get install -y python3-pip python3-venv libopencv-dev
 if [ "$WITH_AUDIO" = true ]; then
    echo_info "Installing audio dependencies..."
    sudo apt-get install -y portaudio19-dev python3-pyaudio alsa-utils
 fi
 # Create install directory
 echo_info "Creating install directory: ${INSTALL_DIR}"
 mkdir -p "${INSTALL_DIR}"
 cp -r "${SCRIPT_DIR}"/* "${INSTALL_DIR}/"
 # Create virtual environment
 echo_info "Creating Python virtual environment..."
 cd "${INSTALL_DIR}"
 python3 -m venv venv
 source venv/bin/activate
 # Install Python dependencies
 echo_info "Installing Python dependencies..."
 pip install --upgrade pip
 pip install -r requirements.txt
 if [ "$WITH_AUDIO" = true ]; then
    pip install pyaudio webrtcvad numpy
 fi
 # Generate SSL certificates
 echo_info "Generating SSL certificates..."
 chmod +x generate_cert.sh
 ./generate_cert.sh
 # Generate API key if .env doesn't exist
 if [ ! -f .env ]; then
    echo_info "Generating API key..."
    API_KEY=$(python3 -c 'import secrets; print(secrets.token_urlsafe(32))')
    cat > .env << EOF
 # vixy-vision Server Configuration
 # Generated by setup.sh on $(date)
 # API Key for authentication (keep secret!)
 API_KEY=${API_KEY}
 # Camera settings
 CAMERA_INDEX=0
 CAMERA_WIDTH=1920
 CAMERA_HEIGHT=1080
 JPEG_QUALITY=85
 EOF
    echo_info "API key generated and saved to .env"
    echo ""
    echo_warn "IMPORTANT: Save this API key for your MCP config:"
    echo -e "  ${GREEN}${API_KEY}${NC}"
    echo ""
 else
    echo_info "Using existing .env file"
 fi
 # Create systemd service
 echo_info "Creating systemd service..."
 sudo tee /etc/systemd/system/${SERVICE_NAME}.service > /dev/null << EOF
 [Unit]
 Description=vixy-vision Camera Server
 After=network.target
 [Service]
 Type=simple
 User=${USER}
 WorkingDirectory=${INSTALL_DIR}
 Environment="PATH=${INSTALL_DIR}/venv/bin"
 ExecStart=${INSTALL_DIR}/venv/bin/uvicorn main:app --host 0.0.0.0 --port 8443 --ssl-keyfile ssl/key.pem --ssl-certfile ssl/cert.pem
 Restart=always
 RestartSec=10
 [Install]
 WantedBy=multi-user.target
 EOF
 # Reload systemd and enable service
 sudo systemctl daemon-reload
 sudo systemctl enable ${SERVICE_NAME}
 echo ""
 echo "=========================================="
 echo "  Setup Complete! 🦊"
 echo "=========================================="
 echo ""
 echo "Commands:"
 echo "  Start:   sudo systemctl start ${SERVICE_NAME}"
 echo "  Stop:    sudo systemctl stop ${SERVICE_NAME}"
 echo "  Status:  sudo systemctl status ${SERVICE_NAME}"
 echo "  Logs:    sudo journalctl -u ${SERVICE_NAME} -f"
 echo ""
 echo "Server will be available at:"
 echo "  https://$(hostname -I | awk '{print $1}'):8443/"
 echo ""
 echo "Add to Vixy's vision config (~/.vision_setup.json):"
 echo "  {"
 echo "    \"cameras\": ["
 echo "      {"
 echo "        \"id\": \"$(hostname)\","
 echo "        \"type\": \"http\","
 echo "        \"url\": \"https://$(hostname -I | awk '{print $1}'):8443\","
 echo "        \"api_key\": \"<your-api-key-from-above>\""
 echo "      }"
 echo "    ]"
 echo "  }"
 echo ""
 echo_info "Start the server with: sudo systemctl start ${SERVICE_NAME}"
--- a/shared/README.md
+++ b/shared/README.md
@@ -0,0 +1,18 @@
 # Shared Module
 Common schemas and interfaces used across vixy-vision.
 ## Planned Components
 ### events.py
 Event schema definitions and queue interface.
 ```python
 from dataclasses import dataclass
 from datetime import datetime
@dataclass
 class SensorEvent:
    timestamp: datetime
    source_id: str      # camera/mic ID
    event_type: str     # "motion", "audio", "speech"
    confidence: float   # 0.0 - 1.0
    metadata: dict      # type-specific data
 ```
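As a rough illustration of how such an event might travel between components, the sketch below repeats the schema, adds a hypothetical `to_json()` helper, and uses the standard library's `queue.Queue` as a stand-in for the planned queue interface. The real interface is not defined yet, so the transport and the serialization format are both assumptions.

```python
import json
import queue
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class SensorEvent:
    timestamp: datetime
    source_id: str      # camera/mic ID
    event_type: str     # "motion", "audio", "speech"
    confidence: float   # 0.0 - 1.0
    metadata: dict      # type-specific data

    def to_json(self) -> str:
        """Serialize to JSON with an ISO 8601 timestamp (hypothetical helper)."""
        payload = asdict(self)
        payload["timestamp"] = self.timestamp.isoformat()
        return json.dumps(payload)


# Stand-in for the planned queue interface: an in-process FIFO.
events: "queue.Queue[SensorEvent]" = queue.Queue()
events.put(SensorEvent(
    timestamp=datetime(2025, 12, 16, 21, 26, 26, tzinfo=timezone.utc),
    source_id="basement",
    event_type="motion",
    confidence=0.87,
    metadata={"bbox": [1, 2, 3, 3]},
))
print(events.get().to_json())
```

Keeping the timestamp timezone-aware avoids ambiguity when events from several devices are merged into one journal.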