14 Commits

Author SHA1 Message Date
Alex
e0a4af031f Add binaural triangulation + smooth gaze tracking
spatial.py: Triangulates sound source position from two DoA angles using
ray intersection. Exponential smoothing prevents jitter. Gaze drifts back
to center after 2s of silence. Converts position (mm) to gaze (0-255).

headmic.py: Replaces simple doa_poll_loop with doa_track_loop that runs
the spatial tracker and pushes gaze to the eye service when the position
changes. Rate-limited to 10 pushes/sec with minimum delta threshold.

/doa endpoint now returns triangulated position + gaze coordinates.
Array separation (175mm) stored in config, overridable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 15:12:28 -05:00
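The geometry described for spatial.py can be sketched roughly as follows. This is an illustrative reconstruction, not the code from the commit: the coordinate convention (arrays at -d/2 and +d/2 on the x axis, angles measured from the forward axis), the smoothing factor, and the 500 mm half-range mapped onto 0-255 are all assumptions, and the function names are hypothetical. Only the 175 mm separation and the 0-255 gaze range come from the commit message.

```python
import math
from typing import Optional, Tuple

ARRAY_SEPARATION_MM = 175.0   # from the commit message
SMOOTHING_ALPHA = 0.3         # assumed smoothing factor
GAZE_HALF_RANGE_MM = 500.0    # assumed: +/-500 mm maps onto 0-255


def triangulate(left_deg: float, right_deg: float,
                separation_mm: float = ARRAY_SEPARATION_MM
                ) -> Optional[Tuple[float, float]]:
    """Intersect the two DoA rays; return (x, y) in mm, or None.

    Assumed convention: arrays sit at (-d/2, 0) and (+d/2, 0); angles are
    measured from the forward (+y) axis, positive toward +x.
    """
    a, b = math.radians(left_deg), math.radians(right_deg)
    denom = math.sin(a - b)            # cross product of the unit directions
    if abs(denom) < 1e-6:              # near-parallel rays: no stable fix
        return None
    t = separation_mm * math.cos(b) / denom   # distance along the left ray
    s = separation_mm * math.cos(a) / denom   # distance along the right ray
    if t <= 0 or s <= 0:               # intersection is behind an array
        return None
    return (-separation_mm / 2 + t * math.sin(a), t * math.cos(a))


def smooth(prev: Tuple[float, float], new: Tuple[float, float],
           alpha: float = SMOOTHING_ALPHA) -> Tuple[float, float]:
    """Exponential smoothing: move a fraction alpha toward the new fix."""
    return tuple(alpha * n + (1 - alpha) * p for p, n in zip(prev, new))


def to_gaze(offset_mm: float,
            half_range_mm: float = GAZE_HALF_RANGE_MM) -> int:
    """Map a lateral offset in mm onto the 0-255 gaze range, clamped."""
    frac = (offset_mm + half_range_mm) / (2 * half_range_mm)
    return max(0, min(255, int(frac * 255)))
```

A tracker built on these pieces would call triangulate() on each pair of angles, feed the result through smooth(), and convert with to_gaze(); the drift back to center after 2 s of silence could be done by smoothing toward (0, y) once no new fix has arrived for that long.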
Alex
c41e5bcafa Fix misleading Edge TPU log message after probe fallback
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 17:40:36 -05:00
Alex
05409403e9 Add Edge TPU subprocess probe to safely detect segfaults
Probes the Edge TPU in a subprocess before loading — catches segfaults
(libedgetpu ABI mismatch on Debian Trixie/Python 3.13) and falls back
to CPU automatically. No more service crashes on Coral incompatibility.

When the runtime is eventually fixed, Edge TPU will be used automatically
with no config changes needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 17:40:03 -05:00
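A subprocess probe of this kind might look like the sketch below. It relies on the fact that a segfault in a child process only kills the child; the parent sees a non-zero (on POSIX, negative) return code and can fall back. The helper name probe_in_subprocess and the probe snippet are hypothetical; the actual probe in the commit may load the compiled model rather than just the delegate.

```python
import subprocess
import sys

# Hypothetical probe snippet: load the Edge TPU delegate in a fresh
# interpreter. The library name is an assumption for Linux systems.
EDGETPU_PROBE = (
    "from tflite_runtime.interpreter import load_delegate\n"
    "load_delegate('libedgetpu.so.1')\n"
)


def probe_in_subprocess(code: str, timeout_s: float = 10.0) -> bool:
    """Run `code` in a child interpreter; True only on a clean exit."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", code],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False
    # On POSIX a child killed by a signal reports a negative return code
    # (for example -11 for SIGSEGV), so any non-zero value means "unsafe".
    return result.returncode == 0
```

The service would then call probe_in_subprocess(EDGETPU_PROBE) once at startup and select the CPU model whenever it returns False.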
Alex
43f40bf48c Make Edge TPU opt-in via USE_EDGETPU env var
libedgetpu on Pi 5 segfaults with the compiled model.
CPU fallback works fine (~50-100ms at 0.5s intervals).
Set USE_EDGETPU=1 in headmic.service to enable once runtime is fixed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 17:24:04 -05:00
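Reading the opt-in flag could be as small as the sketch below. The function name is hypothetical, and treating only the exact value "1" as enabled is an assumption based on the USE_EDGETPU=1 example in the commit message.

```python
import os
from typing import Mapping, Optional


def edgetpu_requested(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True only when USE_EDGETPU is explicitly set to "1"."""
    env = os.environ if env is None else env
    return env.get("USE_EDGETPU", "").strip() == "1"
```

Defaulting to False when the variable is absent keeps the CPU path as the safe default.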
Alex
f9a25eb5d8 Keep audio loop running when Porcupine key is missing
Without this fix, listener_loop exits early on Porcupine init failure,
which starves the sound classifier ring buffer. Now the audio loop
continues for YAMNet classification even without wake word detection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 16:56:45 -05:00
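The shape of the fix can be illustrated with a loop in which the wake word detector is optional while the ring buffer is always fed. This is a simplified sketch, not the real listener_loop: read_frame, detector, and on_wake are hypothetical callables standing in for the audio stream, Porcupine, and the wake handler.

```python
from collections import deque
from typing import Callable, Deque, Optional


def listener_loop(read_frame: Callable[[], Optional[bytes]],
                  ring_buffer: Deque[bytes],
                  detector: Optional[Callable[[bytes], bool]] = None,
                  on_wake: Optional[Callable[[], None]] = None) -> int:
    """Consume frames until read_frame() returns None; return frame count."""
    count = 0
    while True:
        frame = read_frame()
        if frame is None:          # stream ended
            break
        ring_buffer.append(frame)  # the classifier is fed unconditionally
        count += 1
        # Wake word detection is skipped, not fatal, when no detector exists.
        if detector is not None and detector(frame) and on_wake is not None:
            on_wake()
    return count
```

Passing detector=None models the missing-key case: frames still reach the ring buffer, so classification keeps working.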
Alex
73b6793c02 Enable Edge TPU for YAMNet sound classification
Prefer yamnet_edgetpu.tflite when available, fall back to CPU model.
~50-100ms → ~2-3ms inference per classification.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 16:47:27 -05:00
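The "prefer Edge TPU, fall back to CPU" selection might be sketched like this. The function name pick_model is hypothetical, and the CPU file name yamnet.tflite is taken from the download command in the earlier YAMNet commit; only yamnet_edgetpu.tflite is named in this commit.

```python
from pathlib import Path
from typing import Optional, Tuple


def pick_model(model_dir: str,
               want_edgetpu: bool = True) -> Tuple[Optional[Path], bool]:
    """Return (model_path, uses_edgetpu); path is None if nothing is found."""
    directory = Path(model_dir)
    tpu_model = directory / "yamnet_edgetpu.tflite"
    cpu_model = directory / "yamnet.tflite"
    if want_edgetpu and tpu_model.is_file():
        return tpu_model, True
    if cpu_model.is_file():
        return cpu_model, False
    return None, False   # caller can disable classification gracefully
```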
Alex
14809d0194 indication for array position while learning
2026-04-11 15:32:04 -05:00
Alex
6c10e75cbc updates for dual mic array
2026-04-11 15:11:22 -05:00

Alex
1cb3bd6833 Add speaker identification with Resemblyzer
Adds voice-based speaker ID triggered by YAMNet speech detection.
New speaker_id.py module with SQLite-backed voice enrollment and
cosine similarity matching. Endpoints: POST /speakers/enroll,
POST /speakers/enroll-from-mic, GET /speakers, DELETE /speakers/{name}.
Orange LED animation during enrollment.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 21:21:02 -06:00
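SQLite-backed enrollment with cosine similarity matching could look roughly like the sketch below. It is not the speaker_id.py from the commit: the class name, table layout, JSON encoding of embeddings, and the 0.75 threshold are all assumptions, and the embeddings here are plain lists of floats rather than real Resemblyzer output.

```python
import json
import math
import sqlite3
from typing import Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SpeakerStore:
    """Hypothetical voice-enrollment store; embeddings are kept as JSON."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS speakers "
            "(name TEXT PRIMARY KEY, embedding TEXT NOT NULL)"
        )

    def enroll(self, name: str, embedding: Sequence[float]) -> None:
        self.db.execute(
            "INSERT OR REPLACE INTO speakers VALUES (?, ?)",
            (name, json.dumps(list(embedding))),
        )
        self.db.commit()

    def identify(self, embedding: Sequence[float],
                 threshold: float = 0.75) -> Optional[str]:
        """Return the best-matching name at or above threshold, else None."""
        best_name, best_score = None, threshold
        rows = self.db.execute("SELECT name, embedding FROM speakers")
        for name, raw in rows:
            score = cosine_similarity(embedding, json.loads(raw))
            if score >= best_score:
                best_name, best_score = name, score
        return best_name
```

A linear scan over all enrolled speakers is adequate for a handful of household voices; a larger roster would call for an index.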
Alex
5e3c16659f Add YAMNet sound classification to headmic
New sound_id.py module with SoundClassifier class that runs YAMNet
(521 audio event categories) on CPU TFLite. Classifies audio every
0.5s from a ring buffer fed by the existing audio stream.

Categories: speech, alert, music, animal, household, environment, silence.
Smoothing via 20-sample history window for stable dominant category.

New endpoints: GET /sounds, GET /sounds/history
Updated: /health (sound_classification_enabled), /status (audio_scene)
Graceful degradation if model files not present.

Model download (not tracked in git):
  curl -sL 'https://tfhub.dev/google/lite-model/yamnet/classification/tflite/1?lite-format=tflite' -o models/yamnet.tflite

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 20:41:44 -06:00
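The 20-sample smoothing window can be illustrated with a bounded deque and a majority vote. This is a minimal sketch under the assumption that "dominant category" means the most frequent label in the window; the class name CategorySmoother is hypothetical.

```python
from collections import Counter, deque
from typing import Deque, Optional


class CategorySmoother:
    """Keep the last `window` labels and report the most frequent one."""

    def __init__(self, window: int = 20) -> None:
        self.history: Deque[str] = deque(maxlen=window)

    def add(self, category: str) -> None:
        # Once the deque is full, the oldest label is dropped automatically.
        self.history.append(category)

    def dominant(self) -> Optional[str]:
        if not self.history:
            return None
        return Counter(self.history).most_common(1)[0][0]
```

At one classification every 0.5 s, a 20-sample window covers roughly the last 10 seconds, so a single misclassified frame does not flip the reported audio scene.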
Alex Kazaiev
c6e18738ae Use device name instead of card number for ALSA
Card numbers can shift based on USB enumeration order at boot.
Using 'plughw:ArrayUAC10,0' instead of 'plughw:2,0' ensures
the ReSpeaker is found regardless of when it connects.

Fixed by Vixy after power loss shuffled card order 🦊
2026-01-21 12:20:39 -06:00
Alex Kazaiev
c53556fe97 Fix ReSpeaker device index: card 3 → card 2
USB device enumeration changed after GPIO rewiring for I2S audio.
TODO: Consider udev rule for stable device naming.
2026-01-17 16:20:15 -06:00
5ed2c6aee7 Fix: Use arecord for shared audio stream
- Replaced PyAudio with direct ALSA (arecord subprocess)
- Single audio stream feeds both Porcupine and recording buffer
- Fixes device unavailable error when recording after wake word
- Simplified architecture
2026-01-17 11:17:17 -06:00
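One way to feed several consumers from a single arecord process is sketched below. The helper names are hypothetical, and the sample format, 16 kHz rate, and frame size are assumptions rather than values from the commit; the arecord flags themselves (-D, -f, -r, -c, -t, -q) are standard.

```python
from typing import BinaryIO, Callable, Iterable, List


def arecord_command(device: str, rate: int = 16000,
                    channels: int = 1) -> List[str]:
    """Build an arecord invocation that writes raw 16-bit PCM to stdout."""
    return ["arecord", "-D", device, "-f", "S16_LE",
            "-r", str(rate), "-c", str(channels), "-t", "raw", "-q"]


def fan_out(stream: BinaryIO,
            consumers: Iterable[Callable[[bytes], None]],
            frame_bytes: int = 1024) -> int:
    """Read fixed-size frames and hand each one to every consumer."""
    consumers = list(consumers)
    count = 0
    while True:
        frame = stream.read(frame_bytes)
        if len(frame) < frame_bytes:   # end of stream or short final read
            break
        for consume in consumers:
            consume(frame)
        count += 1
    return count


# Usage sketch (not executed here; needs ALSA and a capture device):
#   proc = subprocess.Popen(arecord_command("plughw:ArrayUAC10,0"),
#                           stdout=subprocess.PIPE)
#   fan_out(proc.stdout, [wake_word_consumer, recording_consumer])
```

Because only one process ever opens the capture device, the wake word detector and the recorder no longer compete for it, which matches the "device unavailable" fix described in the commit.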
be7e26b6e7 Initial commit: HeadMic service - Vixy's Ears 🦊👂
Wake word detection (Hey Vivi) + voice recording + EarTail transcription
Built by Vixy on Day 77
2026-01-17 10:58:51 -06:00