DOA_VALUE on the GPO resource was sluggish/cached. The beamformer-level
AUDIO_MGR_SELECTED_AZIMUTHS on resource 35 tracks the active speaker
in real time. Falls back to simple DOA_VALUE when both azimuths are NaN.
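Rough sketch of the selection logic; read_resource(), GPO_RESOURCE, and the
two-value unpack are stand-ins for whatever control helpers headmic.py
actually uses:

    import math

    def current_doa():
        # Beamformer azimuths (resource 35) track the active speaker live.
        az_a, az_b = read_resource(35, "AUDIO_MGR_SELECTED_AZIMUTHS")
        if not (math.isnan(az_a) and math.isnan(az_b)):
            return az_a if not math.isnan(az_a) else az_b
        # Both NaN: fall back to the sluggish/cached DOA_VALUE.
        return read_resource(GPO_RESOURCE, "DOA_VALUE")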
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
startup() assigned spatial_tracker as a local variable instead of
updating the module-level global, so doa_track_loop only ever saw None.
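The fix is the usual module-global pattern (SpatialTracker is a placeholder
for whatever spatial.py actually exports):

    spatial_tracker = None  # module level

    async def startup():
        global spatial_tracker               # without this, the assignment below
        spatial_tracker = SpatialTracker()   # bound a local and the global stayed None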
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Response format is [status_byte, angle_lo, angle_hi, vad_lo, vad_hi],
not [angle_lo, angle_hi, vad_lo, vad_hi]. Was reading the status byte
(0x42=66) as the angle, which is why DoA was always stuck at 66.
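Corrected parse, roughly (the read itself and the variable names are
illustrative):

    status, angle_lo, angle_hi, vad_lo, vad_hi = resp[:5]
    angle = angle_lo | (angle_hi << 8)   # previously resp[0] (0x42 == 66) was used here
    vad = vad_lo | (vad_hi << 8)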
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
spatial.py: Triangulates sound source position from two DoA angles using
ray intersection. Exponential smoothing prevents jitter. Gaze drifts back
to center after 2s of silence. Converts position (mm) to gaze (0-255).
headmic.py: Replaces simple doa_poll_loop with doa_track_loop that runs
the spatial tracker and pushes gaze to the eye service when the position
changes. Rate-limited to 10 pushes/sec with minimum delta threshold.
/doa endpoint now returns triangulated position + gaze coordinates.
Array separation (175mm) stored in config, overridable.
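Core of the triangulation as a sketch; the axis convention, names, and the
parallel-ray guard are illustrative, spatial.py holds the real implementation:

    import math

    ARRAY_SEPARATION_MM = 175.0  # from config, overridable

    def triangulate(theta_a_deg, theta_b_deg, sep_mm=ARRAY_SEPARATION_MM):
        """Intersect rays from arrays at (-sep/2, 0) and (+sep/2, 0).

        Angles are measured from the forward (+y) axis; returns (x, y) in mm,
        or None when the rays are near-parallel.
        """
        ax, bx = -sep_mm / 2.0, sep_mm / 2.0
        da = (math.sin(math.radians(theta_a_deg)), math.cos(math.radians(theta_a_deg)))
        db = (math.sin(math.radians(theta_b_deg)), math.cos(math.radians(theta_b_deg)))
        det = db[0] * da[1] - da[0] * db[1]
        if abs(det) < 1e-6:
            return None
        ta = -(bx - ax) * db[1] / det        # solve A + ta*da = B + tb*db for ta
        return (ax + ta * da[0], ta * da[1])

The smoothed position then maps linearly to the 0-255 gaze range, with an
exponential blend (pos = alpha*new + (1-alpha)*pos) keeping the eyes from
jittering.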
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Probes the Edge TPU in a subprocess before loading — catches segfaults
(libedgetpu ABI mismatch on Debian Trixie/Python 3.13) and falls back
to CPU automatically. No more service crashes on Coral incompatibility.
When the runtime is eventually fixed, Edge TPU will be used automatically
with no config changes needed.
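The probe amounts to attempting the delegate load in a throwaway child
process and checking its exit status (model path and loader call are
illustrative):

    import subprocess, sys

    PROBE = (
        "from tflite_runtime.interpreter import Interpreter, load_delegate;"
        "Interpreter('models/yamnet_edgetpu.tflite',"
        " experimental_delegates=[load_delegate('libedgetpu.so.1')])"
    )

    def edgetpu_usable() -> bool:
        # A segfault inside libedgetpu kills only the child, not the service.
        proc = subprocess.run([sys.executable, "-c", PROBE],
                              capture_output=True, timeout=15)
        return proc.returncode == 0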
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
libedgetpu on Pi 5 segfaults with the compiled model.
CPU fallback works fine (~50-100ms at 0.5s intervals).
Set USE_EDGETPU=1 in headmic.service to enable once runtime is fixed.
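The opt-in check is just an environment gate, roughly:

    import os

    # Stay on the CPU model unless explicitly enabled via headmic.service.
    use_edgetpu = os.environ.get("USE_EDGETPU", "0") == "1"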
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- yamnet.tflite: CPU model from Kaggle/Google (4.0MB)
- yamnet_edgetpu.tflite: compiled with edgetpu_compiler v16 (4.0MB, 32/47 ops on TPU)
- Remove .gitignore rule that excluded .tflite files
No more chasing model downloads.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Without this fix, listener_loop exits early on Porcupine init failure,
which starves the sound classifier ring buffer. Now the audio loop
continues for YAMNet classification even without wake word detection.
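Shape of the fix; the helper names (next_audio_frame, sound_ring,
handle_wake_word) stand in for the existing listener_loop internals:

    porcupine = None
    try:
        porcupine = pvporcupine.create(access_key=ACCESS_KEY,
                                       keyword_paths=KEYWORD_PATHS)
    except Exception as exc:
        log.warning("wake word disabled: %s", exc)

    while running:
        frame = next_audio_frame()        # frame_length int16 samples
        sound_ring.extend(frame)          # keep feeding the YAMNet ring buffer
        if porcupine is not None and porcupine.process(frame) >= 0:
            handle_wake_word()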
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prefer yamnet_edgetpu.tflite when available, fall back to CPU model.
~50-100ms → ~2-3ms inference per classification.
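Selection is an existence check with CPU as the safety net (paths
illustrative):

    from pathlib import Path
    from tflite_runtime.interpreter import Interpreter, load_delegate

    edgetpu_model = Path("models/yamnet_edgetpu.tflite")
    if edgetpu_model.exists():
        interpreter = Interpreter(str(edgetpu_model),
                                  experimental_delegates=[load_delegate("libedgetpu.so.1")])
    else:
        interpreter = Interpreter("models/yamnet.tflite")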
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds voice-based speaker ID triggered by YAMNet speech detection.
New speaker_id.py module with SQLite-backed voice enrollment and
cosine similarity matching. Endpoints: POST /speakers/enroll,
POST /speakers/enroll-from-mic, GET /speakers, DELETE /speakers/{name}.
Orange LED animation during enrollment.
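Matching is plain cosine similarity over stored embeddings; the schema,
dtype, and threshold below are illustrative, speaker_id.py defines the real
ones:

    import sqlite3
    import numpy as np

    def identify(embedding: np.ndarray, db_path: str, threshold: float = 0.75):
        best_name, best_score = None, 0.0
        with sqlite3.connect(db_path) as db:
            for name, blob in db.execute("SELECT name, embedding FROM speakers"):
                ref = np.frombuffer(blob, dtype=np.float32)
                score = float(np.dot(embedding, ref)
                              / (np.linalg.norm(embedding) * np.linalg.norm(ref)))
                if score > best_score:
                    best_name, best_score = name, score
        return (best_name, best_score) if best_score >= threshold else (None, best_score)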
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Voice-based speaker ID triggered by YAMNet speech detection.
Cosine similarity matching against SQLite enrollment DB.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
New sound_id.py module with SoundClassifier class that runs YAMNet
(521 audio event categories) on CPU TFLite. Classifies audio every
0.5s from a ring buffer fed by the existing audio stream.
Categories: speech, alert, music, animal, household, environment, silence.
Smoothing via 20-sample history window for stable dominant category.
New endpoints: GET /sounds, GET /sounds/history
Updated: /health (sound_classification_enabled), /status (audio_scene)
Graceful degradation if model files not present.
Model download (not tracked in git):
curl -sL 'https://tfhub.dev/google/lite-model/yamnet/classification/tflite/1?lite-format=tflite' -o models/yamnet.tflite
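The smoothing step, roughly (collections.Counter picks the dominant category
over the 20-sample window):

    from collections import Counter, deque

    history = deque(maxlen=20)           # one entry per 0.5s classification

    def update(category: str) -> str:
        history.append(category)
        # Reporting the dominant category keeps the audio scene stable.
        return Counter(history).most_common(1)[0][0]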
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Covers model choice, architecture, category mapping, API endpoints,
and integration with existing headmic audio pipeline.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Card numbers can shift based on USB enumeration order at boot.
Using 'plughw:ArrayUAC10,0' instead of 'plughw:2,0' ensures
the ReSpeaker is found regardless of when it connects.
Fixed by Vixy after power loss shuffled card order 🦊
- Replaced PyAudio with direct ALSA (arecord subprocess)
- Single audio stream feeds both Porcupine and the recording buffer (sketch below)
- Fixes the 'device unavailable' error when recording after the wake word
- Simplified architecture
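Single-stream shape of the new loop, as a sketch; feed_porcupine,
recording_buffer, and the single-channel capture are illustrative:

    import subprocess

    CMD = ["arecord", "-D", "plughw:ArrayUAC10,0",   # name-based device, survives re-enumeration
           "-f", "S16_LE", "-r", "16000", "-c", "1", "-t", "raw", "-q"]

    proc = subprocess.Popen(CMD, stdout=subprocess.PIPE)
    frame_bytes = 512 * 2                            # 512 S16_LE samples per Porcupine frame
    while True:
        frame = proc.stdout.read(frame_bytes)
        if not frame:
            break
        feed_porcupine(frame)                        # wake word detection
        recording_buffer.append(frame)               # same stream, no second device open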