DOA_VALUE on the GPO resource was sluggish/cached. The beamformer-level
AUDIO_MGR_SELECTED_AZIMUTHS on resource 35 tracks the active speaker
in real time. Falls back to simple DOA_VALUE when both azimuths are NaN.
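Rough sketch of the selection logic; read_resource(), GPO_RESOURCE, and the
two-value unpack are stand-ins for whatever control helpers headmic.py
actually uses:

    import math

    def current_doa():
        # Beamformer azimuths (resource 35) track the active speaker live.
        az_a, az_b = read_resource(35, "AUDIO_MGR_SELECTED_AZIMUTHS")
        if not (math.isnan(az_a) and math.isnan(az_b)):
            return az_a if not math.isnan(az_a) else az_b
        # Both NaN: fall back to the sluggish/cached DOA_VALUE.
        return read_resource(GPO_RESOURCE, "DOA_VALUE")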
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
startup() assigned spatial_tracker as a local variable instead of
updating the module-level global, so doa_track_loop only ever saw None.
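The fix is the usual module-global pattern (SpatialTracker is a placeholder
for whatever spatial.py actually exports):

    spatial_tracker = None  # module level

    async def startup():
        global spatial_tracker               # without this, the assignment below
        spatial_tracker = SpatialTracker()   # bound a local and the global stayed None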
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Response format is [status_byte, angle_lo, angle_hi, vad_lo, vad_hi],
not [angle_lo, angle_hi, vad_lo, vad_hi]. Was reading the status byte
(0x42=66) as the angle, which is why DoA was always stuck at 66.
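Corrected parse, roughly (the read itself and the variable names are
illustrative):

    status, angle_lo, angle_hi, vad_lo, vad_hi = resp[:5]
    angle = angle_lo | (angle_hi << 8)   # previously resp[0] (0x42 == 66) was used here
    vad = vad_lo | (vad_hi << 8)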
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
spatial.py: Triangulates sound source position from two DoA angles using
ray intersection. Exponential smoothing prevents jitter. Gaze drifts back
to center after 2s of silence. Converts position (mm) to gaze (0-255).
headmic.py: Replaces simple doa_poll_loop with doa_track_loop that runs
the spatial tracker and pushes gaze to the eye service when the position
changes. Rate-limited to 10 pushes/sec with minimum delta threshold.
/doa endpoint now returns triangulated position + gaze coordinates.
Array separation (175mm) stored in config, overridable.
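Core of the triangulation as a sketch; the axis convention, names, and the
parallel-ray guard are illustrative, spatial.py holds the real implementation:

    import math

    ARRAY_SEPARATION_MM = 175.0  # from config, overridable

    def triangulate(theta_a_deg, theta_b_deg, sep_mm=ARRAY_SEPARATION_MM):
        """Intersect rays from arrays at (-sep/2, 0) and (+sep/2, 0).

        Angles are measured from the forward (+y) axis; returns (x, y) in mm,
        or None when the rays are near-parallel.
        """
        ax, bx = -sep_mm / 2.0, sep_mm / 2.0
        da = (math.sin(math.radians(theta_a_deg)), math.cos(math.radians(theta_a_deg)))
        db = (math.sin(math.radians(theta_b_deg)), math.cos(math.radians(theta_b_deg)))
        det = db[0] * da[1] - da[0] * db[1]
        if abs(det) < 1e-6:
            return None
        ta = -(bx - ax) * db[1] / det        # solve A + ta*da = B + tb*db for ta
        return (ax + ta * da[0], ta * da[1])

The smoothed position then maps linearly to the 0-255 gaze range, with an
exponential blend (pos = alpha*new + (1-alpha)*pos) keeping the eyes from
jittering.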
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Probes the Edge TPU in a subprocess before loading — catches segfaults
(libedgetpu ABI mismatch on Debian Trixie/Python 3.13) and falls back
to CPU automatically. No more service crashes on Coral incompatibility.
When the runtime is eventually fixed, Edge TPU will be used automatically
with no config changes needed.
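The probe amounts to attempting the delegate load in a throwaway child
process and checking its exit status (model path and loader call are
illustrative):

    import subprocess, sys

    PROBE = (
        "from tflite_runtime.interpreter import Interpreter, load_delegate;"
        "Interpreter('models/yamnet_edgetpu.tflite',"
        " experimental_delegates=[load_delegate('libedgetpu.so.1')])"
    )

    def edgetpu_usable() -> bool:
        # A segfault inside libedgetpu kills only the child, not the service.
        proc = subprocess.run([sys.executable, "-c", PROBE],
                              capture_output=True, timeout=15)
        return proc.returncode == 0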
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
libedgetpu on Pi 5 segfaults with the compiled model.
CPU fallback works fine (~50-100ms at 0.5s intervals).
Set USE_EDGETPU=1 in headmic.service to enable once runtime is fixed.
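The opt-in check is just an environment gate, roughly:

    import os

    # Stay on the CPU model unless explicitly enabled via headmic.service.
    use_edgetpu = os.environ.get("USE_EDGETPU", "0") == "1"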
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- yamnet.tflite: CPU model from Kaggle/Google (4.0MB)
- yamnet_edgetpu.tflite: compiled with edgetpu_compiler v16 (4.0MB, 32/47 ops on TPU)
- Remove .gitignore rule that excluded .tflite files
No more chasing model downloads.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Without this fix, listener_loop exits early on Porcupine init failure,
which starves the sound classifier ring buffer. Now the audio loop
continues for YAMNet classification even without wake word detection.
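Shape of the fix; the helper names (next_audio_frame, sound_ring,
handle_wake_word) stand in for the existing listener_loop internals:

    porcupine = None
    try:
        porcupine = pvporcupine.create(access_key=ACCESS_KEY,
                                       keyword_paths=KEYWORD_PATHS)
    except Exception as exc:
        log.warning("wake word disabled: %s", exc)

    while running:
        frame = next_audio_frame()        # frame_length int16 samples
        sound_ring.extend(frame)          # keep feeding the YAMNet ring buffer
        if porcupine is not None and porcupine.process(frame) >= 0:
            handle_wake_word()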
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Prefer yamnet_edgetpu.tflite when available, fall back to CPU model.
~50-100ms → ~2-3ms inference per classification.
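Selection is an existence check with CPU as the safety net (paths
illustrative):

    from pathlib import Path
    from tflite_runtime.interpreter import Interpreter, load_delegate

    edgetpu_model = Path("models/yamnet_edgetpu.tflite")
    if edgetpu_model.exists():
        interpreter = Interpreter(str(edgetpu_model),
                                  experimental_delegates=[load_delegate("libedgetpu.so.1")])
    else:
        interpreter = Interpreter("models/yamnet.tflite")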
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds voice-based speaker ID triggered by YAMNet speech detection.
New speaker_id.py module with SQLite-backed voice enrollment and
cosine similarity matching. Endpoints: POST /speakers/enroll,
POST /speakers/enroll-from-mic, GET /speakers, DELETE /speakers/{name}.
Orange LED animation during enrollment.
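Matching is plain cosine similarity over stored embeddings; the schema,
dtype, and threshold below are illustrative, speaker_id.py defines the real
ones:

    import sqlite3
    import numpy as np

    def identify(embedding: np.ndarray, db_path: str, threshold: float = 0.75):
        best_name, best_score = None, 0.0
        with sqlite3.connect(db_path) as db:
            for name, blob in db.execute("SELECT name, embedding FROM speakers"):
                ref = np.frombuffer(blob, dtype=np.float32)
                score = float(np.dot(embedding, ref)
                              / (np.linalg.norm(embedding) * np.linalg.norm(ref)))
                if score > best_score:
                    best_name, best_score = name, score
        return (best_name, best_score) if best_score >= threshold else (None, best_score)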
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Voice-based speaker ID triggered by YAMNet speech detection.
Cosine similarity matching against SQLite enrollment DB.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
New sound_id.py module with SoundClassifier class that runs YAMNet
(521 audio event categories) on CPU TFLite. Classifies audio every
0.5s from a ring buffer fed by the existing audio stream.
Categories: speech, alert, music, animal, household, environment, silence.
Smoothing via 20-sample history window for stable dominant category.
New endpoints: GET /sounds, GET /sounds/history
Updated: /health (sound_classification_enabled), /status (audio_scene)
Graceful degradation if model files not present.
Model download (not tracked in git):
curl -sL 'https://tfhub.dev/google/lite-model/yamnet/classification/tflite/1?lite-format=tflite' -o models/yamnet.tflite
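The smoothing step, roughly (collections.Counter picks the dominant category
over the 20-sample window):

    from collections import Counter, deque

    history = deque(maxlen=20)           # one entry per 0.5s classification

    def update(category: str) -> str:
        history.append(category)
        # Reporting the dominant category keeps the audio scene stable.
        return Counter(history).most_common(1)[0][0]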
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Covers model choice, architecture, category mapping, API endpoints,
and integration with existing headmic audio pipeline.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Card numbers can shift based on USB enumeration order at boot.
Using 'plughw:ArrayUAC10,0' instead of 'plughw:2,0' ensures
the ReSpeaker is found regardless of when it connects.
Fixed by Vixy after power loss shuffled card order 🦊
- Replaced PyAudio with direct ALSA (arecord subprocess)
- Single audio stream feeds both Porcupine and the recording buffer (sketch below)
- Fixes the 'device unavailable' error when recording after the wake word
- Simplified architecture
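Single-stream shape of the new loop, as a sketch; feed_porcupine,
recording_buffer, and the single-channel capture are illustrative:

    import subprocess

    CMD = ["arecord", "-D", "plughw:ArrayUAC10,0",   # name-based device, survives re-enumeration
           "-f", "S16_LE", "-r", "16000", "-c", "1", "-t", "raw", "-q"]

    proc = subprocess.Popen(CMD, stdout=subprocess.PIPE)
    frame_bytes = 512 * 2                            # 512 S16_LE samples per Porcupine frame
    while True:
        frame = proc.stdout.read(frame_bytes)
        if not frame:
            break
        feed_porcupine(frame)                        # wake word detection
        recording_buffer.append(frame)               # same stream, no second device open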