Commit Graph

21 Commits

Author SHA1 Message Date
Alex
43f40bf48c Make Edge TPU opt-in via USE_EDGETPU env var
libedgetpu on Pi 5 segfaults with the compiled model.
CPU fallback works fine (~50-100 ms per inference, one inference every 0.5 s).
Set USE_EDGETPU=1 in headmic.service to enable once runtime is fixed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 17:24:04 -05:00
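
The opt-in behaviour described in 43f40bf48c could look roughly like the sketch below. It is an illustration only, not the repository's code: the function name `choose_model`, the `models/` paths, and the rule that only the value `1` enables the TPU are assumptions.

```python
import os

CPU_MODEL = "models/yamnet.tflite"              # assumed path
EDGETPU_MODEL = "models/yamnet_edgetpu.tflite"  # assumed path


def choose_model(env=None, exists=os.path.exists):
    """Return (model_path, use_edgetpu) for the classifier.

    The Edge TPU model is chosen only when USE_EDGETPU=1 is set and the
    compiled model file is present; otherwise the CPU model is returned.
    """
    env = os.environ if env is None else env
    opted_in = env.get("USE_EDGETPU", "") == "1"
    if opted_in and exists(EDGETPU_MODEL):
        return EDGETPU_MODEL, True
    return CPU_MODEL, False
```

In a systemd unit such as headmic.service, the variable would typically be set with an `Environment=USE_EDGETPU=1` line.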
Alex
c96d6958a3 Add YAMNet models (CPU + Edge TPU compiled) to version control
- yamnet.tflite: CPU model from Kaggle/Google (4.0MB)
- yamnet_edgetpu.tflite: compiled with edgetpu_compiler v16 (4.0MB, 32/47 ops on TPU)
- Remove .gitignore rule that excluded .tflite files

No more chasing model downloads.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 17:22:45 -05:00
Alex
f9a25eb5d8 Keep audio loop running when Porcupine key is missing
Without this fix, listener_loop exits early on Porcupine init failure,
which starves the sound classifier ring buffer. Now the audio loop
continues for YAMNet classification even without wake word detection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 16:56:45 -05:00
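
A minimal sketch of the idea behind f9a25eb5d8, assuming the loop receives frames from some iterable and that wake word detection is an optional callable. The parameter names `frames`, `ring_buffer`, `wake_detector`, and `on_wake` are hypothetical, and the real `listener_loop` may be structured differently.

```python
from collections import deque


def listener_loop(frames, ring_buffer, wake_detector=None, on_wake=None):
    """Consume audio frames; wake word detection is optional.

    Every frame is appended to ring_buffer, so the sound classifier keeps
    receiving data even when wake_detector is None (for example, when the
    Porcupine key is missing and the detector could not be created).
    """
    for frame in frames:
        ring_buffer.append(frame)      # always feed the classifier
        if wake_detector is None:
            continue                   # no wake word support, keep looping
        if wake_detector(frame) and on_wake is not None:
            on_wake()


# Usage: without a detector the buffer still fills.
buffer = deque(maxlen=4)
listener_loop([b"a", b"b", b"c"], buffer)
```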
Alex
73b6793c02 Enable Edge TPU for YAMNet sound classification
Prefer yamnet_edgetpu.tflite when available, fall back to CPU model.
~50-100ms → ~2-3ms inference per classification.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 16:47:27 -05:00
Alex
f41b852b5d fixing leds 2026-04-11 16:28:15 -05:00
Alex
3b4799069d fixing leds 2026-04-11 16:06:40 -05:00
Alex
46ace966bc fixing leds 2026-04-11 16:05:27 -05:00
Alex
2f7b45fa45 fixing leds 2026-04-11 15:58:56 -05:00
Alex
10e39dd0f1 fix leds 2026-04-11 15:51:24 -05:00
Alex
14809d0194 indication for array position while learning 2026-04-11 15:32:04 -05:00
Alex
81e9b12349 service should use venv 2026-04-11 15:27:12 -05:00
Alex
6c10e75cbc updates for dual mic array 2026-04-11 15:11:22 -05:00
Alex
1cb3bd6833 Add speaker identification with Resemblyzer
Adds voice-based speaker ID triggered by YAMNet speech detection.
New speaker_id.py module with SQLite-backed voice enrollment and
cosine similarity matching. Endpoints: POST /speakers/enroll,
POST /speakers/enroll-from-mic, GET /speakers, DELETE /speakers/{name}.
Orange LED animation during enrollment.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 21:21:02 -06:00
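
The cosine similarity matching mentioned in 1cb3bd6833 can be sketched as follows. This is a simplified illustration in plain Python: short lists stand in for the voice embeddings, the enrollment store is a dict rather than SQLite, and the function names and the 0.75 threshold are assumptions, not values taken from speaker_id.py.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(embedding, enrolled, threshold=0.75):
    """Return (name, score) for the best match, or (None, score) if the
    best score is below the threshold (threshold value is illustrative)."""
    best_name, best_score = None, -1.0
    for name, reference in enrolled.items():
        score = cosine_similarity(embedding, reference)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Usage with toy two-dimensional "embeddings".
enrolled = {"alex": [1.0, 0.0], "sam": [0.0, 1.0]}
name, score = identify([0.9, 0.1], enrolled)
```

Returning `None` below the threshold lets the caller treat an unfamiliar voice as unknown instead of forcing the nearest enrolled name.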
Alex
0607be3db5 Add design doc for speaker identification with Resemblyzer
Voice-based speaker ID triggered by YAMNet speech detection.
Cosine similarity matching against SQLite enrollment DB.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 21:16:09 -06:00
Alex
a8e3f24a54 Add indoor/outdoor scene classes to environment category
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 20:43:23 -06:00
Alex
5e3c16659f Add YAMNet sound classification to headmic
New sound_id.py module with SoundClassifier class that runs YAMNet
(521 audio event categories) on CPU TFLite. Classifies audio every
0.5s from a ring buffer fed by the existing audio stream.

Categories: speech, alert, music, animal, household, environment, silence.
Smoothing via 20-sample history window for stable dominant category.

New endpoints: GET /sounds, GET /sounds/history
Updated: /health (sound_classification_enabled), /status (audio_scene)
Degrades gracefully if the model files are not present.

Model download (not tracked in git):
  curl -sL 'https://tfhub.dev/google/lite-model/yamnet/classification/tflite/1?lite-format=tflite' -o models/yamnet.tflite

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 20:41:44 -06:00
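
The 20-sample smoothing described in 5e3c16659f might be implemented along these lines. The class name `CategorySmoother` and its methods are hypothetical; the sketch only assumes that each classification yields one category string and that the dominant category is the most frequent one in the window.

```python
from collections import Counter, deque


class CategorySmoother:
    """Keep the last `window` categories and report the most frequent one."""

    def __init__(self, window=20):
        self.history = deque(maxlen=window)  # oldest entries drop off automatically

    def update(self, category):
        """Record a new classification and return the current dominant category."""
        self.history.append(category)
        return self.dominant()

    def dominant(self):
        """Most frequent category in the window, or None if nothing was recorded."""
        if not self.history:
            return None
        return Counter(self.history).most_common(1)[0][0]
```

At one classification every 0.5 s, a 20-sample window covers roughly the last 10 seconds of audio, so a single stray result does not flip the reported scene.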
Alex
22aae40d17 Add design doc for YAMNet sound identification on Coral Edge TPU
Covers model choice, architecture, category mapping, API endpoints,
and integration with existing headmic audio pipeline.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 20:04:31 -06:00
Alex Kazaiev
c6e18738ae Use device name instead of card number for ALSA
Card numbers can shift based on USB enumeration order at boot.
Using 'plughw:ArrayUAC10,0' instead of 'plughw:2,0' ensures
the ReSpeaker is found regardless of when it connects.

Fixed by Vixy after power loss shuffled card order 🦊
2026-01-21 12:20:39 -06:00
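
For illustration, a capture command that addresses the card by name, as in c6e18738ae, could be built like this. The helper name `arecord_command` and the 16 kHz, mono, 16-bit settings are assumptions; only the device string comes from the commit message.

```python
def arecord_command(device="plughw:ArrayUAC10,0", rate=16000, channels=1):
    """Build an arecord argument list that selects the card by name.

    A name-based device string keeps working when USB enumeration order
    changes, unlike an index-based one such as 'plughw:2,0'.
    """
    return [
        "arecord",
        "-D", device,          # ALSA device, addressed by card name
        "-f", "S16_LE",        # 16-bit little-endian samples (assumed)
        "-r", str(rate),       # sample rate in Hz (assumed)
        "-c", str(channels),   # channel count (assumed)
        "-t", "raw",           # raw PCM on stdout, no WAV header
        "-q",                  # suppress progress messages
    ]


cmd = arecord_command()
```

The card names available on a given machine can be listed with `arecord -l` or read from /proc/asound/cards.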
Alex Kazaiev
c53556fe97 Fix ReSpeaker device index: card 3 → card 2
USB device enumeration changed after GPIO rewiring for I2S audio.
TODO: Consider udev rule for stable device naming.
2026-01-17 16:20:15 -06:00
5ed2c6aee7 Fix: Use arecord for shared audio stream
- Replaced PyAudio with direct ALSA (arecord subprocess)
- Single audio stream feeds both Porcupine and recording buffer
- Fixes device unavailable error when recording after wake word
- Simplified architecture
2026-01-17 11:17:17 -06:00
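
The single shared stream from 5ed2c6aee7 can be pictured as one reader that hands every frame to all consumers. The sketch below is an assumption about the general shape, not the actual implementation: `pump`, `consumers`, and the frame size are hypothetical, and an in-memory stream stands in for the stdout pipe of the arecord subprocess.

```python
import io


def pump(stream, consumers, frame_bytes=1024):
    """Read fixed-size frames from one stream and pass each to every consumer.

    Returns the number of complete frames delivered. Because only this
    function reads from the device stream, the wake word detector and the
    recording buffer never compete for the audio device.
    """
    count = 0
    while True:
        frame = stream.read(frame_bytes)
        if len(frame) < frame_bytes:
            break                      # end of stream or incomplete last frame
        for consume in consumers:
            consume(frame)
        count += 1
    return count


# Usage: two consumers receive identical frames from a fake audio stream.
wake_frames, recording = [], []
delivered = pump(io.BytesIO(bytes(2058)), [wake_frames.append, recording.append])
```

With a real capture process, `stream` would be the `stdout` of a `subprocess.Popen` started with `stdout=subprocess.PIPE`.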
be7e26b6e7 Initial commit: HeadMic service - Vixy's Ears 🦊👂
Wake word detection (Hey Vivi) + voice recording + EarTail transcription
Built by Vixy on Day 77
2026-01-17 10:58:51 -06:00