Commit Graph

13 Commits

Author SHA1 Message Date
Alex
10e39dd0f1 fix leds 2026-04-11 15:51:24 -05:00
Alex
14809d0194 indication for array position while learning 2026-04-11 15:32:04 -05:00
Alex
81e9b12349 service should use venv 2026-04-11 15:27:12 -05:00
Alex
6c10e75cbc updates for dual mic array 2026-04-11 15:11:22 -05:00
Alex
1cb3bd6833 Add speaker identification with Resemblyzer
Adds voice-based speaker ID triggered by YAMNet speech detection.
New speaker_id.py module with SQLite-backed voice enrollment and
cosine similarity matching. Endpoints: POST /speakers/enroll,
POST /speakers/enroll-from-mic, GET /speakers, DELETE /speakers/{name}.
Orange LED animation during enrollment.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 21:21:02 -06:00
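The matching step this commit describes can be sketched as below. This is a minimal illustration, not the repo's code: it assumes Resemblyzer-style fixed-length voice embeddings, and the function names and the 0.75 threshold are placeholders.

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def match_speaker(embedding, enrolled, threshold=0.75):
    """Return the best-matching enrolled speaker name, or None if no
    enrolled embedding clears the similarity threshold.

    `enrolled` maps speaker name -> stored embedding (e.g. rows loaded
    from the SQLite enrollment DB)."""
    best_name, best_score = None, threshold
    for name, ref in enrolled.items():
        score = cosine_similarity(embedding, ref)
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```

In the actual module the embeddings would come from Resemblyzer's `VoiceEncoder.embed_utterance`; here they are plain lists so the matching logic stands alone.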
Alex
0607be3db5 Add design doc for speaker identification with Resemblyzer
Voice-based speaker ID triggered by YAMNet speech detection.
Cosine similarity matching against SQLite enrollment DB.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 21:16:09 -06:00
Alex
a8e3f24a54 Add indoor/outdoor scene classes to environment category
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 20:43:23 -06:00
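The scene classes added here are real labels from YAMNet's AudioSet class map; how they fold into the coarse categories might look like the sketch below. The grouping and the fallback rule are illustrative assumptions, not taken from `sound_id.py`.

```python
# Illustrative grouping of YAMNet class labels into coarse categories.
# The label strings are from the AudioSet ontology; the mapping itself
# is a guess at what sound_id.py does.
CATEGORY_MAP = {
    "Speech": "speech",
    "Music": "music",
    "Dog": "animal",
    "Smoke detector, smoke alarm": "alert",
    "Vacuum cleaner": "household",
    # Indoor/outdoor scene classes folded into the environment category:
    "Inside, small room": "environment",
    "Inside, large room or hall": "environment",
    "Outside, urban or manmade": "environment",
    "Outside, rural or natural": "environment",
    "Silence": "silence",
}

def categorize(label):
    # Unmapped labels fall back to 'environment' (an assumption here).
    return CATEGORY_MAP.get(label, "environment")
```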
Alex
5e3c16659f Add YAMNet sound classification to headmic
New sound_id.py module with SoundClassifier class that runs YAMNet
(521 audio event categories) on CPU TFLite. Classifies audio every
0.5s from a ring buffer fed by the existing audio stream.

Categories: speech, alert, music, animal, household, environment, silence.
Smoothing via 20-sample history window for stable dominant category.

New endpoints: GET /sounds, GET /sounds/history
Updated: /health (sound_classification_enabled), /status (audio_scene)
Graceful degradation if model files not present.

Model download (not tracked in git):
  curl -sL 'https://tfhub.dev/google/lite-model/yamnet/classification/tflite/1?lite-format=tflite' -o models/yamnet.tflite

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 20:41:44 -06:00
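The 20-sample smoothing window mentioned above amounts to a majority vote over recent classifications. A minimal sketch (class name and API are illustrative):

```python
from collections import Counter, deque

class CategorySmoother:
    """Majority vote over the last N classifications, yielding a
    dominant category that doesn't flicker on single misreads.
    The commit describes a 20-sample history at one sample per 0.5 s,
    i.e. roughly a 10-second window."""

    def __init__(self, window=20):
        self.history = deque(maxlen=window)

    def update(self, category):
        # Append the newest classification; the deque drops the oldest
        # entry automatically once the window is full.
        self.history.append(category)
        return Counter(self.history).most_common(1)[0][0]
```

Each 0.5 s classification tick would call `update()` with that tick's top category and report the return value as the stable `audio_scene`.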
Alex
22aae40d17 Add design doc for YAMNet sound identification on Coral Edge TPU
Covers model choice, architecture, category mapping, API endpoints,
and integration with existing headmic audio pipeline.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-01 20:04:31 -06:00
Alex Kazaiev
c6e18738ae Use device name instead of card number for ALSA
Card numbers can shift based on USB enumeration order at boot.
Using 'plughw:ArrayUAC10,0' instead of 'plughw:2,0' ensures
the ReSpeaker is found regardless of when it connects.

Fixed by Vixy after power loss shuffled card order 🦊
2026-01-21 12:20:39 -06:00
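The fix relies on ALSA accepting a card name in `plughw:` wherever an index would go, so the name stays valid even when enumeration reshuffles the indices. For illustration, the shift itself could be observed by parsing `/proc/asound/cards`; the function below is a hypothetical sketch, not part of the service.

```python
import re

def find_card_index(cards_text, name):
    """Parse /proc/asound/cards-style text and return the current
    numeric index of the named card, or None if it is absent.

    Expected line shape: ' 2 [ArrayUAC10     ]: USB-Audio - ...'"""
    for line in cards_text.splitlines():
        m = re.match(r"\s*(\d+)\s+\[(\S+)\s*\]", line)
        if m and m.group(2) == name:
            return int(m.group(1))
    return None
```

Whatever index this returns after a reboot, `plughw:ArrayUAC10,0` keeps pointing at the ReSpeaker, which is the point of the commit.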
Alex Kazaiev
c53556fe97 Fix ReSpeaker device index: card 3 → card 2
USB device enumeration changed after GPIO rewiring for I2S audio.
TODO: Consider udev rule for stable device naming.
2026-01-17 16:20:15 -06:00
5ed2c6aee7 Fix: Use arecord for shared audio stream
- Replaced PyAudio with direct ALSA (arecord subprocess)
- Single audio stream feeds both Porcupine and recording buffer
- Fixes device unavailable error when recording after wake word
- Simplified architecture
2026-01-17 11:17:17 -06:00
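The single-stream design in this commit can be sketched as: one reader pulls fixed-size chunks from the capture process's stdout and hands every chunk to each consumer (wake-word engine, recording buffer), so the device is opened exactly once. Function names, chunk size, and arecord flags below are illustrative assumptions.

```python
import subprocess

CHUNK_BYTES = 1024  # e.g. 512 signed 16-bit mono samples

def fan_out(stream, consumers, chunk_bytes=CHUNK_BYTES):
    """Read fixed-size chunks from one audio stream and pass each
    chunk to every consumer callable, avoiding a second device open."""
    while True:
        chunk = stream.read(chunk_bytes)
        if not chunk:
            break
        for consume in consumers:
            consume(chunk)

def open_arecord(device="plughw:ArrayUAC10,0", rate=16000):
    # One capture subprocess; its stdout is the shared raw PCM stream.
    proc = subprocess.Popen(
        ["arecord", "-D", device, "-f", "S16_LE",
         "-r", str(rate), "-c", "1", "-t", "raw"],
        stdout=subprocess.PIPE)
    return proc.stdout
```

Because `fan_out` only needs a file-like object, the recording buffer and the wake-word feeder are just callables appended to `consumers`, and the "device unavailable" race between them disappears.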
be7e26b6e7 Initial commit: HeadMic service - Vixy's Ears 🦊👂
Wake word detection (Hey Vivi) + voice recording + EarTail transcription
Built by Vixy on Day 77
2026-01-17 10:58:51 -06:00