From fde3b985542ff444c95bb3362583d4104c12251c Mon Sep 17 00:00:00 2001
From: Alex <akazaev@proton.me>
Date: Sun, 12 Apr 2026 22:01:49 -0500
Subject: [PATCH] Document anonymous speaker tracking + promote workflow

Added speaker identification section explaining the three-tier system
(enrolled/anonymous/unidentified), the promote workflow, and enrollment
options. Updated speakers API table with /speakers/promote endpoint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---
 README.md | 34 ++++++++++++++++++++++++++++++++--
 1 file changed, 32 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index a0b22e8..857442e 100644
--- a/README.md
+++ b/README.md
@@ -48,7 +48,7 @@ Binaural hearing service for Vixy's physical head. Dual mic arrays with spatial
 |---------|--------|----------|--------|
 | Wake word detection | Porcupine | CPU | Needs Picovoice key |
 | Sound classification | sound_id.py | Coral Edge TPU | 521 classes, ~2ms |
-| Speaker identification | speaker_id.py | CPU (Resemblyzer) | Enrollment via API |
+| Speaker identification | speaker_id.py | CPU (Resemblyzer) | Enrolled + anonymous tracking |
 | Spatial tracking | spatial.py | USB control | 3-signal fusion: DoA + ILD + ITD |
 | Distance estimation | spatial.py | audio energy | Proximity zones (intimate/conversational/across_room/far) |
 | ITD processing | spatial.py | audio cross-correlation | Sub-ms delay → bearing angle |
@@ -196,9 +196,10 @@ sudo systemctl start headmic
 
 | Endpoint | Method | Description |
 |----------|--------|-------------|
-| `/speakers` | GET | List enrolled speakers |
+| `/speakers` | GET | List all speakers (enrolled + anonymous) |
 | `/speakers/enroll` | POST | Enroll from uploaded audio (multipart: name + WAV) |
 | `/speakers/enroll-from-mic` | POST | Record 5s from mic + enroll (query: name) |
+| `/speakers/promote` | POST | Promote anonymous → enrolled (query: anon_id, name) |
 | `/speakers/{name}` | DELETE | Remove a speaker |
 
 ### Recording
@@ -232,6 +233,35 @@ sudo systemctl start headmic
 }
 ```
 
+## Speaker Identification
+
+Three-tier recognition using Resemblyzer 256-dim GE2E embeddings:
+
+| Tier | Name format | How it works |
+|------|-------------|-------------|
+| Enrolled | `"Alex"` | Matched against stored embeddings (cosine similarity ≥ 0.75) |
+| Anonymous | `"unknown_bfa1"` | Clustered online from unrecognized voices (cosine similarity ≥ 0.70) |
+| Unidentified | `null` | Audio too short or no speech detected |
+
+Anonymous speakers get a stable 4-character hex ID derived from their voice embedding, so the same person keeps the same ID across observations. An ID expires after 1 hour of silence, and at most 10 anonymous speakers are tracked at once.
+
+**Workflow:**
+```
+Unknown person speaks → "unknown_bfa1" (auto-created)
+    ↓
+You ask "who's that?" → check /speakers
+    ↓
+curl -X POST "http://head:8446/speakers/promote?anon_id=unknown_bfa1&name=Bob"
+    ↓
+Now recognized as "Bob" going forward (embedding saved to voices.db)
+```
+
+Alternatively, enroll directly from the mic:
+```bash
+curl -X POST "http://head:8446/speakers/enroll-from-mic?name=Alex"
+# Speak for 5 seconds
+```
+
 ## LED States
 
 | State | Effect | Color |