Add cocktail party spatial filtering (#7)

audio_stream.py: Added focus_side attribute. When set to "left" or
"right", the stream yields from that side regardless of energy
(attention lock). When None, falls back to energy-based auto selection.
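
A minimal sketch of the selection rule described above, written as a standalone function rather than as the stream's generator. The name pick_side and its parameters are hypothetical and not part of this commit; the 1.1 ratio mirrors the hysteresis factor used in the auto-mode branch of the diff.

```python
from typing import Optional

HYSTERESIS = 1.1  # a side must be 10% louder than the other to take over


def pick_side(
    focus_side: Optional[str],
    current_side: str,
    left_energy: float,
    right_energy: float,
    right_available: bool = True,
) -> str:
    """Return "left" or "right" (hypothetical helper, not the commit's code)."""
    if not right_available:
        # Single-mic setup: nothing to choose between.
        return "left"
    if focus_side in ("left", "right"):
        # Attention lock: ignore energy entirely.
        return focus_side
    if right_energy > left_energy * HYSTERESIS:
        return "right"
    if left_energy > right_energy * HYSTERESIS:
        return "left"
    # Energies are within the hysteresis band: keep the current side.
    return current_side
```

With focus_side="left" the function returns "left" even when the right side is far louder; with focus_side=None and near-equal energies it keeps whatever side is currently active.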

multi_speaker.py: When beams lock onto 2 speakers, sets audio focus
to the target speaker's side. Auto-switches target when the current
target goes silent and the other starts talking. Manual focus via API.
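
The auto-switch rule above can be sketched roughly as follows. This is an illustration under assumptions, not the code in multi_speaker.py: next_target and the speaking list (one voice-activity flag per tracked speaker) are hypothetical names, and exactly two speakers indexed 0 and 1 are assumed.

```python
from typing import List


def next_target(target: int, speaking: List[bool]) -> int:
    """Return the speaker index (0 or 1) to attend to next.

    Hypothetical helper: switches only when the current target is
    silent AND the other speaker is talking; otherwise stays put.
    """
    other = 1 - target
    if not speaking[target] and speaking[other]:
        return other
    return target
```

Requiring both conditions means the target does not change when both speakers talk at once or when both are silent.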

headmic.py: New endpoint POST /speakers/focus?speaker=0|1 to manually
switch attention. /speakers/tracked now shows is_target, target_speaker,
and audio_focus fields.
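
A sketch of how a client might build the manual focus request with the standard library. The base URL (host and port) is an assumption, focus_request is a hypothetical helper, and the response format is not specified in this commit, so none is shown.

```python
from urllib.parse import urlencode
from urllib.request import Request


def focus_request(base_url: str, speaker: int) -> Request:
    """Build (but do not send) POST /speakers/focus?speaker=0|1."""
    if speaker not in (0, 1):
        raise ValueError("speaker must be 0 or 1")
    query = urlencode({"speaker": speaker})
    return Request(f"{base_url}/speakers/focus?{query}", method="POST")


# Assumed address; replace with wherever headmic.py is actually served.
req = focus_request("http://localhost:8000", 1)
```

Passing req to urllib.request.urlopen would send the request to a running server.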

The cocktail party effect: when 2 people are talking, the audio feed
to Porcupine/VAD/transcription comes from the target speaker's direction,
suppressing the other. XVF3800 beam gating silences the non-speaking beam,
and audio_stream focus locks the ear facing the target.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Alex
2026-04-12 21:47:30 -05:00
parent 38d21ef53c
commit 0705b3818b
3 changed files with 64 additions and 8 deletions

@@ -117,6 +117,7 @@ class DualAudioStream:
         self.left = MicStream("left", left_device)
         self.right = MicStream("right", right_device) if right_device else None
         self.active_side: str = "left"
+        self.focus_side: Optional[str] = None # None=auto (energy), "left"/"right"=locked attention
         self._running = False

     def start(self):
@@ -162,18 +163,25 @@ class DualAudioStream:
             last_frame_left = frame_left
             last_frame_right = frame_right

-            # Pick best beam
+            # Pick beam: focused attention overrides energy-based selection
             if frame_right is None:
                 self.active_side = "left"
                 yield frame_left, "left"
+            elif self.focus_side:
+                # Cocktail party mode: locked onto a specific side
+                self.active_side = self.focus_side
+                if self.focus_side == "right" and frame_right:
+                    yield frame_right, "right"
+                else:
+                    yield frame_left, "left"
             else:
                 # Auto mode: pick higher-energy side
                 left_energy = self.left.get_energy()
                 right_energy = self.right.get_energy()
-                if right_energy > left_energy * 1.1: # 10% hysteresis
+                if right_energy > left_energy * 1.1:
                     self.active_side = "right"
                 elif left_energy > right_energy * 1.1:
                     self.active_side = "left"
                 # else: keep current active_side (hysteresis prevents flapping)
                 if self.active_side == "right" and frame_right:
                     yield frame_right, "right"