feat: filter short audio segments (mic bumps) and add debug notebook

Mic bumps produce transient spikes that pass VAD onset detection but contain no real speech, and the model hallucinates "thank you" from them. Added a MIN_SPEECH_SECONDS (0.3 s) filter to discard segments where the actual speech portion is too short.

Added a Jupyter notebook (notebooks/audio_debug.ipynb) for real-time audio visualization. It streams RMS and peak amplitude into a live Plotly FigureWidget, then provides post-hoc waveform inspection, segment playback, and side-by-side segment comparison.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
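
For illustration only, the minimum-speech filter described above might look roughly like the sketch below. Only the MIN_SPEECH_SECONDS name and its 0.3 s value come from the commit message; the `Segment` type, its `speech_samples` field, the 16 kHz sample rate, and the `keep_segment` helper are hypothetical stand-ins, and the real code in src/cohere_transcribe may be structured differently.

```python
from dataclasses import dataclass

MIN_SPEECH_SECONDS = 0.3  # threshold named in the commit message


@dataclass
class Segment:
    """Hypothetical segment record; the project's real type may differ."""

    speech_samples: int        # samples the VAD marked as speech (assumed field)
    sample_rate: int = 16_000  # assumed sample rate in Hz


def speech_seconds(segment: Segment) -> float:
    """Duration of the speech portion of a segment, in seconds."""
    return segment.speech_samples / segment.sample_rate


def keep_segment(segment: Segment, min_speech: float = MIN_SPEECH_SECONDS) -> bool:
    """Return True if the segment has enough speech to be worth transcribing."""
    return speech_seconds(segment) >= min_speech


# A 0.1 s transient (mic bump) is dropped; a 1 s utterance is kept.
bump = Segment(speech_samples=1_600)
utterance = Segment(speech_samples=16_000)
kept = [s for s in (bump, utterance) if keep_segment(s)]
```

Filtering on the speech portion rather than the whole segment length matters here, because a segment padded with silence on both sides can be long overall while containing only a brief spike.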
@@ -27,3 +27,11 @@ build-backend = "hatchling.build"
 
 [tool.hatch.build.targets.wheel]
 packages = ["src/cohere_transcribe"]
+
+[dependency-groups]
+dev = [
+    "anywidget>=0.11.0",
+    "ipywidgets>=8.1.8",
+    "jupyterlab>=4.5.7",
+    "plotly>=6.7.0",
+]
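
The dev dependencies above support the debug notebook, which the commit message says streams RMS and peak amplitude. As a rough sketch of how those two per-chunk levels could be computed (not the notebook's actual code), the hypothetical `chunk_levels` helper below uses only the standard library and assumes samples are floats in [-1.0, 1.0]:

```python
import math
from typing import List, Sequence, Tuple


def chunk_levels(samples: Sequence[float], chunk_size: int) -> List[Tuple[float, float]]:
    """Return one (rms, peak) pair per chunk of `chunk_size` samples.

    Hypothetical helper: the final chunk may be shorter than `chunk_size`.
    """
    levels = []
    for start in range(0, len(samples), chunk_size):
        chunk = samples[start:start + chunk_size]
        # RMS reflects the average energy of the chunk.
        rms = math.sqrt(sum(x * x for x in chunk) / len(chunk))
        # Peak is the largest absolute sample value in the chunk.
        peak = max(abs(x) for x in chunk)
        levels.append((rms, peak))
    return levels


# One silent chunk, one full-scale chunk, one chunk with a single spike.
audio = [0.0] * 4 + [1.0, -1.0, 1.0, -1.0] + [0.0, 0.0, 0.0, 0.8]
levels = chunk_levels(audio, chunk_size=4)
```

A short transient such as a mic bump tends to show a high peak relative to its RMS, which is one reason plotting both values side by side can help when inspecting segments.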