cohere-transcribe

Author	SHA1	Message	Date
tomatocream	50f8d158c4	feat: add voice command processing and input backend interface Introduce InputBackend protocol with WtypeBackend and PrintBackend, and a command processor that translates spoken commands (enter, new line, question mark, comma, etc.) into key presses and punctuation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-30 21:37:20 +08:00
tomatocream	f083e424c9	feat: make silence pause duration configurable via --pause flag Default is 0.3s for responsive typing. Configurable on both `cohere on --pause` and `cohere transcribe --stream --pause`. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-30 21:12:26 +08:00
tomatocream	92d8ba28d0	feat: add Typer CLI with daemon mode and wtype keyboard injection Replace argparse CLI with Typer-based CLI supporting `cohere on/off/status` commands. The daemon runs transcription in the background and types into the focused Wayland window via wtype. Adds wtype to flake.nix and fixes the hatchling build backend. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-30 21:09:32 +08:00
tomatocream	8d517b3ea8	refactor: restructure project into src layout with proper packaging Split monolithic transcribe.py into focused modules under src/cohere_transcribe/ (model, vad, stream, cli), move tests into tests/, add hatchling build system and CLI entry point, remove unused shell.nix and main.py. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-30 00:45:56 +08:00
tomatocream	cbea62b2a9	fix: add portaudio to LD_LIBRARY_PATH and add flake lockfile Move LD_LIBRARY_PATH out of env block and include portaudio so audio devices are discoverable at runtime. Add flake.lock and a quick microphone test script. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-30 00:42:36 +08:00
tomatocream	843ec534d1	fix: handle processor.decode returning a list of strings Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-29 03:00:09 +08:00
tomatocream	cf18335235	fix: simplify audio callback, use deque for pre-roll, add worker timeout warning - Remove frame_buf accumulation: blocksize=FRAME_SIZE guarantees indata is exactly FRAME_SIZE samples, so buffering was unnecessary. Use indata[:, 0].copy() to avoid stale references from sounddevice's buffer reuse. - Replace pre_roll list with collections.deque(maxlen=PRE_ROLL_FRAMES) to eliminate manual bounds-checking (pop(0)) on every frame. - Warn to stderr if the transcription worker thread outlives its 30s join timeout. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 02:48:51 +08:00
tomatocream	747a4772b6	feat: implement live streaming transcription with VAD Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 02:46:13 +08:00
tomatocream	d62fcdd1cd	feat: add silence calibration and VAD state machine Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 02:45:09 +08:00
tomatocream	4605be5bc9	refactor: switch to argparse, add --stream and --lang flags Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-29 02:43:47 +08:00
tomatocream	6bff2875c5	Add implementation plan for live streaming transcription Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-29 02:42:00 +08:00
tomatocream	e0911653fe	Add design spec for live streaming microphone transcription Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-05-29 02:38:05 +08:00
tomatocream	c055a8ffb9	Replace flake.nix with shell.nix for simpler NixOS dev environment	2026-05-26 01:59:15 +08:00
tomatocream	55a51a7668	Add flake.nix with portaudio + CUDA, microphone support in transcribe.py	2026-05-26 01:55:54 +08:00
tomatocream	8b88489a53	Simplify to audio file input (mic requires PortAudio on NixOS)	2026-05-26 01:49:52 +08:00
tomatocream	14abcb89f2	Add accelerate dependency	2026-05-26 01:38:10 +08:00
tomatocream	82fe21fe41	Add Cohere Transcribe demo with uv + Python 3.14	2026-05-26 01:35:10 +08:00

17 Commits
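
Note on commit cf18335235: the message says the pre-roll list was replaced with `collections.deque(maxlen=PRE_ROLL_FRAMES)` so that manual `pop(0)` bounds-checking is no longer needed. The sketch below illustrates that behaviour only; it is not the repository's code. The value given to `PRE_ROLL_FRAMES` and the helper name `collect_pre_roll` are assumptions made for the example.

```python
from collections import deque

# Assumed value for illustration; the real constant lives in the repository.
PRE_ROLL_FRAMES = 3


def collect_pre_roll(frames, maxlen=PRE_ROLL_FRAMES):
    """Return the most recent `maxlen` frames, oldest first.

    A bounded deque discards its oldest item automatically on append,
    so no explicit length check or pop(0) is required.
    """
    pre_roll = deque(maxlen=maxlen)
    for frame in frames:
        pre_roll.append(frame)
    return list(pre_roll)


if __name__ == "__main__":
    # Five frames go in; only the last three are retained.
    print(collect_pre_roll(["f0", "f1", "f2", "f3", "f4"]))
```

Appending to a full bounded deque is O(1), whereas `list.pop(0)` shifts every remaining element, which matters when the append runs once per audio frame.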
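
Note on commit 50f8d158c4: the message describes an `InputBackend` protocol and a command processor that turns spoken commands (enter, new line, question mark, comma) into key presses and punctuation. A minimal sketch of that idea follows, under stated assumptions: the method names `type_text` and `press_key`, the `RecordingBackend` class, the `COMMANDS` table, the `process` function, and the key name `"Return"` are all hypothetical and may differ from the actual implementation.

```python
from typing import Protocol


class InputBackend(Protocol):
    """Hypothetical interface; the real method names may differ."""

    def type_text(self, text: str) -> None: ...
    def press_key(self, key: str) -> None: ...


class RecordingBackend:
    """Illustrative backend that records events instead of typing them."""

    def __init__(self) -> None:
        self.events: list[tuple[str, str]] = []

    def type_text(self, text: str) -> None:
        self.events.append(("text", text))

    def press_key(self, key: str) -> None:
        self.events.append(("key", key))


# Assumed mapping from spoken phrase to (event kind, value).
COMMANDS = {
    "enter": ("key", "Return"),
    "new line": ("key", "Return"),
    "question mark": ("text", "?"),
    "comma": ("text", ","),
}
MAX_PHRASE_WORDS = max(len(phrase.split()) for phrase in COMMANDS)


def process(transcript: str, backend: InputBackend) -> None:
    """Send plain words as text and recognised commands as keys or punctuation."""
    words = transcript.split()
    pending: list[str] = []  # plain words waiting to be typed

    def flush() -> None:
        if pending:
            backend.type_text(" ".join(pending))
            pending.clear()

    i = 0
    while i < len(words):
        # Try the longest phrase first so "new line" wins over single words.
        for n in range(MAX_PHRASE_WORDS, 0, -1):
            chunk = words[i:i + n]
            phrase = " ".join(w.lower() for w in chunk)
            if len(chunk) == n and phrase in COMMANDS:
                flush()
                kind, value = COMMANDS[phrase]
                if kind == "key":
                    backend.press_key(value)
                else:
                    backend.type_text(value)
                i += n
                break
        else:
            # No command matched at this position: keep the word as text.
            pending.append(words[i])
            i += 1
    flush()


if __name__ == "__main__":
    backend = RecordingBackend()
    process("hello comma world question mark new line", backend)
    print(backend.events)
```

Keeping the processor dependent only on the protocol means a test can pass a recording backend while the daemon passes one that drives the keyboard. The sketch does not handle punctuation that the transcriber itself attaches to a word.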