ScrAIbe: Research-Grade Transcription & Diarisation

ScrAIbe began as an internal research tool for turning lab interviews, field recordings, and technical briefings into searchable, citable text. Standard speech-to-text services struggled with noisy environments, German/English code-switching, and overlapping speakers, so I built a pipeline tailored to research-grade audio.

ScrAIbe is a modular, multilingual transcription and speaker diarisation pipeline:

Whisper-based ASR for high-accuracy transcription and optional translation of segments.
Speaker diarisation and recognition via Pyannote, using speaker embeddings trained on VoxCeleb for robust speaker separation.
Automatic language identification using VoxLingua to handle mixed-language recordings cleanly.
Multiple entry points: a Python API for full control, a CLI for batch jobs, and an optional lightweight Gradio app for quick local runs.
Server-friendly deployment through Docker when you want consistent lab/on-prem setups.
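
To make the combination of transcription and diarisation concrete, here is a minimal, self-contained sketch of one common way to merge the two outputs: each transcribed segment is assigned to the speaker turn it overlaps most in time. This is an illustration only, not ScrAIbe's actual implementation; the names Segment, Turn, assign_speakers, and format_transcript are hypothetical and are not part of the ScrAIbe API, and the timestamps are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A transcribed chunk of audio (hypothetical ASR output)."""
    start: float
    end: float
    text: str


@dataclass
class Turn:
    """A diarised speaker turn (hypothetical diarisation output)."""
    start: float
    end: float
    speaker: str


def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(segments, turns, unknown="UNKNOWN"):
    """Label each segment with the speaker whose turn overlaps it most.

    Segments that overlap no turn at all keep the `unknown` label.
    """
    labelled = []
    for seg in segments:
        best_speaker, best_overlap = unknown, 0.0
        for turn in turns:
            shared = overlap(seg.start, seg.end, turn.start, turn.end)
            if shared > best_overlap:
                best_speaker, best_overlap = turn.speaker, shared
        labelled.append((best_speaker, seg))
    return labelled


def format_transcript(labelled):
    """Render labelled segments as one timestamped line per segment."""
    return "\n".join(
        f"[{seg.start:.1f}-{seg.end:.1f}] {speaker}: {seg.text}"
        for speaker, seg in labelled
    )


if __name__ == "__main__":
    # Invented example data: a short German/English exchange.
    segments = [
        Segment(0.0, 2.0, "Guten Morgen zusammen."),
        Segment(2.0, 5.0, "Good morning, shall we start?"),
    ]
    turns = [
        Turn(0.0, 2.5, "SPEAKER_00"),
        Turn(2.5, 6.0, "SPEAKER_01"),
    ]
    print(format_transcript(assign_speakers(segments, turns)))
```

Maximum-overlap assignment is simple and works for the clean turn-taking case, but it forces a single speaker per segment, so a real pipeline would need an additional strategy for overlapping speech.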

ScrAIbe is open source because research infrastructure shouldn’t be a black box. If you want a fully no-code experience for teams, the companion project ScrAIbe-WebUI wraps this backend in a web service that is easy to deploy with Docker.