Key capabilities
- Streaming transcription: Transcribes live audio as it arrives, without waiting for the utterance to complete.
- Low latency: Designed for real-time scenarios where delays aren’t acceptable, such as live captions or quality monitoring.
- Parallel operation: Runs alongside other realtime models (such as GPT Realtime Translate) to provide source-language transcription in parallel with translation.
When to use GPT Realtime Whisper
Use GPT Realtime Whisper when you need:
- Live captions and subtitles for ongoing audio streams.
- Transcription for monitoring, moderation, or analytics workflows.
- A transcript of the original-language speech alongside live translation.
- A text view of spoken input while other models process the same audio.
Example use cases
- Live event captioning: Provide real-time captions in the speaker’s original language during conferences, webinars, or broadcasts.
- Compliance and quality review: Capture the original conversation as text for regulatory compliance, quality assurance, or analytics.
- Multilingual pipelines: Pair with GPT Realtime Translate to deliver both translated output and a source-language transcript in a single workflow.
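
The parallel pattern described above, where the same live audio feeds both transcription and translation, can be sketched as a simple fan-out. This is a minimal illustration, not the actual API: `fake_transcribe`, `fake_translate`, `audio_source`, and `run_pipeline` are hypothetical stand-ins, and a real integration would replace the two handlers with calls to the realtime transcription and translation sessions.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable, List, Optional, Tuple

Handler = Callable[[bytes], Awaitable[str]]


# Hypothetical stand-ins: real code would forward each chunk to a realtime
# transcription or translation session instead of formatting a string.
async def fake_transcribe(chunk: bytes) -> str:
    await asyncio.sleep(0)  # yield control, as a network call would
    return f"transcript[{len(chunk)} bytes]"


async def fake_translate(chunk: bytes) -> str:
    await asyncio.sleep(0)
    return f"translation[{len(chunk)} bytes]"


async def audio_source(chunks: List[bytes]) -> AsyncIterator[bytes]:
    """Yield audio chunks one at a time, as a live stream would."""
    for chunk in chunks:
        await asyncio.sleep(0)
        yield chunk


async def consume(queue: "asyncio.Queue[Optional[bytes]]",
                  handler: Handler, out: List[str]) -> None:
    """Process chunks from a queue until the None sentinel arrives."""
    while True:
        chunk = await queue.get()
        if chunk is None:
            break
        out.append(await handler(chunk))


async def run_pipeline(chunks: List[bytes]) -> Tuple[List[str], List[str]]:
    transcript_q: "asyncio.Queue[Optional[bytes]]" = asyncio.Queue()
    translate_q: "asyncio.Queue[Optional[bytes]]" = asyncio.Queue()
    transcripts: List[str] = []
    translations: List[str] = []

    # Start both consumers so they work on chunks concurrently.
    consumers = [
        asyncio.create_task(consume(transcript_q, fake_transcribe, transcripts)),
        asyncio.create_task(consume(translate_q, fake_translate, translations)),
    ]

    # Fan each chunk out to both consumers as soon as it arrives,
    # without waiting for the stream to finish.
    async for chunk in audio_source(chunks):
        transcript_q.put_nowait(chunk)
        translate_q.put_nowait(chunk)

    # Signal end of stream to both consumers, then wait for them to drain.
    transcript_q.put_nowait(None)
    translate_q.put_nowait(None)
    await asyncio.gather(*consumers)
    return transcripts, translations
```

Calling `asyncio.run(run_pipeline(chunks))` with a list of byte chunks returns the two result lists in arrival order. Separate queues keep a slow translation step from delaying the source-language transcript, which is the point of running the two side by side.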