Speech-to-Text (STT)
Speech-to-Text (STT) converts spoken audio into text. It powers voice memos and subtitles, and it connects microphone input to LLM workflows. Today’s systems handle background noise and multiple speakers better than older rule-based phonetic approaches.
Explore tools such as Descript or ElevenLabs, where some offerings include both STT and text-to-speech (TTS). See multimodal for combined text-audio workflows.
Key characteristics
- Converts speech into text for transcription, searchability, and downstream analysis.
- Is valuable for meetings, customer support, interviews, and content production that involve large amounts of recorded audio.
- Accuracy depends on audio quality, language variant, domain terminology, and multi-speaker handling.
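
One common way to put a number on transcription accuracy is word error rate (WER): the word-level edit distance between a reference transcript and the STT output, divided by the number of reference words. The entry above does not name a metric, so treat this as an illustrative sketch. The function name `word_error_rate` and the simple lowercase-and-split normalization are assumptions, not part of any particular STT product.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Assumption: text is normalized only by lowercasing and whitespace
    splitting; real evaluations often also strip punctuation.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the quick brown fox", "the quick brown box"))
```

Lower is better, and the value can exceed 1.0 when the output contains many inserted words. Comparing WER across recordings that differ in noise level, accent, or vocabulary is one way to see how the factors listed above affect a given system.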