Speech-to-Text (STT)

Speech-to-Text (STT) converts spoken audio into written text. It powers voice-memo transcription, enables subtitles, and connects microphone input to LLM workflows. Today’s systems handle background noise and multiple speakers better than older rule-based phonetic approaches.
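
To make the subtitle use case concrete, here is a minimal sketch that turns timestamped transcript segments into SubRip (SRT) subtitle text. The `Segment` shape and the helper names `srt_timestamp` and `to_srt` are assumptions for illustration; real STT tools return their own structures, which would need mapping into something like this first.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed span. Hypothetical shape, not any specific tool's output."""
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as the SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    return "\n".join(cues)


if __name__ == "__main__":
    # Made-up segments standing in for an STT result.
    demo = [
        Segment(0.0, 2.5, "Hello there."),
        Segment(2.5, 4.0, "This is a test."),
    ]
    print(to_srt(demo))
```

Keeping the formatter separate from whichever STT service produces the segments means the same function can be reused if the transcription backend changes.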

Explore tools like Descript or ElevenLabs, where some offerings include both STT and text-to-speech (TTS). See multimodal for combined text-audio workflows.