Text to Speech Previewer
Preview text with browser speech synthesis at different speeds, pitches and voices with play, pause and stop controls
Listen to text read aloud with adjustable voice, speed, and pitch.
How the Text to Speech Previewer Works
Type or paste up to a few thousand characters, pick a voice from the dropdown, adjust the speed and pitch sliders, and click Play. The browser's built-in Web Speech API does the actual synthesis, so there is no API key and no usage cap, and with locally installed voices there is no server round-trip. Speed runs from 0.5x to 2x in 0.1 increments. Pitch uses the same 0.5 to 2 range, where lower values sound deeper and higher values sound higher.
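As a rough illustration of how slider values could be kept inside the range described above, here is a minimal sketch. The names `UtteranceLike`, `normalizeSetting` and `applySettings` are hypothetical and not taken from the tool's source; the 0.1 step for pitch and the fallback to 1 for invalid input are assumptions.

```typescript
// Minimal shape of the two properties this sketch touches. A real
// SpeechSynthesisUtterance has both, so it can be passed in directly.
interface UtteranceLike {
  rate: number;
  pitch: number;
}

const MIN_SETTING = 0.5;
const MAX_SETTING = 2;

// Clamp a slider value to [0.5, 2] and snap it to the nearest 0.1.
// Non-finite input falls back to 1 (an assumption made for this sketch).
function normalizeSetting(value: number): number {
  if (!isFinite(value)) {
    return 1;
  }
  const clamped = Math.min(MAX_SETTING, Math.max(MIN_SETTING, value));
  // Work in tenths so the result is a clean one-decimal number.
  return Math.round(clamped * 10) / 10;
}

// Copy normalized speed and pitch onto an utterance-like object.
function applySettings<T extends UtteranceLike>(utterance: T, speed: number, pitch: number): T {
  utterance.rate = normalizeSetting(speed);
  utterance.pitch = normalizeSetting(pitch);
  return utterance;
}
```

In a browser, the result could be handed to the speech queue with something like `speechSynthesis.speak(applySettings(new SpeechSynthesisUtterance(text), 1.2, 1))`.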
The voice list comes from your browser and operating system, not the tool. Chrome on Windows offers Microsoft David, Mark, Zira and a handful of regional voices, and desktop Chrome adds its own "Google" voices. Safari on macOS exposes the Apple voice catalogue, including Daniel (UK), Karen (Australia) and Moira (Ireland). Firefox tends to expose fewer options. The Default badge shows which voice the system uses if you do not choose one explicitly.
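The fallback to the default voice can be sketched as follows. This is an illustration under assumptions, not the tool's actual code: `VoiceLike` and `pickVoice` are hypothetical names, although `name`, `lang` and `default` are real properties of the browser's `SpeechSynthesisVoice` objects.

```typescript
// The subset of SpeechSynthesisVoice fields this sketch relies on.
interface VoiceLike {
  name: string;
  lang: string;
  default: boolean;
}

// Return the voice the user chose by name; otherwise the voice flagged
// as default; otherwise the first voice in the list (or undefined if
// the list is empty).
function pickVoice<V extends VoiceLike>(voices: V[], chosenName?: string): V | undefined {
  if (chosenName !== undefined) {
    for (const voice of voices) {
      if (voice.name === chosenName) {
        return voice;
      }
    }
  }
  for (const voice of voices) {
    if (voice.default) {
      return voice;
    }
  }
  return voices[0];
}
```

In a browser, `voices` would come from `speechSynthesis.getVoices()`, and the result would be assigned to the utterance's `voice` property before speaking.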
Voice Quality Differs Between Engines
An accessibility consultant previewing a 200-word announcement will get noticeably different output between Chrome's Microsoft Hazel on Windows and Safari's Daniel, even though both are British English (en-GB) voices. Newer neural voices, where the browser exposes them, tend to sound natural, while older system voices on Linux or earlier Windows builds can sound robotic. There is no way to standardise this from JavaScript: the tool can only choose among the voices the browser reports and set their speed and pitch.
Practical scenarios: a podcaster previewing show notes before recording uses 1.0x speed and default pitch to hear how the words land. A teacher checking the pronunciation of unfamiliar words uses 0.8x speed to hear individual syllables more clearly. A copywriter testing whether an Instagram caption sounds natural plays it at 1.2x to match how quickly people scan social posts. Pause and Resume let you stop mid-sentence and pick up where you left off; Stop cancels playback and resets, which is useful for switching voice or re-running with different settings.
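One way to picture the Play, Pause, Resume and Stop controls is as a small state machine. The sketch below is an assumption about how such controls could be modelled, not the tool's implementation; `PlaybackState`, `PlaybackAction` and `nextState` are hypothetical names. The comments note the real Web Speech API call each action would map to.

```typescript
type PlaybackState = "idle" | "speaking" | "paused";
type PlaybackAction = "play" | "pause" | "resume" | "stop" | "end";

// Compute the next UI state for a control action. Actions that make no
// sense in the current state leave it unchanged.
function nextState(state: PlaybackState, action: PlaybackAction): PlaybackState {
  // Stop maps to speechSynthesis.cancel(); "end" is the utterance's
  // end event. Both return the player to idle from any state.
  if (action === "stop" || action === "end") {
    return "idle";
  }
  // Play maps to speechSynthesis.speak(utterance).
  if (action === "play" && state === "idle") {
    return "speaking";
  }
  // Pause maps to speechSynthesis.pause().
  if (action === "pause" && state === "speaking") {
    return "paused";
  }
  // Resume maps to speechSynthesis.resume().
  if (action === "resume" && state === "paused") {
    return "speaking";
  }
  return state;
}
```

Keeping the transitions in a pure function like this makes it easy to enable or disable buttons from the current state, separately from the calls into the speech engine.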
Frequently Asked Questions
Why is the voice list empty when I first open the tool?
Voice loading is asynchronous in most browsers. Chrome, in particular, fires the voiceschanged event a fraction of a second after page load. The 'Loading voices...' placeholder waits for that event and replaces itself with the dropdown once the list is ready. If voices never appear, your browser may have speech synthesis disabled, or you are on a platform without TTS engines installed.
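The wait for the voice list might look roughly like the sketch below. `SynthLike`, `VoiceInfo` and `onVoicesReady` are hypothetical names introduced for illustration; `getVoices()` and the `voiceschanged` event are part of the real `speechSynthesis` object, which could be passed in as `synth` in browsers that support event listeners on it.

```typescript
interface VoiceInfo {
  name: string;
  lang: string;
}

// The parts of the browser's speechSynthesis object used here.
interface SynthLike {
  getVoices(): VoiceInfo[];
  addEventListener(type: "voiceschanged", listener: () => void): void;
  removeEventListener(type: "voiceschanged", listener: () => void): void;
}

// Call `callback` once with a non-empty voice list: immediately if the
// list is already populated, otherwise after a voiceschanged event.
function onVoicesReady(synth: SynthLike, callback: (voices: VoiceInfo[]) => void): void {
  const ready = synth.getVoices();
  if (ready.length > 0) {
    callback(ready);
    return;
  }
  const handler = (): void => {
    const voices = synth.getVoices();
    if (voices.length === 0) {
      return; // Still empty: keep waiting for the next event.
    }
    synth.removeEventListener("voiceschanged", handler);
    callback(voices);
  };
  synth.addEventListener("voiceschanged", handler);
}
```

A page could show its loading placeholder until the callback runs, and add a timeout of its own choosing to display a "no voices available" message if it never does.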
Can I download the audio as an MP3?
Not from the Web Speech API directly. It plays through your speakers but does not expose a recordable audio stream. To capture the output, use your operating system's audio recorder pointed at the system audio device, or screen-record with audio. For pure file output, dedicated TTS APIs (ElevenLabs, OpenAI TTS, Amazon Polly) are the right tool.
How long can the input text be?
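The tool sets no cap of its own, but the Web Speech API specification allows browsers to limit a single utterance to 32,767 characters, and long single utterances can stop or stutter in some browsers. For long documents, split the text into paragraphs and play sections individually. Speaking rate at 1.0x varies by voice but is typically around 150 to 200 words per minute, so a 1,000-word piece takes roughly five to seven minutes to read aloud.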
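Splitting a long document into sections can be sketched as below. This is one possible approach under assumptions, not the tool's behaviour: `splitIntoChunks` is a hypothetical helper, and the 2,000-character default is an arbitrary illustrative limit.

```typescript
// Split text into paragraph-sized chunks, none longer than maxChars.
// Paragraphs are separated by blank lines; an over-long paragraph is
// cut at the last space before the limit, or hard-cut if it has none.
function splitIntoChunks(text: string, maxChars: number = 2000): string[] {
  if (maxChars < 1) {
    throw new RangeError("maxChars must be at least 1");
  }
  const chunks: string[] = [];
  const paragraphs = text.split(/\n\s*\n/);
  for (const paragraph of paragraphs) {
    let rest = paragraph.trim();
    while (rest.length > maxChars) {
      let cut = rest.lastIndexOf(" ", maxChars);
      if (cut <= 0) {
        cut = maxChars; // No space available: cut mid-word.
      }
      chunks.push(rest.slice(0, cut).trim());
      rest = rest.slice(cut).trim();
    }
    if (rest.length > 0) {
      chunks.push(rest);
    }
  }
  return chunks;
}
```

In a browser, each chunk could become its own `SpeechSynthesisUtterance`; calling `speechSynthesis.speak()` for each one in turn adds them to the queue so they play in order.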
Does it work offline?
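Usually, once the page has loaded. Voices installed on your device are synthesised locally by whatever TTS engines your operating system provides, so they keep working without a connection. Some browser-supplied voices, such as Chrome's "Google" voices, are synthesised remotely and need a network connection. This is also why voices differ between Windows, macOS, Linux and mobile: each platform ships its own engines, and the API exposes whatever is available on that device.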