SSML Generator & Editor

Build SSML markup visually. Add pauses, emphasis, prosody controls, and phonetic pronunciation. Copy or download ready-to-use SSML.

Generated SSML

<speak>
  Hello, welcome to our service.
</speak>

SSML Quick Reference

  - ⏸️ break: Add pauses
  - 💪 emphasis: Stress words
  - 🎵 prosody: Rate/pitch/volume
  - 🔢 say-as: Numbers/dates/etc.
  - 🔄 sub: Pronunciation alias
  - 🗣️ phoneme: IPA pronunciation

How to Use the SSML Generator

  1. Click the element buttons to add SSML nodes (text, pause, emphasis, etc.)
  2. Edit the content and attributes for each node
  3. Reorder nodes using the arrow buttons
  4. Copy the generated SSML or download as a .ssml file
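
The steps above amount to keeping an ordered list of nodes and rendering it into a single <speak> document. The following is a minimal sketch of that idea in Python, not the editor's actual implementation: the `(kind, attributes, text)` tuple layout and the `render_ssml` helper are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical node list: each node is (kind, attributes, text).
# Reordering this list changes the order of the rendered SSML.
nodes = [
    ("text", {}, "Hello, "),
    ("break", {"time": "500ms"}, ""),
    ("emphasis", {"level": "strong"}, "welcome"),
    ("text", {}, " to our service."),
]

def render_ssml(nodes):
    """Render an ordered node list into one <speak> document."""
    speak = ET.Element("speak")
    last = None  # most recently added child element, if any
    for kind, attrs, text in nodes:
        if kind == "text":
            # Plain text goes before the first child or after the last one.
            if last is None:
                speak.text = (speak.text or "") + text
            else:
                last.tail = (last.tail or "") + text
        else:
            last = ET.SubElement(speak, kind, attrs)
            last.text = text or None
    return ET.tostring(speak, encoding="unicode")

print(render_ssml(nodes))
```

Using an XML library rather than string concatenation means characters such as & and < in the text are escaped automatically.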

SSML Elements Explained

The <break> element inserts a pause of a specified duration. <emphasis> changes how strongly a word or phrase is stressed: reduced, moderate, or strong. <prosody> controls speaking rate, pitch, and volume. <say-as> tells the engine how to interpret content such as numbers, dates, or phone numbers. <sub> substitutes a spoken alias for the written text, and <phoneme> specifies an exact pronunciation in IPA.
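
As a rough illustration, the snippet below embeds a small SSML document that uses four of these elements and parses it with Python's standard library to confirm it is well-formed. The sentence content is invented for the example, and the `interpret-as="digits"` value is an assumption: the accepted say-as values differ between engines.

```python
import xml.etree.ElementTree as ET

# Example SSML; the wording is illustrative only.
ssml = """<speak>
  The <sub alias="World Wide Web Consortium">W3C</sub> publishes the SSML specification.
  <break time="500ms"/>
  Your code is <say-as interpret-as="digits">4821</say-as>.
  <emphasis level="strong">Do not share it.</emphasis>
</speak>"""

# Parsing raises ParseError if a tag is unclosed or an attribute is malformed.
root = ET.fromstring(ssml)
tags = [child.tag for child in root]
print(tags)  # ['sub', 'break', 'say-as', 'emphasis']
```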

Frequently Asked Questions

What is SSML?

SSML (Speech Synthesis Markup Language) is an XML-based markup language that gives you fine-grained control over how text-to-speech engines pronounce your text. It lets you add pauses, change speaking rate, adjust pitch, spell out numbers, and more.

What SSML elements does this editor support?

The editor supports plain text plus the most commonly used SSML elements: break (pauses), emphasis, prosody (rate/pitch/volume), say-as (number/date interpretation), sub (pronunciation substitution), and phoneme (IPA pronunciation).

Which TTS engines support SSML?

Most major TTS engines support SSML, including Amazon Polly, Google Cloud TTS, Microsoft Azure TTS, IBM Watson TTS, and Verbatik. The supported elements and attribute values vary by engine, so check your engine's documentation before relying on a particular tag.

Can I download the generated SSML?

Yes. You can copy the SSML to your clipboard or download it as a .ssml file, ready to use with any compatible TTS engine.
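
A .ssml file is plain XML text, so it can be written and read back with ordinary file handling. The sketch below assumes a hypothetical file name, `welcome.ssml`, and uses a temporary directory so that nothing is left on disk.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

ssml = """<speak>
  Hello, welcome to our service.
</speak>"""

# Hypothetical file name; the temporary directory is removed afterwards.
with tempfile.TemporaryDirectory() as folder:
    path = Path(folder) / "welcome.ssml"
    path.write_text(ssml, encoding="utf-8")

    # Reading the file back and parsing it confirms it is well-formed XML.
    root = ET.fromstring(path.read_text(encoding="utf-8"))
    spoken_text = root.text.strip()

print(spoken_text)  # Hello, welcome to our service.
```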

What does the prosody element do?

The prosody element lets you control three aspects of speech: rate (how fast), pitch (how high or low), and volume (how loud). In SSML, each can be set to a predefined level, such as slow or fast for rate, low or high for pitch, and soft or loud for volume.
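
The sketch below builds a prosody element from keyword arguments and rejects values outside the labelled levels defined in the SSML 1.1 specification. The `prosody` helper and the `PROSODY_LEVELS` table are hypothetical names introduced here; individual engines may accept a different set, including numeric or percentage values.

```python
import xml.etree.ElementTree as ET

# Labelled levels from the SSML 1.1 specification (the "default" label is omitted).
PROSODY_LEVELS = {
    "rate": {"x-slow", "slow", "medium", "fast", "x-fast"},
    "pitch": {"x-low", "low", "medium", "high", "x-high"},
    "volume": {"silent", "x-soft", "soft", "medium", "loud", "x-loud"},
}

def prosody(text, **settings):
    """Wrap text in a <prosody> element after checking each setting."""
    for name, value in settings.items():
        if value not in PROSODY_LEVELS.get(name, set()):
            raise ValueError(f"unsupported {name}={value!r}")
    element = ET.Element("prosody", settings)
    element.text = text
    return ET.tostring(element, encoding="unicode")

print(prosody("Please listen carefully.", rate="slow", pitch="low", volume="loud"))
```

Checking values before rendering catches typos such as rate="slowly" early, instead of leaving them for the TTS engine to reject or ignore.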

What is the phoneme element for?

The phoneme element lets you specify exact pronunciation using IPA (International Phonetic Alphabet) notation. This is useful for proper nouns, technical terms, or words that TTS engines commonly mispronounce.
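
For example, a phoneme element carries the IPA transcription in its `ph` attribute and the written word as its content. The `phoneme` helper below is a hypothetical convenience function, and the transcription shown is a common American English rendering of "tomato" chosen only for illustration.

```python
import xml.etree.ElementTree as ET

def phoneme(word, ipa):
    """Wrap a word in a <phoneme> element with an IPA transcription."""
    element = ET.Element("phoneme", {"alphabet": "ipa", "ph": ipa})
    element.text = word
    return ET.tostring(element, encoding="unicode")

snippet = phoneme("tomato", "təˈmeɪtoʊ")
print(snippet)
```

The written word stays in the markup, so an engine that does not support phoneme can still fall back to its default pronunciation of the text.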
