How to Use Seed Audio 1.0

Enter text and generate natural AI speech in seconds. This guide walks through everything from creating an account to writing prompts, cloning a voice, and exporting your audio with Seed Audio 1.0.

Generate audio in 5 steps

  1. Create an account and get free credits

    Sign up with email or a social login to receive free credits, which are enough to try your first Seed Audio 1.0 generations. You can check your remaining credit balance before every generation.

  2. Enter the text you want spoken

    Paste the script or sentence you want read aloud into the editor. Seed Audio speaks English, Chinese, Japanese, Spanish, Indonesian and Portuguese. To shape the delivery, add a short description of the tone you want, such as calm or upbeat.

  3. Pick a preset voice or clone your own

    Choose from 20 preset voices to match your use case. To use your own voice or a brand voice, upload reference audio to clone it. You can provide up to 3 reference clips (each 30 seconds or less) and refer to them in the prompt as @Audio1, @Audio2 and @Audio3.

  4. Adjust speed, pitch, volume and format

    Tune speaking speed (0.5–2.0x), pitch (±12 semitones) and volume (0.5–2.0x). Export as MP3, WAV, PCM or OGG Opus at sample rates up to 48 kHz. Choose the settings that fit your project so you can drop the audio straight into a video or podcast.

  5. Generate, preview and export

    Press generate and Seed Audio turns your text into audio. Preview it in place, then download it in the format you selected, such as MP3 or WAV. To fine-tune a line, change the text or settings and regenerate.
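
If you keep your preferred settings in a script or config file, the limits from step 4 can be checked before you type them into the generator. The sketch below is only an illustration: the class, field names and format labels are hypothetical and do not come from a Seed Audio API; only the numeric ranges and format list are taken from this guide.

```python
from dataclasses import dataclass

# Limits quoted in step 4 of this guide. Every name below is hypothetical.
SPEED_RANGE = (0.5, 2.0)    # speed multiplier
PITCH_RANGE = (-12, 12)     # semitones
VOLUME_RANGE = (0.5, 2.0)   # volume multiplier
FORMATS = {"mp3", "wav", "pcm", "ogg_opus"}  # illustrative labels
MAX_SAMPLE_RATE_HZ = 48_000


@dataclass
class AudioSettings:
    speed: float = 1.0
    pitch: float = 0.0
    volume: float = 1.0
    fmt: str = "mp3"
    sample_rate_hz: int = 48_000

    def problems(self) -> list:
        """Return human-readable problems; an empty list means all values fit."""
        found = []
        if not SPEED_RANGE[0] <= self.speed <= SPEED_RANGE[1]:
            found.append(f"speed {self.speed} is outside 0.5-2.0x")
        if not PITCH_RANGE[0] <= self.pitch <= PITCH_RANGE[1]:
            found.append(f"pitch {self.pitch} is outside +/-12 semitones")
        if not VOLUME_RANGE[0] <= self.volume <= VOLUME_RANGE[1]:
            found.append(f"volume {self.volume} is outside 0.5-2.0x")
        if self.fmt not in FORMATS:
            found.append(f"format {self.fmt!r} is not one of {sorted(FORMATS)}")
        if not 0 < self.sample_rate_hz <= MAX_SAMPLE_RATE_HZ:
            found.append(f"sample rate {self.sample_rate_hz} Hz exceeds 48 kHz")
        return found


if __name__ == "__main__":
    for issue in AudioSettings(speed=2.5, fmt="flac").problems():
        print(issue)
```

Returning a list of problems, instead of raising on the first one, lets you fix every out-of-range value in a single pass.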

Tips and how to write prompts

Write prompts in plain language

Seed Audio 1.0 prompts are written in plain sentences, with no special syntax to memorize. Alongside the line you want spoken, describe the scene or emotion (for example, "in a calm tone" or "sounding cheerful") to change the delivery.

Describe the scene and mood

Describing a scene like “a late-night convenience store, suspenseful” nudges the tone and pacing toward that mood. Adding the use case — narration, ad, drama — in a few words also helps.
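
If you prepare many lines in advance, a small helper can keep the spoken text and the delivery hints together. This is a sketch of one possible layout, not a required format: the function name, the parameter names and the "Delivery notes" wording are all assumptions, since the guide only says that prompts are plain language.

```python
def build_prompt(line: str, use_case: str = "", scene: str = "", tone: str = "") -> str:
    """Combine a spoken line with optional plain-language delivery hints.

    The layout (line first, hints on a second line) is an assumption made
    for illustration; any plain-sentence phrasing should work equally well.
    """
    hints = [h.strip() for h in (use_case, scene, tone) if h.strip()]
    spoken = line.strip()
    if not hints:
        return spoken
    return f"{spoken}\nDelivery notes: {'; '.join(hints)}"


if __name__ == "__main__":
    print(build_prompt(
        "The door chime rang, but nobody came in.",
        use_case="drama",
        scene="a late-night convenience store, suspenseful",
        tone="in a calm tone",
    ))
```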

Use punctuation and line breaks

Commas, periods and line breaks create natural pauses and make speech easier to follow. Splitting long passages keeps the pacing steady.
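
Splitting a long passage can be done by hand, but a short script helps with longer scripts. The sketch below breaks text at sentence-ending punctuation and line breaks, then groups sentences into chunks. The 200-character default is an arbitrary choice for illustration, not a documented Seed Audio limit, and the pattern only handles `.`, `!` and `?` followed by whitespace.

```python
import re


def split_script(text: str, max_chars: int = 200) -> list:
    """Group sentences into chunks of roughly max_chars characters or fewer.

    A single sentence longer than max_chars is kept whole rather than cut
    mid-sentence, so a chunk can exceed the limit in that case.
    """
    # Break after ., ! or ? followed by whitespace, or at any line break.
    parts = re.split(r"(?<=[.!?])\s+|\n+", text)
    sentences = [p.strip() for p in parts if p.strip()]

    chunks = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)   # close the chunk before it grows too long
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_script("One. Two. Three.", max_chars=9):
        print(chunk)
```

Each chunk can then be pasted and generated separately, which keeps pacing steady and makes it cheaper to regenerate a single part.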

Match the voice to the use case

Pick a preset voice and speed that fit your use case, whether narration, character work or news reading. The same script sounds very different across voices, so preview and compare.

Clone with clean reference audio

Voice cloning reproduces a speaker best from clean, low-noise reference audio. You can upload up to 3 clips of up to 30 seconds each and refer to them in the prompt as @Audio1, @Audio2 and @Audio3. Note that a reference image and reference audio cannot be used together.
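
Before uploading, you can check a batch of clips against the limits above. The sketch below assumes you already know each clip's duration in seconds and that tags are assigned in upload order; both the function name and that ordering are assumptions made for illustration.

```python
MAX_CLIPS = 3             # maximum number of reference clips, per this guide
MAX_CLIP_SECONDS = 30.0   # maximum length of each clip, per this guide


def tag_reference_clips(durations_s: list) -> dict:
    """Map @Audio1..@Audio3 tags to clip durations, or raise ValueError.

    Assumes tags follow upload order, which this guide does not state.
    """
    if not durations_s:
        raise ValueError("provide at least one reference clip")
    if len(durations_s) > MAX_CLIPS:
        raise ValueError(f"{len(durations_s)} clips given, maximum is {MAX_CLIPS}")
    for index, seconds in enumerate(durations_s, start=1):
        if not 0 < seconds <= MAX_CLIP_SECONDS:
            raise ValueError(f"clip {index} is {seconds}s, must be 30s or less")
    return {f"@Audio{i}": s for i, s in enumerate(durations_s, start=1)}


if __name__ == "__main__":
    print(tag_reference_clips([12.5, 30.0]))
```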

Fine-tune speed, pitch and volume

Speed ranges from 0.5x to 2.0x, pitch from -12 to +12 semitones, and volume from 0.5x to 2.0x. If the delivery is too fast, lower the speed; to make a voice sound lower and calmer, drop the pitch.
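
To get a feel for these numbers: in equal temperament a shift of n semitones multiplies frequency by 2^(n/12), so the ±12 range spans one octave down to one octave up. The sketch below applies that standard relation and estimates clip length at a given speed. It assumes speed acts as a simple rate multiplier; how Seed Audio applies either setting internally is not stated in this guide.

```python
def pitch_ratio(semitones: float) -> float:
    """Frequency multiplier for a shift in semitones (equal temperament)."""
    return 2.0 ** (semitones / 12.0)


def estimated_duration(base_seconds: float, speed: float) -> float:
    """Estimated clip length at a given speed, assuming a plain rate multiplier."""
    if not 0.5 <= speed <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    return base_seconds / speed


if __name__ == "__main__":
    print(pitch_ratio(-12))               # one octave down: half the frequency
    print(round(pitch_ratio(-3), 3))      # a modest drop for a calmer voice
    print(estimated_duration(60.0, 1.5))  # a 60 s read at 1.5x speed
```

Under these assumptions, a one-minute read at 0.5x would take about two minutes, which is worth keeping in mind when fitting audio to a fixed-length video.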

Works across languages

Seed Audio supports English, Chinese, Japanese, Spanish, Indonesian and Portuguese, and can read text that mixes languages.

If it does not read as expected

If the delivery is off, try splitting the text into shorter parts, adding punctuation, or choosing a different voice. Because the model is new, a generation can occasionally fail; generating again often resolves it.
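
This guide describes a web interface, where retrying means pressing generate again. If you ever drive generation from your own script, the same advice can be sketched as a retry loop. Here `generate` is a hypothetical callable that you supply; it stands in for whatever client you use and is not a Seed Audio function.

```python
import time
from typing import Callable


def generate_with_retry(generate: Callable[[], bytes],
                        attempts: int = 3,
                        wait_s: float = 1.0) -> bytes:
    """Call generate() until it succeeds or the attempts run out."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return generate()
        except Exception as err:      # an occasional failure: try again
            last_error = err
            if attempt < attempts:
                time.sleep(wait_s)    # short pause before the next try
    raise RuntimeError(f"generation failed after {attempts} attempts") from last_error


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky() -> bytes:
        # Stand-in that fails once, then returns fake audio bytes.
        calls["n"] += 1
        if calls["n"] < 2:
            raise IOError("temporary failure")
        return b"audio"

    print(generate_with_retry(flaky, wait_s=0.0))
```

Capping the number of attempts matters because each generation may consume credits.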

Available model

Every generation on this site uses Seed Audio 1.0, which supports text to speech and voice cloning.

It's simple — try it now

Open the generator and turn text into AI speech. Your first generation is covered by free credits.