Text to speech · Voice cloning · Multilingual · Speed/pitch control · MP3 / WAV export

Seed Audio 1.0 — AI Voice Generation

Generate natural, expressive AI speech from text or reference audio. Seed Audio 1.0 supports 20 preset voices, voice cloning, and speed/pitch control.

Natural speech from text

Type your script and get expressive AI speech in seconds.

Voice cloning

Reproduce a voice from up to 3 reference clips (30s each).

Multilingual

Supports English, Chinese, Japanese, Spanish, Indonesian and Portuguese.

Text to speech

Text (up to 350 characters)

Settings: output format, sample rate, speed, pitch and volume

Cost: 50 credits

Select a voice

Seed Audio 1.0 voice samples

Examples generated with Seed Audio 1.0 — listen to the real voices right here.

Podcast

"Today's topic is how to put AI voice to use. Just paste in your script and you get a read that sounds like natural conversation."

Voice cloning

"You can keep your brand's voice consistent across all your content. Set it up once and generate in the same tone as many times as you like."

Multilingual

"Seed Audio supports multiple languages, including Japanese and English."

Natural delivery

"Wow, amazing! I can't believe audio this natural really takes just a few seconds."

Features

What Seed Audio 1.0 can do

From text to natural speech. Text-to-speech, voice cloning and multilingual output in one generator, with flexible controls for any use case.

Text to speech (TTS)

Turn any text into natural, expressive speech. Describe the tone in plain language to shape the delivery.

Voice cloning

Reproduce a voice from reference audio (up to 3 clips, 30s each). Keep one brand voice across all content.
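The clip limits above, together with the @Audio1 to @Audio3 prompt tags mentioned in the FAQ on this page, can be checked before uploading anything. The following is a minimal sketch, not the product's actual interface: the `ReferenceClip` class, the `tag_reference_clips` function and the prompt wording are all hypothetical names introduced for illustration, and the 30-second figure is treated as a per-clip maximum.

```python
from dataclasses import dataclass

MAX_CLIPS = 3             # limit stated on this page
MAX_CLIP_SECONDS = 30.0   # limit stated on this page, assumed to be a per-clip maximum


@dataclass
class ReferenceClip:
    """Hypothetical container for one reference recording."""
    path: str
    duration_seconds: float


def tag_reference_clips(clips: list[ReferenceClip]) -> dict[str, ReferenceClip]:
    """Check the clip limits and assign @Audio1..@Audio3 tags in order."""
    if not clips:
        raise ValueError("at least one reference clip is required")
    if len(clips) > MAX_CLIPS:
        raise ValueError(f"at most {MAX_CLIPS} reference clips are allowed")
    for clip in clips:
        if clip.duration_seconds > MAX_CLIP_SECONDS:
            raise ValueError(f"{clip.path} is longer than {MAX_CLIP_SECONDS:.0f}s")
    return {f"@Audio{i}": clip for i, clip in enumerate(clips, start=1)}


clips = [ReferenceClip("intro.wav", 12.5), ReferenceClip("outro.wav", 28.0)]
tags = tag_reference_clips(clips)

# Hypothetical prompt wording; only the @AudioN tag format comes from this page.
prompt = f"Read the script in the voice of {' and '.join(tags)}, calm tone."
print(prompt)
```

Rejecting an oversized or extra clip locally gives a clear error before any credits are involved.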

20 preset voices

Pick from narration, character or news voices to match your tone and use case.

Multilingual

Supports English, Chinese, Japanese, Spanish, Indonesian and Portuguese, and handles mixed-language text.

Speed, pitch & volume

Fine-tune speed (0.5–2.0x), pitch (±12 semitones) and volume.
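To make the ranges above concrete, here is a small sketch that validates speed and pitch values and shows what a semitone shift means in frequency terms. The `VoiceSettings` class and its field names are hypothetical, and the page does not state a numeric range for volume, so volume is left out. The semitone formula is the standard equal-temperament relation; how the generator applies pitch internally is not documented here.

```python
from dataclasses import dataclass

SPEED_RANGE = (0.5, 2.0)   # speed multiplier range stated on this page
PITCH_RANGE = (-12, 12)    # pitch range in semitones stated on this page


@dataclass
class VoiceSettings:
    """Hypothetical settings object; the field names are illustrative only."""
    speed: float = 1.0
    pitch_semitones: int = 0

    def validate(self) -> None:
        if not SPEED_RANGE[0] <= self.speed <= SPEED_RANGE[1]:
            raise ValueError(f"speed must be within {SPEED_RANGE}")
        if not PITCH_RANGE[0] <= self.pitch_semitones <= PITCH_RANGE[1]:
            raise ValueError(f"pitch must be within {PITCH_RANGE} semitones")


def semitone_ratio(semitones: float) -> float:
    """Frequency ratio of an equal-tempered shift: 2 ** (n / 12)."""
    return 2.0 ** (semitones / 12.0)


def estimated_duration(base_seconds: float, speed: float) -> float:
    """Approximate clip length, assuming speed acts as a plain rate multiplier."""
    return base_seconds / speed


settings = VoiceSettings(speed=2.0, pitch_semitones=12)
settings.validate()
print(semitone_ratio(settings.pitch_semitones))   # +12 semitones is one octave up
print(estimated_duration(10.0, settings.speed))
```

At the extremes, +12 semitones doubles the fundamental frequency and -12 halves it, so the ±12 range spans one octave in each direction.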

MP3 / WAV export

Export as MP3, WAV, PCM or OGG Opus, up to 48 kHz.
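When choosing between raw PCM and WAV, it can help to know how large uncompressed audio gets at 48 kHz and how a raw PCM payload becomes a playable WAV file. The sketch below uses only Python's standard `wave` module. It assumes 16-bit mono samples, which this page does not specify, so treat the defaults as placeholders.

```python
import io
import wave


def pcm_byte_count(seconds: float, sample_rate: int = 48_000,
                   channels: int = 1, sample_width: int = 2) -> int:
    """Size of uncompressed PCM audio: frames * channels * bytes per sample."""
    return round(seconds * sample_rate) * channels * sample_width


def wrap_pcm_as_wav(pcm: bytes, sample_rate: int = 48_000,
                    channels: int = 1, sample_width: int = 2) -> bytes:
    """Add a WAV header to raw PCM bytes (16-bit mono assumed by default)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return buffer.getvalue()


# One second of silence stands in for generated audio.
silence = bytes(pcm_byte_count(1.0))
wav_bytes = wrap_pcm_as_wav(silence)
print(pcm_byte_count(6.0), len(wav_bytes))
```

MP3 and OGG Opus are compressed formats, so their file sizes depend on bitrate rather than on this formula.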

Use cases

What you can build with Seed Audio

For any workflow that needs a voice. Creators, developers and teams ship audio fast — without a recording studio.

Video voiceovers

Narrate YouTube videos, ads and explainers with a consistent voice. Regenerate a line in seconds when the script changes.

Podcasts & audiobooks

Turn long scripts into clear narration that keeps tone and pacing steady across chapters.
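The generator's character counter on this page shows a 350-character limit, so a long script has to be voiced in pieces. One possible approach, sketched below, is to pack whole sentences into chunks that fit the limit and generate each chunk separately. The `split_script` function is a hypothetical helper, and treating 350 as a hard per-request limit is an assumption.

```python
import re

MAX_CHARS = 350  # limit shown by the generator's character counter

# A sentence is a run of text plus its closing punctuation (Latin or CJK)
# and any trailing whitespace.
_SENTENCE = re.compile(r"[^.!?。！？]+[.!?。！？]*\s*|[.!?。！？]+\s*")


def split_script(script: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_chars characters."""
    if max_chars < 1:
        raise ValueError("max_chars must be positive")
    chunks: list[str] = []
    current = ""
    for sentence in _SENTENCE.findall(script.strip()):
        if len((current + sentence).rstrip()) <= max_chars:
            current += sentence
            continue
        if current.strip():
            chunks.append(current.strip())
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence.rstrip()) > max_chars:
            chunks.append(sentence[:max_chars].strip())
            sentence = sentence[max_chars:]
        current = sentence
    if current.strip():
        chunks.append(current.strip())
    return chunks


script = "Welcome to chapter one. " * 40
chunks = split_script(script)
print(len(chunks), max(len(chunk) for chunk in chunks))
```

Splitting at sentence boundaries keeps each chunk a natural unit to regenerate if one line needs a different read.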

News & announcements

Pick a steady voice and speed for a clear, broadcast-style read.

Voice cloning

Build a reusable brand voice from a short sample and keep it consistent across videos, courses and social.

Multilingual content

Generate speech in 6 languages, so videos and courses for global audiences share one workflow.

Apps & games

Add speech to assistants, menus, game dialogue and more.

How it works

Generate voice in 3 steps

Enter text and generate AI speech right away — no complex setup.

01
Enter your text
Type the script you want to voice. Add a mood like “calm tone” to shape the delivery.
02
Choose voice & settings
Pick from 20 preset voices and adjust speed, pitch and volume. You can also clone a voice.
03
Generate & export
Generate in seconds, preview, and export as MP3 / WAV. Regenerate any line to fine-tune.

Pricing

Simple credit-based pricing

Try free, then spend credits based on usage. Each generation shows its cost upfront, and failed jobs are not charged.

Free tier

Try it free

New accounts get free credits — start generating right away.

Pay as you go (Suggested)

Only pay for what you use

Credits scale with the length of the audio, and the cost is always shown before you generate.

No risk

Failed jobs are free

You're never charged for a failed job. Check the terms for commercial use.

Answers

Frequently asked questions

Answers about Seed Audio 1.0.

Seed Audio 1.0 is an audio generation model by ByteDance. It generates natural speech from text or reference audio, offers text-to-speech (TTS) and voice cloning, and runs in your browser on this page.

Japanese is supported, along with English, Chinese, Spanish, Indonesian and Portuguese, so you can generate natural speech directly from Japanese text.

Voice cloning works from reference audio (up to 3 clips, 30 seconds each), referenced in the prompt as @Audio1–@Audio3.

New accounts get free credits to try your first generations, then you spend credits based on usage.

For commercial use, review your intended use against the applicable terms. See the pricing page and FAQ for details.

Output formats are MP3, WAV, PCM and OGG Opus, with sample rates up to 48 kHz.

Generate AI voice now

Create natural speech from text with Seed Audio 1.0. Your first generation is covered by free credits.
