Text to speech · Voice cloning · Multilingual · Speed/pitch control · MP3 / WAV export

Seed Audio 1.0 — AI Voice Generation

Generate natural, expressive AI speech from text or reference audio. Seed Audio 1.0 supports 20 preset voices, voice cloning, and speed/pitch control.

Natural speech from text

Type your script and get expressive AI speech in seconds.

Voice cloning

Reproduce a voice from up to 3 reference clips (30s each).

Multilingual

Supports English, Chinese, Japanese, Spanish, Indonesian and Portuguese.


Seed Audio 1.0 voice samples

Examples generated with Seed Audio 1.0 — listen to the real voices right here.

Podcast

Today's topic is how to put AI voice to use. Just paste in your script to get a read-through that sounds like natural conversation.

Voice cloning

Keep your brand's voice consistent across all of your content. Set it up once, then generate in the same tone as many times as you need.

Multilingual

Seed Audio supports multiple languages, including Japanese and English.

Natural delivery

Wow, amazing! I can't believe audio this natural really takes just seconds to make.

How it works

Generate voice in 3 steps

Enter text and generate AI speech right away — no complex setup.

  1. Enter your text

    Type the script you want to voice. Add a mood like “calm tone” to shape the delivery.

  2. Choose voice & settings

    Pick from 20 preset voices and adjust speed, pitch and volume. You can also clone a voice.

  3. Generate & export

    Generate in seconds, preview, and export as MP3 / WAV. Regenerate any line to fine-tune.

Pricing

Simple credit-based pricing

Try free, then spend credits based on usage. Each generation shows its cost upfront, and failed jobs are not charged.

Free tier

Try it free

New accounts get free credits — start generating right away.

Pay as you go (suggested)

Only pay for what you use

Credits scale with the length of the audio, and the cost is always shown before you generate.

No risk

Failed jobs are free

You're never charged for a failed job. Check the terms for commercial use.

Answers

Frequently asked questions

Answers about Seed Audio 1.0.

Seed Audio 1.0 is an audio generation model by ByteDance. It generates natural speech from text or reference audio, offers text-to-speech (TTS) and voice cloning, and runs right here in your browser.

Seed Audio 1.0 supports English, Chinese, Japanese, Spanish, Indonesian and Portuguese, and generates natural speech from Japanese text.

Voice cloning works from reference audio (up to 3 clips, 30 seconds each), referenced in the prompt as @Audio1–@Audio3.
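
As a rough illustration of the reference-clip limits above, the sketch below checks a set of clip durations and maps each clip to its @AudioN tag. The function names and the prompt layout (tags first, then the script) are assumptions made for this example and are not a documented Seed Audio interface. Only the limits (3 clips, 30 seconds each) and the @Audio1–@Audio3 naming come from this page.

```python
MAX_CLIPS = 3          # from the page: up to 3 reference clips
MAX_CLIP_SECONDS = 30  # from the page: 30 seconds each


def reference_tags(clip_seconds):
    """Return the @AudioN tag for each reference clip, in upload order."""
    if not 1 <= len(clip_seconds) <= MAX_CLIPS:
        raise ValueError(f"expected 1 to {MAX_CLIPS} clips, got {len(clip_seconds)}")
    for index, seconds in enumerate(clip_seconds, start=1):
        if not 0 < seconds <= MAX_CLIP_SECONDS:
            raise ValueError(f"clip {index} is {seconds}s; the limit is {MAX_CLIP_SECONDS}s")
    return [f"@Audio{index}" for index in range(1, len(clip_seconds) + 1)]


def build_prompt(script, clip_seconds):
    """Hypothetical prompt layout: reference tags first, then the script."""
    tags = " ".join(reference_tags(clip_seconds))
    return f"{tags} {script}"
```

Checking the limits before generating means an oversized clip is caught locally rather than after a job has been submitted.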

New accounts get free credits to try your first generations, then you spend credits based on usage.

For commercial use, review your intended use against the applicable terms. See the pricing page and FAQ for details.

Output is available as MP3, WAV, PCM or OGG Opus, with sample rates up to 48 kHz.
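
The format and sample-rate limits can be captured in a small settings check, sketched below. The class name, field names, and the lowercase format identifiers are hypothetical and chosen only for this example. The page states just the four formats and the 48 kHz ceiling.

```python
from dataclasses import dataclass

# Identifier spellings are assumed; the page lists MP3, WAV, PCM and OGG Opus.
SUPPORTED_FORMATS = {"mp3", "wav", "pcm", "ogg_opus"}
MAX_SAMPLE_RATE_HZ = 48_000  # from the page: sample rates up to 48 kHz


@dataclass(frozen=True)
class OutputSettings:
    audio_format: str
    sample_rate_hz: int

    def __post_init__(self):
        # Reject anything outside the limits stated on the page.
        if self.audio_format not in SUPPORTED_FORMATS:
            raise ValueError(f"unsupported format: {self.audio_format!r}")
        if not 0 < self.sample_rate_hz <= MAX_SAMPLE_RATE_HZ:
            raise ValueError(f"sample rate must be 1 to {MAX_SAMPLE_RATE_HZ} Hz")
```

For example, `OutputSettings("wav", 48_000)` is accepted, while a 96 kHz request raises `ValueError`.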

Generate AI voice now

Create natural speech from text with Seed Audio 1.0. Your first generation is covered by free credits.