Question 1

What is Seed Audio 1.0?

Accepted Answer

Seed Audio 1.0 is an audio generation model from ByteDance. It produces natural-sounding speech from text (text-to-speech) or from reference audio (voice cloning).

Question 2

Which languages are supported?

Accepted Answer

English, Chinese, Japanese, Spanish, Indonesian and Portuguese.

Question 3

How many voices are there?

Accepted Answer

There are 20 preset voices, and you can adjust speed, pitch and volume.

Question 4

What are the voice cloning limits?

Accepted Answer

Up to 3 reference clips, each 30 seconds or less, referenced in the prompt as @Audio1–@Audio3.
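
As a rough illustration, these limits can be checked on your side before submitting a job. The sketch below is not part of any official SDK: the function name `check_cloning_request` and its inputs (a prompt string plus a list of clip durations in seconds) are assumptions made for the example. Only the three-clip cap, the 30-second cap and the @AudioN naming come from the answer above.

```python
import re

MAX_CLIPS = 3             # limit stated in the answer above
MAX_CLIP_SECONDS = 30.0   # limit stated in the answer above


def check_cloning_request(prompt: str, clip_durations: list[float]) -> list[str]:
    """Hypothetical helper: return a list of problems (empty means OK)."""
    problems = []

    # At most three reference clips may be attached.
    if len(clip_durations) > MAX_CLIPS:
        problems.append(f"too many clips: {len(clip_durations)} > {MAX_CLIPS}")

    # Each clip must be 30 seconds or less.
    for i, seconds in enumerate(clip_durations, start=1):
        if seconds > MAX_CLIP_SECONDS:
            problems.append(f"@Audio{i} is {seconds:.1f}s, over the 30s limit")

    # Every @AudioN tag in the prompt should point at an attached clip.
    for number in re.findall(r"@Audio(\d+)", prompt):
        if not 1 <= int(number) <= len(clip_durations):
            problems.append(f"@Audio{number} has no matching clip")

    return problems


if __name__ == "__main__":
    print(check_cloning_request("Read this in the voice of @Audio1.", [12.5]))
    print(check_cloning_request("Mix @Audio1 and @Audio2.", [12.5]))
```

An empty list means the request fits the stated limits. Whether the service itself rejects a prompt that references a missing clip is not covered by the answer above, so treat that last check as a convenience only.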

Question 5

Which formats can I export?

Accepted Answer

MP3, WAV, PCM or OGG Opus, with sample rates up to 48 kHz.

Question 6

How are credits used?

Accepted Answer

Credits are charged based on the length and settings of each generation. The cost is shown before you run a job, and failed jobs are not charged.
