Speech & Music
Synchronous TTS bytes and asynchronous music jobs.
POST /v1/audio/speech
Synchronous, OpenAI-compatible. The server submits to MiniMax, polls until ready, then streams the audio bytes back with chunked transfer encoding.
from openai import OpenAI
c = OpenAI(api_key="local-key", base_url="http://localhost:8080/v1")
audio = c.audio.speech.create(
    model="speech-2.8-hd",
    voice="Charming_Lady",   # MiniMax voice ID
    input="Hello from cli2api.",
    response_format="mp3",   # mp3 / pcm / flac / wav / opus / aac
)
audio.stream_to_file("hello.mp3")

Supported models: speech-2.8-hd, speech-2.8-turbo. Fields: input (or
prompt), voice (or voice_id), response_format, speed, vol, pitch,
emotion, language_boost, sample_rate, bitrate, english_normalization.
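The field list above can also be exercised without the OpenAI SDK. The following is a minimal sketch, not the server's own validation: `build_speech_payload` and `save_speech` are hypothetical helper names, the default of `mp3` and the 120-second timeout are assumptions, and the example `speed` value is illustrative only. The server polls MiniMax before it responds, so a timeout longer than httpx's 5-second default is probably needed.

```python
from typing import Any

# Models, optional fields, and formats as listed in this section.
SPEECH_MODELS = {"speech-2.8-hd", "speech-2.8-turbo"}
SPEECH_FIELDS = {
    "response_format", "speed", "vol", "pitch", "emotion",
    "language_boost", "sample_rate", "bitrate", "english_normalization",
}
SPEECH_FORMATS = {"mp3", "pcm", "flac", "wav", "opus", "aac"}


def build_speech_payload(text: str, voice: str,
                         model: str = "speech-2.8-hd",
                         **options: Any) -> dict[str, Any]:
    """Build a body for POST /v1/audio/speech, rejecting unknown fields."""
    if model not in SPEECH_MODELS:
        raise ValueError(f"unsupported model: {model}")
    unknown = set(options) - SPEECH_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    # ASSUMPTION: default to mp3 when the caller does not choose a format.
    fmt = options.get("response_format", "mp3")
    if fmt not in SPEECH_FORMATS:
        raise ValueError(f"unsupported response_format: {fmt}")
    return {"model": model, "input": text, "voice": voice,
            **options, "response_format": fmt}


def save_speech(base_url: str, payload: dict[str, Any], path: str) -> int:
    """Stream the chunked audio response to disk and return bytes written."""
    import httpx  # imported here so the payload helper works without httpx

    written = 0
    # ASSUMPTION: 120 s is a guess at how long the upstream polling can take.
    with httpx.stream("POST", f"{base_url}/v1/audio/speech",
                      json=payload, timeout=120.0) as resp:
        resp.raise_for_status()
        with open(path, "wb") as fh:
            for chunk in resp.iter_bytes():
                written += fh.write(chunk)
    return written


# Usage (requires the server to be running):
#   body = build_speech_payload("Hello from cli2api.", "Charming_Lady", speed=1.0)
#   save_speech("http://localhost:8080", body, "hello.mp3")
```

Validating field names on the client catches typos such as `volume` for `vol` before a request is sent; the accepted value ranges are not documented here, so the sketch does not check them.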
POST /v1/audio/music + GET /v1/audio/music/{id}
Music generation takes anywhere from 30 seconds to several minutes, so it runs as an asynchronous job, like video: submit the job, then poll its status by ID.
import time
import httpx

base = "http://localhost:8080"
job = httpx.post(f"{base}/v1/audio/music", json={
    "model": "music-2.5",
    "prompt": "upbeat synthwave, melodic",
    "lyrics_prompt": "[verse]\nlight the night",
}).json()
job_id = job["id"]  # e.g. "audio_..."

# Poll until the job reaches a terminal status.
while True:
    r = httpx.get(f"{base}/v1/audio/music/{job_id}").json()
    if r["status"] in ("completed", "failed"):
        break
    time.sleep(8)
print(r.get("audios") or r.get("error"))

Supported models: music-2.0, music-2.5. Fields: prompt (style),
lyrics_prompt, lyrics_optimizer, audio_format, sample_rate, bitrate.
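The polling loop above runs forever if a job never reaches a terminal status. A bounded variant might look like the sketch below. `wait_for_job` is a hypothetical helper, and the 600-second timeout is an assumption rather than a documented limit. The status lookup is passed in as a callable so the loop can be exercised without a server.

```python
import time
from typing import Any, Callable

# Terminal statuses, as used by the polling loop in this section.
TERMINAL = ("completed", "failed")


def wait_for_job(fetch: Callable[[], dict[str, Any]],
                 interval: float = 8.0,
                 timeout: float = 600.0,
                 sleep: Callable[[float], None] = time.sleep,
                 clock: Callable[[], float] = time.monotonic) -> dict[str, Any]:
    """Call fetch() until the job is terminal or the deadline would be passed."""
    deadline = clock() + timeout
    while True:
        job = fetch()
        if job.get("status") in TERMINAL:
            return job
        # Give up if the next poll would land after the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(
                f"job still {job.get('status')!r} after {timeout}s")
        sleep(interval)


# Usage (requires httpx and the server to be running):
#   import httpx
#   url = f"http://localhost:8080/v1/audio/music/{job_id}"
#   job = wait_for_job(lambda: httpx.get(url).json())
#   print(job.get("audios") or job.get("error"))
```

Injecting `sleep` and `clock` keeps the helper deterministic under test; in normal use the defaults are enough.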
If MuleRun's upstream MiniMax service is busy, you may get a failed status with
a vendor_error (e.g. code 3005). That is an upstream issue surfaced as a
structured error. Retry the request, or try again later if it keeps failing.
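Retrying on an upstream failure can be automated with a small backoff wrapper. This is only a sketch under stated assumptions: the exact shape of the error payload is not documented in this section, so `is_vendor_error` assumes a failed job carries an `error` object whose `type` is `"vendor_error"`. Adjust that check to match the real response. `run_with_retry`, the attempt count, and the delays are likewise hypothetical.

```python
import time
from typing import Any, Callable


def is_vendor_error(job: dict[str, Any]) -> bool:
    """ASSUMPTION: failed jobs look like
    {"status": "failed", "error": {"type": "vendor_error", "code": 3005}}."""
    error = job.get("error")
    return (job.get("status") == "failed"
            and isinstance(error, dict)
            and error.get("type") == "vendor_error")


def run_with_retry(run_job: Callable[[], dict[str, Any]],
                   attempts: int = 3,
                   base_delay: float = 30.0,
                   sleep: Callable[[float], None] = time.sleep) -> dict[str, Any]:
    """Run a job, resubmitting with exponential backoff on vendor errors.

    run_job should submit a new job and return its final status payload.
    The last result is returned even if it is still a vendor error.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    job: dict[str, Any] = {}
    for attempt in range(attempts):
        job = run_job()
        if not is_vendor_error(job) or attempt == attempts - 1:
            break
        # Wait 30 s, 60 s, 120 s, ... before resubmitting.
        sleep(base_delay * 2 ** attempt)
    return job
```

Other failures, such as an invalid request, are returned immediately, since resubmitting the same job is unlikely to help.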