Why a per-account rate limit?
Rate limits keep the API responsive for every customer. Without them, a single misbehaving client (for example, a tight loop accidentally hammering an endpoint) would degrade latency for everyone. The limit you’ll encounter is per account: every key on the same account shares the same budget. If you need higher throughput, get in touch via business@soundpiece.co.uk.

The only 429 you should plan to handle is this one: the request rate limit. We manage processing capacity for you separately, so you won’t see 429s from us when our backend is busy; your operation just stays in processing longer until capacity frees up. See Async operations for that lifecycle.
The 429 response
When you exceed your rate limit you get a 429 response. Its Retry-After header tells you how many seconds to wait before retrying. Honour it: retrying sooner will just bounce off the limit again.
We don’t currently return X-RateLimit-* headers on successful responses. If your integration needs proactive throttling on your side (rather than reactive on 429), let us know via business@soundpiece.co.uk.
Handling 429 in code
Always read Retry-After rather than using a fixed delay. Combine it with exponential backoff as a fallback for when the header is absent or set to 0.
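As a minimal sketch of that policy (not official client code), the helper below prefers Retry-After and falls back to capped exponential backoff with jitter. The names `retry_delay` and `send_with_retry`, the `(status, headers)` response shape, and the default `base`, `cap`, and `max_attempts` values are all assumptions made for illustration; adapt them to your HTTP client.

```python
import random
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape: (status_code, headers). Adapt to your HTTP client.
Response = Tuple[int, Mapping[str, str]]


def retry_delay(headers: Mapping[str, str], attempt: int,
                base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retrying; `attempt` is 0 for the first retry."""
    # Plain dicts are case-sensitive; most HTTP clients expose
    # case-insensitive header mappings, so this lookup is an assumption.
    raw = headers.get("Retry-After")
    try:
        seconds = float(raw) if raw is not None else 0.0
    except ValueError:
        seconds = 0.0  # unparseable value: treat it as absent
    if seconds > 0:
        return seconds  # the server's instruction wins
    # Fallback: capped exponential backoff with jitter (50-100% of the step).
    step = min(cap, base * 2 ** attempt)
    return step * random.uniform(0.5, 1.0)


def send_with_retry(send: Callable[[], Response], max_attempts: int = 5,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send` until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429:
            return status, headers
        if attempt < max_attempts - 1:
            sleep(retry_delay(headers, attempt))
    return status, headers  # still 429 after the final attempt
```

Passing `sleep` as a parameter keeps the helper testable; in production the default `time.sleep` applies.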
Because PUT endpoints are idempotent on idempotency_key, retrying after a 429 is safe: you’ll either get the same operation you tried to create the first time, or a fresh one. You won’t double-create work. See Async operations for the idempotency contract.
Why no 429 for capacity?
Audio generation has a long tail of processing time: seconds to a minute for typical jobs. If we returned 429 every time our processing tier was busy, you’d have to retry repeatedly until capacity opened up, which is wasteful and fragile.
Instead we queue work internally. When you PUT, we accept the operation immediately and return status: "processing". Behind the scenes we dispatch when capacity is available. Your polling continues to show processing until the job lands — no 429, no manual retry of the submit.
This means the only place you should plan for a 429 is submit time (request rate exceeded), never during long-running operations.
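On the client side, the lifecycle described above reduces to "submit once, then wait while the status is processing". The sketch below illustrates that waiting loop under stated assumptions: `fetch_status` is a hypothetical callable that returns the operation's current status string, and the `interval` and `timeout` defaults are arbitrary. Only the "processing" value comes from this page; any other status is simply returned to the caller.

```python
import time
from typing import Callable


def wait_until_done(fetch_status: Callable[[], str],
                    interval: float = 5.0,
                    timeout: float = 600.0,
                    sleep: Callable[[float], None] = time.sleep,
                    clock: Callable[[], float] = time.monotonic) -> str:
    """Poll until the status is no longer "processing" and return it."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()  # hypothetical: one status request per call
        if status != "processing":
            return status
        if clock() >= deadline:
            raise TimeoutError("operation still processing after timeout")
        # No resubmit here: the operation is already accepted and queued.
        sleep(interval)
```

Each poll still counts against the per-account request budget, so a generous interval is the safer default.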
Designing for sustained throughput
If you’re driving high volume:
- Use webhooks instead of polling. Polling a hot operation every second consumes the same per-account rate budget your submits are using. See webhooks.
- Implement a queue on your side when submitting in bursts. A token-bucket or leaky-bucket sized to your rate limit keeps the submit channel smooth.
- Use one key per service so you can revoke independently, but remember they share the account’s rate budget.
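The client-side queue suggested above could be built around a token bucket. The following is a minimal single-threaded sketch, not a prescribed implementation: the `TokenBucket` class and its `rate` and `capacity` parameters are illustrative names, and the numbers in the usage example are placeholders rather than your account's actual limit.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        if rate <= 0 or capacity < 1:
            raise ValueError("rate must be > 0 and capacity >= 1")
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so an initial burst is allowed
        self._clock = clock
        self._sleep = sleep
        self._last = clock()

    def _refill(self) -> None:
        now = self._clock()
        elapsed = now - self._last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self._last = now

    def acquire(self) -> float:
        """Block until one token is available; return the seconds waited."""
        waited = 0.0
        while True:
            self._refill()
            if self.tokens >= 1:
                self.tokens -= 1
                return waited
            needed = (1 - self.tokens) / self.rate  # time until one token
            self._sleep(needed)
            waited += needed


# Usage: call acquire() before every request that counts against the budget.
# The values are placeholders; size them to your account's real limit.
bucket = TokenBucket(rate=2.0, capacity=2.0)
bucket.acquire()  # returns immediately while the bucket has tokens
```

Because all keys share one account budget, a single bucket would need to cover every service that uses the account. This sketch is not thread-safe; guard `acquire` with a lock if several threads submit.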
