Rate Limits

Rate limits control the number of API requests a user can make per second. This prevents abuse and ensures our services remain responsive and available.

At AI Ark, we use the term concurrency interchangeably with rate limit. Note that this is not the same as concurrency in programming, although the two ideas are related. In this context, concurrency refers to a per-second rate limit enforced by our API gateway.

Each service has its own default concurrency (rate limit), ensuring tailored performance and reliability. These rate limits are customizable based on user needs and usage patterns.

Default Concurrency

Every API service under AI Ark has its own default concurrency limit, set to balance performance and availability. You can find the specific concurrency limit for each service in the service quotas under the billing section of the developer portal.
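
Once you know the limit for a service, a client can pace its own requests to stay under it. The following is a minimal sketch, not an official AI Ark client: the class name RequestPacer and its parameters are hypothetical, and the value 5 is a placeholder rather than a real quota. It assumes that spacing requests evenly at 1/limit seconds apart is enough to stay within a per-second limit; the gateway may count requests differently.

```python
import time


class RequestPacer:
    """Spaces out calls so that at most `limit_per_second` start per second.

    Hypothetical helper for illustration; not part of any AI Ark SDK.
    """

    def __init__(self, limit_per_second, clock=time.monotonic, sleep=time.sleep):
        if limit_per_second <= 0:
            raise ValueError("limit_per_second must be positive")
        self._interval = 1.0 / limit_per_second  # minimum gap between requests
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # no request has been made yet

    def wait(self):
        """Block until the next request is allowed to start."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self._interval


# Usage: call wait() immediately before each API request.
# pacer = RequestPacer(limit_per_second=5)  # 5 is a placeholder value
# pacer.wait()
# ... send the request here ...
```

Pacing on the client side does not replace the gateway's limit. It only reduces how often a well-behaved client runs into it.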

Important Notes on Usage

Rate Limits: If you exceed your allocated quota, API calls will return a 429 Too Many Requests error.
Rate Limit Reset: Your rate limits reset every 60 seconds.
Higher Limits: Need more capacity? Contact our team to discuss upgrading your plan.
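
A client can handle the 429 response described above by waiting and retrying. The sketch below is one possible approach under stated assumptions, not a prescribed method: the function call_with_retry and the send callable (which performs one request and returns a status code and body) are hypothetical names, and exponential backoff is our choice of strategy. The delay is capped at 60 seconds to match the reset interval mentioned above.

```python
import time


def call_with_retry(send, max_attempts=5, base_delay=1.0, max_delay=60.0,
                    sleep=time.sleep):
    """Call send() and retry with exponential backoff while it returns 429.

    `send` is a hypothetical zero-argument callable that performs one API
    request and returns a (status_code, body) tuple.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body  # success or a different error: stop retrying
        if attempt < max_attempts - 1:
            # Wait 1s, 2s, 4s, ... but never longer than the 60-second window.
            sleep(min(max_delay, base_delay * 2 ** attempt))
    return status, body  # still rate limited after the final attempt


# Usage with any HTTP client, wrapped so it returns (status_code, body):
# status, body = call_with_retry(lambda: my_request())  # my_request is hypothetical
```

If retries are exhausted, the function returns the last 429 response so the caller can decide whether to fail or queue the work for later.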